Change log
2024-11-16
- Add Data Processing Practices section, which covers common data processing problems that arise in data engineering and how to solve them with basic data processing techniques.
2024-11-03
- Update Data Camping - Week 8 Data Orchestration and Automation section, including a review of how to automate the data development process. Since we learned containerization in previous weeks and have practiced it for a while, this is the time to review it and start exploring the xOps area.
- Update Data Camping - Week 9 Capstone Projects with a curated list of side-development projects, discovery spaces, and other community projects.
- In the Capstone project, we emphasize what to consider when you want to develop something that depends on a business need. Otherwise, it is technology discovery, a POC, or a learning curve.
- I also mention what your responsibilities are in a Scrum project, following the Seniority Levels and the Responsibility Matrix.
- I hope these 2 closing sections of the camping will help you understand the concepts and practices of Data Engineering, and keep growing over the coming weeks, months, and years to become a better data engineer.
2024-10-27
- Updating How Data Pipeline Structured section
- Discovering that multi-hop architecture is a common pattern in data warehouses and data pipelines.
- Trying to understand the pros and cons of each pattern, including the 3 types of data pipeline design.
2024-10-20
- Adding Data Structures and Algorithms section and how it relates to Data Engineering.
- This section covers the fundamentals of data structures and algorithms, including arrays, linked lists, hash tables, trees, graphs, stacks, queues, and dynamic programming.
- I also added visualizations. They are limited, of course, but I hope they will help you understand the concepts.
2024-10-08
- Adding Cloud System Understanding section
- This section provides a comprehensive guide on how to start with cloud technologies
- Covering the fundamentals of cloud computing, including the different service models (IaaS, PaaS, SaaS) and an introduction to major cloud providers
- Overview of key cloud services essential for data engineering: storage, compute, and database services
- Best practices for getting started with the cloud.
2024-09-07
- Adding Week 7 Data Quality
- This is an essential, MUST HAVE part of any data engineering project. Good Data is Better Than Big Data.
- How to apply data quality in a data warehouse and data pipeline, e.g. Snowflake and dbt. I also share my experience with Deequ and implementing it in a Spark job.
- Adding Common Components for Analytics Platform and how Data Architecture looks in the real world. This helps you understand the daily work of engineers in the data industry.
2024-08-12
- Add Week 6 of Data Engineering: Streaming with Kafka, Kafka use cases, Simple Kafka EcoSystem, etc.
- I describe the thinking process for solving streaming data problems with Kafka.
- Add the Deep Work section in How to Read.
- Add Scope of Data Engineer & Better Data Engineer
2024-08-02
- Updated Basic content, Analytics Platform, and Data Architecture Design
- I have added a subscription option for half of part two and the whole of part three. This was a difficult decision for me, as I genuinely want to share everything I write with all of you (you can check the blog site). However, I must acknowledge the significant amount of time and effort I have invested in compiling and distilling my free resources into this comprehensive book.
- If you find value in this book, I believe the small subscription fee is justified to ensure you receive the latest, up-to-date content in data engineering. I apologize if this decision disappoints anyone. This also allows me to dedicate more time to this book and my writing in general.
- For more details, please visit the subscription page. Additionally, I have updated the copyright and legal notice, as well as the privacy policy, to reflect the technology used for managing subscriptions.
- Furthermore, I am going to pack the whole ebook into a single package so that you can check out and download it. Your support is always meaningful to me. Thanks.
2024-07-21
- Deployment Process and Plans
- Week 5 - Batch processing with Apache Spark
- Recommended books
2024-07-19
- Project Structure
- Changed structure of project
- Added Structure by topic
- Deployment Process and Plans
- Add chart of deployment
- Add Plans for each chapter
2024-07-15
- Data Camping
- Contents for Week 4 Analytics Engineering
- Reword contents
Changed
- Fix dbt_booking project
2024-07-10
- Added Hands-on: dbt project Data Modeling and Analytics Engineering
- Added What Makes You Better sections for each week in Data Camping
- Added Sponsor
- Added Subscription page
Changed Structure
- Re-Structured Book
2024-06-30
- Added Data Engineering Camping - Week 3: Data Warehouse
2024-06-23
- Content
- Added Data Engineering Camping - Week 2.1: Install Airflow, Light version
2024-06-16
- Content
- Added Data Engineering Camping - Week 2: Data Ingestion and Orchestration
2024-06-09
- Content
- Added Data Engineering Camping - Week 1: Introduction and Setting up environment