Course Outline
Week 1 — Introduction to Data Engineering for Government
- Fundamentals of data engineering and modern data stacks
- Data ingestion patterns and sources
- Batch versus streaming concepts and use cases
- Hands-on lab: ingesting sample data into cloud storage (see the sketch below)
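A minimal sketch of what the Week 1 lab covers, assuming PySpark with cloud credentials already configured; the sample file name and bucket path are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sample-ingest").getOrCreate()

# Batch ingestion: read a local CSV and land it in cloud object
# storage as Parquet. File name and bucket are hypothetical.
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("data/sample_orders.csv"))

(df.write
   .mode("overwrite")
   .parquet("s3a://example-landing-bucket/raw/orders/"))
```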
Week 2 — Databricks Lakehouse Foundation Badge for Government
- Fundamentals of the Databricks platform and workspace navigation
- Delta Lake concepts: ACID transactions, time travel, and schema evolution
- Workspace security, access controls, and Unity Catalog basics
- Hands-on lab: creating and managing Delta tables (see the sketch below)
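A minimal sketch of the Delta Lake lab, assuming a Databricks runtime (or a Spark session with the delta-spark package configured); the table path and columns are placeholders:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("delta-lab").getOrCreate()

# Create a path-based Delta table.
df = spark.range(100).withColumn("status", F.lit("open"))
df.write.format("delta").mode("overwrite").save("/tmp/delta/cases")

# Schema evolution: append rows that carry an extra column.
df2 = df.withColumn("agency", F.lit("DOT"))
(df2.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("/tmp/delta/cases"))

# Time travel: read the table as it looked at version 0.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/cases")
v0.show(5)
```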
Week 3 — Advanced SQL on Databricks for Government
- Advanced SQL constructs and window functions at scale
- Query optimization, explain plans, and cost-aware patterns
- Materialized views, caching, and performance tuning
- Hands-on lab: optimizing analytical queries on large datasets (see the sketch below)
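A compact example of the window-function and plan-inspection material; the permits data here is invented for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-lab").getOrCreate()

rows = [("DOT", 1, "2024-01-02"), ("DOT", 2, "2024-03-09"), ("EPA", 3, "2024-02-17")]
spark.createDataFrame(rows, ["agency", "permit_id", "issued_date"]) \
     .createOrReplaceTempView("permits")

# Rank each agency's permits by recency, then inspect the physical
# plan before running the same pattern at scale.
ranked = spark.sql("""
    SELECT agency, permit_id, issued_date,
           ROW_NUMBER() OVER (PARTITION BY agency ORDER BY issued_date DESC) AS rn
    FROM permits
""")
ranked.explain()
ranked.show()
```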
Week 4 — Databricks Certified Developer for Apache Spark (Prep) for Government
- Deep dive into Spark architecture, RDDs, DataFrames, and Datasets
- Key Spark transformations and actions; performance considerations (see the sketch after this list)
- Basics of Spark Streaming and Structured Streaming patterns
- Practice exam exercises and hands-on test problems
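One sketch of the lazy-transformation versus action distinction the certification exam leans on; the numbers are arbitrary:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spark-prep").getOrCreate()

df = spark.range(1_000_000).withColumn("bucket", F.col("id") % 10)

# Transformations are lazy: this line builds a plan, nothing runs yet.
agg = df.groupBy("bucket").count()

# Actions trigger execution. collect()/show() pull results to the
# driver, so reserve them for small outputs.
print(agg.count())  # action: number of result rows (10)
agg.show()          # action: materializes and prints the aggregate
```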
Week 5 — Introduction to Data Modeling for Government
- Concepts: dimensional modeling, star schema design, and normalization
- Lakehouse modeling versus traditional warehouse approaches
- Design patterns for analytics-ready datasets
- Hands-on lab: building consumption-ready tables and views (see the sketch below)
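A minimal star-schema sketch: one fact table joined to one dimension to produce a consumption-ready view; all names and values are invented:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("modeling-lab").getOrCreate()

fact_claims = spark.createDataFrame(
    [(1, 10, 250.0), (2, 20, 125.5)],
    ["claim_id", "agency_key", "amount"])
dim_agency = spark.createDataFrame(
    [(10, "DOT", "Federal"), (20, "EPA", "Federal")],
    ["agency_key", "agency_name", "level"])

# Denormalize fact + dimension into an analytics-ready view.
consumption = (fact_claims
    .join(dim_agency, "agency_key")
    .select("claim_id", "agency_name", "level", "amount"))
consumption.createOrReplaceTempView("claims_by_agency")
```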
Week 6 — Introduction to Import Tools & Data Ingestion Automation for Government
- Connectors and ingestion tools for Databricks (AWS Glue, Azure Data Factory, Kafka)
- Stream ingestion patterns and micro-batch designs
- Data validation, quality checks, and schema enforcement
- Hands-on lab: building resilient ingestion pipelines (see the sketch below)
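A sketch of schema enforcement plus a simple quality gate, assuming PySpark; the schema, path, and rules are illustrative:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("ingest-lab").getOrCreate()

# Enforce an explicit schema instead of inferring it; FAILFAST makes
# malformed records raise an error instead of passing through silently.
schema = StructType([
    StructField("record_id", StringType(), False),
    StructField("amount", DoubleType(), True),
])
raw = (spark.read
       .schema(schema)
       .option("mode", "FAILFAST")
       .json("s3a://example-landing-bucket/raw/"))  # placeholder path

# Quality gate: records that parse but break business rules are
# split out for quarantine rather than silently dropped.
valid = raw.filter(F.col("record_id").isNotNull() & (F.col("amount") >= 0))
quarantine = raw.subtract(valid)
```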
Week 7 — Introduction to Git Flow and CI/CD for Data Engineering for Government
- Git Flow branching strategies and repository organization
- CI/CD pipelines for notebooks, jobs, and infrastructure as code
- Testing, linting, and deployment automation for data code (a unit-test sketch follows this list)
- Hands-on lab: implementing Git-based workflow and automated job deployment
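CI/CD for data code is mostly pipeline configuration, but the testable core looks like this: a plain pytest unit test over a hypothetical transformation function, runnable in any CI system:

```python
# tests/test_transforms.py
from pyspark.sql import SparkSession, functions as F


def add_fiscal_year(df):
    """Hypothetical transformation under test."""
    return df.withColumn("fiscal_year", F.year("issued_date"))


def test_add_fiscal_year():
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = (spark.createDataFrame([("2024-03-09",)], ["issued_date"])
          .withColumn("issued_date", F.to_date("issued_date")))
    assert add_fiscal_year(df).first()["fiscal_year"] == 2024
```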
Week 8 — Databricks Certified Data Engineer Associate (Prep) & Data Engineering Patterns for Government
- Certification topics review and practical exercises
- Architectural patterns: bronze/silver/gold, CDC, slowly changing dimensions
- Operational patterns: monitoring, alerting, and lineage
- Hands-on lab: end-to-end pipeline applying engineering patterns (see the sketch below)
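A condensed bronze/silver/gold sketch; the paths and column names are placeholders and assume bronze data was landed earlier in the course:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion").getOrCreate()

# Bronze: raw landing data, kept as-is.
bronze = spark.read.format("delta").load("/tmp/delta/bronze/events")

# Silver: deduplicated, typed, and filtered.
silver = (bronze
    .dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .filter(F.col("event_id").isNotNull()))
silver.write.format("delta").mode("overwrite").save("/tmp/delta/silver/events")

# Gold: business-level aggregate ready for BI tools.
gold = silver.groupBy(F.to_date("event_ts").alias("event_date")).count()
gold.write.format("delta").mode("overwrite").save("/tmp/delta/gold/daily_events")
```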
Week 9 — Introduction to Airflow and Astronomer; Scripting for Government
- Airflow concepts: DAGs, tasks, operators, and scheduling
- Astronomer platform overview and orchestration best practices
- Scripting for automation: Python scripting patterns for data tasks
- Hands-on lab: orchestrating Databricks jobs with Airflow DAGs (see the sketch below)
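A minimal DAG sketch using the Airflow Databricks provider (the `schedule` argument assumes Airflow 2.4+); the job ID and connection name are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="daily_databricks_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Trigger an existing Databricks job by ID; the connection
    # "databricks_default" must be configured in Airflow.
    run_job = DatabricksRunNowOperator(
        task_id="run_ingest_job",
        databricks_conn_id="databricks_default",
        job_id=12345,  # placeholder job ID
    )
```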
Week 10 — Data Visualization, Tableau, and Customized Final Project for Government
- Connecting Tableau to Databricks and best practices for BI layers
- Dashboard design principles and performance-aware visualizations
- Capstone: customized final project scoping, implementation, and presentation
- Final presentations, peer review, and instructor feedback
Summary and Next Steps for Government
Requirements
- An understanding of fundamental SQL and data concepts
- Experience with programming in Python or Scala
- Familiarity with cloud services and virtual environments
Audience for Government
- Data engineers at all career stages, including those new to the field
- ETL/BI developers and analytics professionals responsible for data integration and business intelligence
- Data platform and DevOps teams tasked with supporting and maintaining data pipelines