Course Outline

Week 1 — Introduction to Data Engineering for Government

  • Fundamentals of data engineering and modern data stacks for government use
  • Data ingestion patterns and sources relevant to public sector operations
  • Batch vs. streaming processing: trade-offs and typical application scenarios in government contexts
  • Hands-on lab: ingesting sample data into cloud storage (a minimal ingestion sketch follows below)
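
A minimal sketch of this lab's ingestion step in PySpark; the bucket name, file path, and columns are illustrative assumptions, not part of the course materials:

    from pyspark.sql import SparkSession

    # Assumed: a Spark session (on Databricks one is provided as `spark`).
    spark = SparkSession.builder.appName("sample-ingestion").getOrCreate()

    # Hypothetical landing file; replace with your agency's own source.
    raw = (
        spark.read
        .option("header", "true")        # first row holds column names
        .option("inferSchema", "true")   # let Spark guess types for the sample
        .csv("s3://example-gov-bucket/landing/permits.csv")
    )

    # Persist the batch to cloud storage as Parquet for downstream use.
    raw.write.mode("overwrite").parquet("s3://example-gov-bucket/bronze/permits/")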

Week 2 — Databricks Lakehouse Foundation Badge for Government

  • Fundamentals of the Databricks platform and workspace navigation for government users
  • Core concepts of Delta Lake: ACID transactions, time travel, and schema evolution
  • Workspace security, access controls, and Unity Catalog basics for government data management
  • Hands-on lab: creating and managing Delta tables for government data initiatives (see the sketch below)
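
A short sketch of the Delta Lake concepts above, run as Spark SQL from Python on Databricks; the table name and columns are illustrative assumptions:

    # Create a managed Delta table (Delta is the default table format on Databricks).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS permits_delta (
            permit_id BIGINT,
            status    STRING,
            issued    DATE
        ) USING DELTA
    """)

    # Writes are ACID: readers never observe a partially applied insert.
    spark.sql("INSERT INTO permits_delta VALUES (1, 'OPEN', DATE'2024-01-15')")

    # Time travel: query the table as it existed at an earlier version.
    previous = spark.sql("SELECT * FROM permits_delta VERSION AS OF 0")

    # Schema evolution: allow new columns to be merged in on write.
    spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")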

Week 3 — Advanced SQL on Databricks for Government

  • Advanced SQL constructs and window functions at scale on government datasets
  • Query optimization techniques, explain plans, and cost-aware query patterns for government analytics
  • Materialized views, caching strategies, and performance tuning for large-scale government data
  • Hands-on lab: optimizing analytical queries on large government datasets (a window-function example follows below)
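
An illustrative window-function query plus an explain plan, reusing the hypothetical permits_delta table from the Week 2 sketch:

    # Rank permits within each status group by issue date.
    ranked = spark.sql("""
        SELECT
            permit_id,
            status,
            issued,
            ROW_NUMBER() OVER (PARTITION BY status ORDER BY issued DESC) AS rn
        FROM permits_delta
    """)

    # Inspect the physical plan to spot shuffles and exchange operators.
    ranked.explain(mode="formatted")

    # Keep only the most recent permit per status group.
    latest = ranked.where("rn = 1")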

Week 4 — Databricks Certified Developer for Apache Spark (Prep) for Government

  • Deep dive into Spark architecture, RDDs, DataFrames, and Datasets, with applications in government data processing
  • Key Spark transformations and actions, along with performance considerations for government use cases
  • Basics of Spark Streaming and Structured Streaming patterns for real-time government data analysis
  • Practice exam exercises and hands-on problems to prepare for the certification exam (a transformations sketch follows below)
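
A brief sketch contrasting Spark's lazy transformations with eager actions, using a small in-memory DataFrame with illustrative columns:

    from pyspark.sql import functions as F

    df = spark.createDataFrame(
        [(1, "OPEN", 120.0), (2, "CLOSED", 75.5), (3, "OPEN", 300.0)],
        ["case_id", "status", "fee"],
    )

    # Transformations are lazy: this only builds a query plan.
    open_cases = (
        df.filter(F.col("status") == "OPEN")
          .withColumn("fee_with_tax", F.col("fee") * 1.07)
    )

    # Actions trigger execution of the plan.
    print(open_cases.count())   # action: runs a job
    open_cases.show()           # action: runs again unless the result is cached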

Week 5 — Introduction to Data Modeling for Government

  • Fundamental concepts of dimensional modeling, star/snowflake schema design, and normalization for government data
  • Comparison between lakehouse modeling and traditional warehouse approaches in a governmental setting
  • Design patterns for creating analytics-ready datasets to support government decision-making
  • Hands-on lab: building consumption-ready tables and views for government reporting and analysis (see the star-schema sketch below)
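
A minimal star-schema sketch in Spark SQL; the dimension, fact, and view names are illustrative assumptions:

    # Dimension table describing agencies.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS dim_agency (
            agency_key  BIGINT,
            agency_name STRING,
            region      STRING
        ) USING DELTA
    """)

    # Fact table: one row per service request, keyed to the dimension.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS fact_service_request (
            request_id BIGINT,
            agency_key BIGINT,   -- foreign key into dim_agency
            requested  DATE,
            cost       DOUBLE
        ) USING DELTA
    """)

    # Consumption-ready view joining fact to dimension for reporting.
    spark.sql("""
        CREATE OR REPLACE VIEW vw_requests_by_region AS
        SELECT d.region, COUNT(*) AS requests, SUM(f.cost) AS total_cost
        FROM fact_service_request f
        JOIN dim_agency d ON f.agency_key = d.agency_key
        GROUP BY d.region
    """)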

Week 6 — Introduction to Import Tools & Data Ingestion Automation for Government

  • Connectors and ingestion tools for Databricks, including AWS Glue, Azure Data Factory, and Kafka, with a focus on government data sources
  • Stream ingestion patterns and micro-batch designs suitable for continuous government data streams
  • Data validation techniques, quality checks, and schema enforcement methods to ensure reliable government data integrity
  • Hands-on lab: building resilient ingestion pipelines for government data workflows (an Auto Loader sketch follows below)
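
One common Databricks option for resilient stream ingestion is Auto Loader (the cloudFiles source) with an enforced schema; it is used here purely as an illustration, and the paths and table name are assumptions:

    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, LongType

    # Enforce an explicit schema rather than inferring it from the stream.
    schema = StructType([
        StructField("request_id", LongType(), False),
        StructField("agency", StringType(), True),
        StructField("payload", StringType(), True),
    ])

    stream = (
        spark.readStream.format("cloudFiles")        # Databricks Auto Loader
        .option("cloudFiles.format", "json")
        .schema(schema)
        .load("s3://example-gov-bucket/landing/requests/")
    )

    # Basic quality check: keep only rows with a primary key present.
    valid = stream.where(F.col("request_id").isNotNull())

    (valid.writeStream
        .option("checkpointLocation", "s3://example-gov-bucket/_chk/requests/")
        .trigger(availableNow=True)                  # process as micro-batches, then stop
        .toTable("bronze_requests"))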

Week 7 — Introduction to Git Flow and CI/CD for Data Engineering in Government

  • Git Flow branching strategies and repository organization practices for government data projects
  • CI/CD pipelines for notebooks, jobs, and infrastructure as code, tailored for government deployment environments
  • Testing, linting, and deployment automation techniques to enhance the reliability of government data engineering processes
  • Hands-on lab: implementing Git-based workflows and automated job deployments (a CI test sketch follows below)
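
A minimal pytest sketch for the testing stage of a CI pipeline, assuming the transformation under test lives in an importable module rather than a notebook (the function itself is hypothetical):

    # test_cleaning.py -- executed by CI before any deployment step.
    import pytest
    from pyspark.sql import SparkSession

    # Hypothetical function under test: drops rows with null IDs.
    def drop_null_ids(df):
        return df.where(df["request_id"].isNotNull())

    @pytest.fixture(scope="session")
    def spark():
        return (SparkSession.builder
                .master("local[1]")
                .appName("ci-tests")
                .getOrCreate())

    def test_drop_null_ids(spark):
        df = spark.createDataFrame([(1, "a"), (None, "b")], ["request_id", "agency"])
        out = drop_null_ids(df)
        assert out.count() == 1   # the null-keyed row is removed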

Week 8 — Databricks Certified Data Engineer Associate (Prep) & Data Engineering Patterns for Government

  • Review of certification topics and practical exercises to prepare for the Databricks Certified Data Engineer Associate exam
  • Architectural patterns such as bronze/silver/gold, change data capture (CDC), and slowly changing dimensions, with applications in government data engineering
  • Operational patterns including monitoring, alerting, and lineage tracking to ensure robust government data pipelines
  • Hands-on lab: building an end-to-end pipeline using advanced data engineering patterns for government operations (a CDC merge sketch follows below)
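
A sketch of one CDC step in a bronze/silver/gold pipeline: upserting a staged change batch into the silver table with Delta's MERGE. The table names and the op column are illustrative assumptions:

    # Apply the latest change records from bronze onto the silver table.
    spark.sql("""
        MERGE INTO silver_requests AS t
        USING bronze_requests_changes AS s      -- staged CDC batch
        ON t.request_id = s.request_id
        WHEN MATCHED AND s.op = 'DELETE' THEN DELETE
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED AND s.op != 'DELETE' THEN INSERT *
    """)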

Week 9 — Introduction to Airflow and Astronomer; Scripting for Government

  • Airflow concepts, including DAGs, tasks, operators, and scheduling, with a focus on government workflows
  • Overview of the Astronomer platform and best practices for orchestration in government data environments
  • Scripting techniques for automation, including Python scripting patterns for government data tasks
  • Hands-on lab: orchestrating Databricks jobs with Airflow DAGs (a minimal DAG sketch follows below)
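
A minimal Airflow DAG sketch that triggers a Databricks notebook run; it assumes Airflow 2.4+, the apache-airflow-providers-databricks package, and a configured databricks_default connection, and all names below are illustrative:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.databricks.operators.databricks import (
        DatabricksSubmitRunOperator,
    )

    with DAG(
        dag_id="nightly_permits_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule="0 2 * * *",          # run nightly at 02:00
        catchup=False,
    ) as dag:
        ingest = DatabricksSubmitRunOperator(
            task_id="run_ingestion_notebook",
            databricks_conn_id="databricks_default",
            new_cluster={
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            notebook_task={"notebook_path": "/Repos/etl/ingest_permits"},
        )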

Week 10 — Data Visualization, Tableau, and Customized Final Project for Government

  • Connecting Tableau to Databricks and best practices for building BI layers in government settings (a pre-aggregation sketch follows after this list)
  • Dashboard design principles and performance-aware visualizations tailored for government reporting needs
  • Capstone project: scoping, implementing, and presenting a customized final project relevant to government data initiatives
  • Final presentations, peer reviews, and instructor feedback sessions
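
A small sketch of a performance-aware BI layer: pre-aggregating into a gold table so Tableau reads a compact summary over the Databricks connector instead of scanning raw fact rows live. It reuses the hypothetical Week 5 tables:

    spark.sql("""
        CREATE OR REPLACE TABLE gold_requests_monthly USING DELTA AS
        SELECT
            date_trunc('month', f.requested) AS month,
            d.region,
            COUNT(*)    AS requests,
            SUM(f.cost) AS total_cost
        FROM fact_service_request f
        JOIN dim_agency d ON f.agency_key = d.agency_key
        GROUP BY date_trunc('month', f.requested), d.region
    """)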

Summary and Next Steps

Requirements

  • An understanding of fundamental SQL and data concepts
  • Experience with programming in Python or Scala
  • Familiarity with cloud services and virtual environments

Audience

  • Data engineers, both aspiring and practicing professionals
  • ETL/BI developers and analytics engineers
  • Data platform and DevOps teams responsible for supporting data pipelines

Duration

  • 350 hours
