Course Outline

Module 1: Informatica Data Engineering Management Overview

  • Data Engineering concepts
  • Features of Data Engineering Management
  • Benefits of Data Engineering Management
  • Architecture of Data Engineering Management
  • Developer tasks in Data Engineering Management
  • New features in Data Engineering Integration 10.4

Module 2: Ingestion and Extraction in Hadoop

  • Integrating Data Engineering Integration with a Hadoop cluster
  • Hadoop file systems
  • Data ingestion to HDFS and Hive using Sqoop
  • Mass ingestion to HDFS and Hive – Initial load
  • Mass ingestion to HDFS and Hive – Incremental load
  • Lab: Configuring Sqoop for processing data between an Oracle database and HDFS
  • Lab: Configuring Sqoop for processing data between an Oracle database and Hive
  • Lab: Creating mapping specifications using the Mass Ingestion Service

Module 3: Native and Hadoop Engine Strategy

  • Data Engineering Integration engine strategy
  • Hive Engine architecture
  • MapReduce
  • Tez
  • Spark architecture
  • Blaze architecture
  • Lab: Executing a mapping in Spark mode
  • Lab: Connecting to a deployed application

Module 4: Data Engineering Development Process

  • Advanced transformations in Data Engineering Integration, including Python and Update Strategy
  • Hive ACID use case
  • Stateful computing and windowing
  • Lab: Creating a reusable Python transformation
  • Lab: Creating an active Python transformation
  • Lab: Performing Hive upserts
  • Lab: Using the LEAD windowing function
  • Lab: Using the LAG windowing function
  • Lab: Creating a macro transformation

Module 5: Complex File Processing

  • Data Engineering file formats – Avro, Parquet, JSON
  • Complex file data types – Structs, Arrays, Maps
  • Complex configuration, operators, and functions
  • Lab: Converting flat file data objects to an Avro file
  • Lab: Using complex data types – Arrays, Structs, and Maps in a mapping

Module 6: Hierarchical Data Processing

  • Hierarchical data processing
  • Flattening hierarchical data
  • Dynamic flattening with schema changes
  • Hierarchical data processing with schema changes
  • Complex configuration, operators, and functions for hierarchical data
  • Dynamic ports
  • Dynamic input rules
  • Lab: Flattening a complex port in a mapping
  • Lab: Building dynamic mappings using dynamic ports
  • Lab: Building dynamic mappings using input rules
  • Lab: Performing dynamic flattening of complex ports
  • Lab: Parsing hierarchical data on the Spark Engine

Module 7: Mapping Optimization and Performance Tuning

  • Validation environments
  • Execution environment
  • Mapping optimization
  • Mapping recommendations and insights
  • Scheduling, queuing, and node labeling
  • Mapping audits
  • Lab: Implementing recommendations
  • Lab: Implementing insights
  • Lab: Implementing mapping audits

Module 8: Monitoring Logs and Troubleshooting in Hadoop

  • Hadoop environment logs
  • Spark Engine monitoring
  • Blaze Engine monitoring
  • REST Operations Hub
  • Log Aggregator
  • Troubleshooting
  • Lab: Monitoring mappings using REST Operations Hub
  • Lab: Viewing and analyzing logs using the Log Aggregator

Module 9: Intelligent Structure Model

  • Overview of intelligent structure discovery
  • Intelligent structure model
  • Lab: Using an intelligent structure model in a mapping

Module 10: Databricks Overview

  • Databricks overview
  • Steps to configure Databricks
  • Databricks clusters
  • Notebooks, jobs, and data management in Databricks
  • Delta Lake

Module 11: Databricks Integration

  • Databricks integration
  • Components of the Informatica and Databricks environments
  • Run-time process on the Databricks Spark Engine
  • Databricks integration task flow
  • Prerequisites for Databricks integration
  • Cluster workflows
  • Demo: Setting up a Databricks connection
  • Demo: Running a mapping with the Databricks Spark Engine

Requirements

Developer Tool for Government Big Data Professionals

This tool is designed to support big data developers working for government agencies. It provides data integration, scalable processing, and secure data handling, which help big data professionals streamline their workflows and maintain governance and accountability in data-driven government projects.

 21 Hours
