Course Outline
Module 1: Informatica Data Engineering Management Overview for Government
- Data engineering concepts
- Features of data engineering management
- Benefits of data engineering management
- Data engineering management architecture
- Developer tasks in data engineering management
- New features in Data Engineering Integration 10.4
Module 2: Ingestion and Extraction in Hadoop for Government
- Integrating Data Engineering Integration with a Hadoop cluster
- Hadoop file systems
- Data ingestion to HDFS and Hive using Sqoop
- Mass ingestion to HDFS and Hive – Initial load
- Mass ingestion to HDFS and Hive – Incremental load
- Lab: Configure Sqoop for processing data between Oracle and HDFS
- Lab: Configure Sqoop for processing data between an Oracle database and Hive
- Lab: Create mapping specifications using the Mass Ingestion Service
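The initial-load versus incremental-load distinction above can be sketched in plain Python. This is a conceptual illustration only; the names (`incremental_load`, `last_loaded_id`) are hypothetical and are not part of the Informatica Mass Ingestion Service API.

```python
# Hedged sketch of incremental ingestion using a high-water-mark column.
# All names here are illustrative, not Informatica or Sqoop APIs.

def incremental_load(source_rows, last_loaded_id):
    """Return only rows above the stored high-water mark,
    plus the new mark to persist for the next run."""
    new_rows = [r for r in source_rows if r["id"] > last_loaded_id]
    new_mark = max((r["id"] for r in new_rows), default=last_loaded_id)
    return new_rows, new_mark

# Initial load: the mark starts below all ids, so every row is ingested.
rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
loaded, mark = incremental_load(rows, last_loaded_id=0)

# Incremental load: the next run sees only rows added since the last mark.
rows.append({"id": 4, "v": "d"})
delta, mark = incremental_load(rows, last_loaded_id=mark)
```

Sqoop's `--incremental append` mode follows the same pattern: it records the highest value of a check column and imports only rows beyond it on the next run.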
Module 3: Native and Hadoop Engine Strategy for Government
- Data engineering integration engine strategy
- Hive engine architecture
- MapReduce
- Tez
- Spark architecture
- Blaze architecture
- Lab: Execute a mapping in Spark mode
- Lab: Connect to a deployed application
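The MapReduce model named above can be sketched in a few lines of plain Python. A real Hadoop job distributes the map, shuffle, and reduce phases across a cluster; this sketch only shows the three phases on local data.

```python
# Hedged sketch of the MapReduce phases (map -> shuffle -> reduce)
# as a local word count; not a Hadoop or Informatica API.
from collections import defaultdict

def map_phase(lines):
    # map: emit a (word, 1) pair for every word
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle(pairs):
    # shuffle: group emitted values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # reduce: sum the counts for each word
    return {word: sum(ones) for word, ones in groups.items()}

counts = reduce_phase(shuffle(map_phase(["spark blaze spark", "tez spark"])))
# counts == {"spark": 3, "blaze": 1, "tez": 1}
```

Tez and Spark keep the same map/shuffle/reduce ideas but build more general execution graphs and avoid writing intermediate results to disk between stages.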
Module 4: Data Engineering Development Process for Government
- Advanced transformations in data engineering integration using Python and update strategy
- Hive ACID use case
- Stateful computing and windowing
- Lab: Create a reusable Python transformation
- Lab: Create an active Python transformation
- Lab: Perform Hive upserts
- Lab: Use the LEAD windowing function
- Lab: Use the LAG windowing function
- Lab: Create a macro transformation
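The LEAD and LAG labs above use SQL windowing functions on the Spark/Hive engines; their behavior can be modeled on a plain Python list as a hedged sketch (the `lead`/`lag` helpers here are illustrative, not the engine functions themselves).

```python
# Hedged model of LEAD/LAG windowing semantics over an ordered window.

def lead(values, offset=1, default=None):
    # LEAD: value from a following row in the window, or default past the end
    return [values[i + offset] if i + offset < len(values) else default
            for i in range(len(values))]

def lag(values, offset=1, default=None):
    # LAG: value from a preceding row in the window, or default before the start
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]

sales = [100, 120, 90, 150]
assert lead(sales) == [120, 90, 150, None]
assert lag(sales)  == [None, 100, 120, 90]
```

Stateful computations like running differences follow directly, e.g. pairing each row with `lag` of the previous row's value.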
Module 5: Complex File Processing for Government
- Data engineering file formats – Avro, Parquet, JSON
- Complex data types – Structs, Arrays, Maps
- Complex configuration, operators, and functions
- Lab: Convert flat file data objects to an Avro file
- Lab: Use complex data types – Arrays, Structs, and Maps in a mapping
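The three complex types listed above have close Python analogues, which can serve as a hedged mental model: a struct is a record with fixed named fields, an array is an ordered list, and a map holds arbitrary key/value pairs. The field names below are made up for illustration.

```python
# Hedged illustration of struct, array, and map using Python analogues.
order = {
    "customer": {"name": "Ada", "city": "Canberra"},  # struct: named fields
    "items": ["router", "switch"],                    # array: ordered elements
    "tags": {"priority": "high", "channel": "web"},   # map: arbitrary keys
}

# Access patterns mirror the complex operators used on complex ports:
assert order["customer"]["name"] == "Ada"    # struct field access
assert order["items"][0] == "router"         # array subscript
assert order["tags"]["priority"] == "high"   # map lookup
```

In Avro and Parquet these types are part of the file schema, which is why the conversion lab pairs flat-file data with an Avro target.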
Module 6: Hierarchical Data Processing for Government
- Hierarchical data processing
- Flatten hierarchical data
- Dynamic flattening with schema changes
- Hierarchical data processing with schema changes
- Complex configuration, operators, and functions
- Dynamic ports
- Dynamic input rules
- Lab: Flatten a complex port in a mapping
- Lab: Build dynamic mappings using dynamic ports
- Lab: Build dynamic mappings using input rules
- Lab: Perform dynamic flattening of complex ports
- Lab: Parse hierarchical data on the Spark engine
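The flattening topic above can be sketched as exploding a nested record into one flat row per child element, which is conceptually what happens when a complex port is flattened on the Spark engine. The record shape below is a made-up example.

```python
# Hedged sketch of flattening hierarchical data: one output row per array element.

def flatten(records):
    rows = []
    for rec in records:
        for item in rec["items"]:          # explode the nested array
            rows.append({"order_id": rec["order_id"],
                         "sku": item["sku"],
                         "qty": item["qty"]})
    return rows

nested = [{"order_id": 1,
           "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]}]
flat = flatten(nested)
# flat == [{"order_id": 1, "sku": "A", "qty": 2},
#          {"order_id": 1, "sku": "B", "qty": 1}]
```

Dynamic flattening extends this idea: when the schema changes (new fields appear in the struct), dynamic ports and input rules let the mapping propagate the new columns without manual edits.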
Module 7: Mapping Optimization and Performance Tuning for Government
- Validation environments
- Execution environment
- Mapping optimization
- Mapping recommendations and insights
- Scheduling, queuing, and node labeling
- Mapping audits
- Lab: Implement a recommendation
- Lab: Implement an insight
- Lab: Implement mapping audits
Module 8: Monitoring Logs and Troubleshooting in Hadoop for Government
- Hadoop environment logs
- Spark engine monitoring
- Blaze engine monitoring
- REST operations hub
- Log aggregator
- Troubleshooting
- Lab: Monitor mappings using the REST operations hub
- Lab: View and analyze logs using the log aggregator
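Conceptually, a log aggregator collects log lines from several engine sources and summarizes them in one place. The sketch below is a hedged, toy illustration of that idea in plain Python; it is not the Informatica log aggregator, and it assumes a simple `LEVEL message` line format.

```python
# Hedged sketch of log aggregation: merge lines from several sources
# and count entries by severity level (assumes "LEVEL message" lines).
from collections import Counter

def aggregate(log_sources):
    levels = Counter()
    for lines in log_sources:
        for line in lines:
            level = line.split()[0]   # first token is the severity
            levels[level] += 1
    return levels

spark_log = ["INFO started", "WARN slow stage", "ERROR task failed"]
blaze_log = ["INFO grid ready", "ERROR segment lost"]
summary = aggregate([spark_log, blaze_log])
# summary["ERROR"] == 2
```

A summary like this is the starting point for troubleshooting: a spike in ERROR entries on one engine narrows down where to read the detailed logs.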
Module 9: Intelligent Structure Model for Government
- Overview of intelligent structure discovery
- Intelligent structure model
- Lab: Use an intelligent structure model in a mapping
Module 10: Databricks Overview for Government
- Databricks overview
- Steps to configure Databricks
- Databricks clusters
- Notebooks, jobs, and data
- Delta Lake
Module 11: Databricks Integration for Government
- Databricks integration
- Components of the Informatica and Databricks environments
- Run-time process on the Databricks Spark engine
- Databricks integration task flow
- Prerequisites for Databricks integration
- Cluster workflows
- Demo: Set up a Databricks connection
- Demo: Run a mapping with the Databricks Spark engine
Requirements
Category: Development Tools for Government Big Data Professionals
Duration: 21 Hours