Course Outline

  1. Scala Primer for Government

    • A quick introduction to Scala for government use
    • Labs: Getting acquainted with Scala for government applications
  2. Spark Basics for Government

    • Background and history of Spark for government
    • Integration of Spark with Hadoop for government workflows
    • Core concepts and architecture of Spark for government
    • Overview of the Spark ecosystem (core, SQL, MLlib, streaming) for government applications
    • Labs: Installing and running Spark in a government environment
  3. First Look at Spark for Government

    • Running Spark in local mode for government testing
    • Exploring the Spark web UI for government monitoring
    • Using the Spark shell for government data exploration
    • Analyzing datasets – part 1, tailored for government needs
    • Inspecting RDDs in a government context
    • Labs: Exploring the Spark shell for government applications
  4. RDDs for Government

    • Concepts of RDDs for government data processing
    • Partitioning strategies for government datasets
    • Operations and transformations on RDDs for government tasks
    • Types of RDDs suitable for government use cases
    • Key-Value pair RDDs for government applications
    • MapReduce operations using RDDs in a government context
    • Caching and persistence techniques for government data
    • Labs: Creating, inspecting, and caching RDDs for government projects
  5. Spark API Programming for Government

    • Introduction to the Spark API and RDD API for government developers
    • Submitting the first program to Spark in a government environment
    • Debugging and logging techniques for government applications
    • Configuration properties for government deployments
    • Labs: Programming in the Spark API, submitting jobs for government tasks
  6. Spark SQL for Government

    • SQL support in Spark for government data queries
    • DataFrames and their use with government datasets
    • Defining tables and importing datasets for government analysis
    • Querying DataFrames using SQL for government reports
    • Storage formats (JSON, Parquet) for government data storage
    • Labs: Creating and querying DataFrames, evaluating data formats for government needs
  7. MLlib for Government

    • Introduction to MLlib for government machine learning tasks
    • Overview of MLlib algorithms suitable for government applications
    • Labs: Writing MLlib applications for government projects
  8. GraphX for Government

    • Overview of the GraphX library for government data analysis
    • GraphX APIs and their application in government workflows
    • Labs: Processing graph data using Spark for government tasks
  9. Spark Streaming for Government

    • Overview of streaming capabilities in Spark for government real-time data processing
    • Evaluating streaming platforms suitable for government use
    • Streaming operations and their application in government scenarios
    • Sliding window operations for government data streams
    • Labs: Writing Spark Streaming applications for government tasks
  10. Spark and Hadoop for Government

    • Introduction to Hadoop (HDFS, YARN) for government data storage and processing
    • Architecture of Hadoop + Spark integration for government workflows
    • Running Spark on Hadoop YARN in a government environment
    • Processing HDFS files using Spark for government applications
  11. Spark Performance and Tuning for Government

    • Broadcast variables for optimizing government data processing
    • Accumulators for tracking government data metrics
    • Memory management and caching strategies for government applications
  12. Spark Operations for Government

    • Deploying Spark in a production environment for government use
    • Sample deployment templates for government IT teams
    • Configuration settings optimized for government requirements
    • Monitoring tools and techniques for government deployments
    • Troubleshooting common issues in government Spark environments

Requirements

PRE-REQUISITES

Familiarity with Java, Scala, or Python (labs are conducted in Scala and Python)
Basic understanding of a Linux development environment, including command-line navigation and file editing with tools such as vi or nano

Duration: 21 hours
