Course Outline

  • Section 1: Introduction to Big Data & NoSQL for Government
    • Big Data ecosystem
    • NoSQL overview
    • CAP theorem
    • When NoSQL is appropriate for government use cases
    • Columnar storage
    • HBase and NoSQL applications in the public sector
  • Section 2: HBase Introduction for Government
    • Concepts and Design
    • Architecture (HMaster and RegionServers)
    • Data integrity in government systems
    • HBase ecosystem for government applications
    • Lab: Exploring HBase for government use cases
  • Section 3: HBase Data Model for Government
    • Namespaces, Tables, and Regions
    • Rows, columns, column families, versions
    • HBase Shell and Admin commands for government data management
    • Lab: Using HBase Shell in a government context
  • Section 4: Accessing HBase Using Java API for Government
    • Introduction to Java API for government applications
    • Read / Write path for secure data handling
    • Time Series data in government systems
    • Scans and efficient data retrieval
    • MapReduce for large-scale data processing in government
    • Filters for data segmentation
    • Counters for tracking and auditing
    • Coprocessors for enhanced functionality
    • Labs (multiple): Implementing time series, MapReduce, filters, and counters using HBase Java API in government scenarios
  • Section 5: HBase Schema Design: Group Session for Government
    • Students are presented with real-world use cases for government
    • Students work in groups to develop design solutions for government data needs
    • Discuss, critique, and learn from multiple designs tailored for government requirements
    • Labs: Implement a scenario in HBase for government use
  • Section 6: HBase Internals for Government
    • Understanding HBase under the hood for government applications
    • MemStore, HFile, and WAL (Write-Ahead Log) for data integrity in government systems
    • HDFS storage for scalable and secure government data management
    • Compactions for optimizing performance in government environments
    • Splits for efficient data distribution
    • Bloom Filters for fast data lookups in government databases
    • Caches for improved query performance
    • Diagnostics for maintaining and troubleshooting government HBase systems
  • Section 7: HBase Ecosystem for Government
    • Developing applications using HBase for government projects
    • Interacting with other Hadoop stack components (MapReduce, Pig, Hive) in government workflows
    • Frameworks around HBase for government data processing
    • Advanced concepts (coprocessors) for enhanced government applications
    • Labs: Writing HBase applications for government use cases
  • Section 8: Monitoring and Best Practices for Government
    • Monitoring tools and practices for government HBase systems
    • Optimizing HBase performance in government environments
    • HBase in the cloud for government data management
    • Real-world use cases of HBase in government agencies
    • Labs: Checking HBase vitals for government systems

Requirements

  • Comfortable with the Java programming language
  • Able to work in a Java development environment, including navigating the Linux command line and editing files with vi or nano
  • A Java Integrated Development Environment (IDE) such as Eclipse or IntelliJ

Lab Environment:

A functional HBase cluster will be provided for government students. Students will need an SSH client and a web browser to access the cluster.

Zero Installation: There is no requirement for students to install HBase software on their personal machines!

Duration: 21 Hours
