Course Outline
- Section 1: Introduction to Big Data & NoSQL for Government
- Big Data ecosystem
- NoSQL overview
- CAP theorem
- When NoSQL is appropriate for government use cases
- Columnar storage
- HBase and NoSQL applications in the public sector
- Section 2: HBase Introduction for Government
- Concepts and Design
- Architecture (HMaster and RegionServer)
- Data integrity in government systems
- HBase ecosystem for government applications
- Lab: Exploring HBase for government use cases
- Section 3: HBase Data Model for Government
- Namespaces, Tables, and Regions
- Rows, columns, column families, versions
- HBase Shell and Admin commands for government data management
- Lab: Using HBase Shell in a government context
- Section 4: Accessing HBase Using Java API for Government
- Introduction to Java API for government applications
- Read / Write path for secure data handling
- Time Series data in government systems
- Scans and efficient data retrieval
- MapReduce for large-scale data processing in government
- Filters for data segmentation
- Counters for tracking and auditing
- Co-processors for enhanced functionality
- Labs (multiple): Implementing time series, MapReduce, filters, and counters using HBase Java API in government scenarios
- Section 5: HBase Schema Design: Group Session for Government
- Students are presented with real-world use cases for government
- Students work in groups to develop design solutions for government data needs
- Discuss, critique, and learn from multiple designs tailored for government requirements
- Labs: Implement a scenario in HBase for government use
- Section 6: HBase Internals for Government
- Understanding HBase under the hood for government applications
- MemStore, HFile, and WAL (Write-Ahead Log) for data integrity in government systems
- HDFS storage for scalable and secure government data management
- Compactions for optimizing performance in government environments
- Splits for efficient data distribution
- Bloom Filters for fast data lookups in government databases
- Caches for improved query performance
- Diagnostics for maintaining and troubleshooting government HBase systems
- Section 7: HBase Ecosystem for Government
- Developing applications using HBase for government projects
- Interacting with other Hadoop stack components (MapReduce, Pig, Hive) in government workflows
- Frameworks around HBase for government data processing
- Advanced concepts (co-processors) for enhanced government applications
- Labs: Writing HBase applications for government use cases
- Section 8: Monitoring and Best Practices for Government
- Monitoring tools and practices for government HBase systems
- Optimizing HBase performance in government environments
- HBase in the cloud for government data management
- Real-world use cases of HBase in government agencies
- Labs: Checking HBase vitals for government systems
Requirements
- Comfortable with the Java programming language
- Able to work in a Java development environment, including navigating the Linux command line and editing files with vi or nano
- A Java Integrated Development Environment (IDE) such as Eclipse or IntelliJ
Lab Environment:
A functional HBase cluster will be provided for government students. Students will need an SSH client and a web browser to access the cluster.
Zero Installation: Students do not need to install HBase software on their personal machines.