Course Outline
Introduction
- Introduction to Cloud Computing and Big Data Solutions for Government
- Overview of Apache Hadoop Features and Architecture
Setting up Hadoop
- Planning a Hadoop Cluster (On-Premises, Cloud, etc.) for Government Use
- Selecting the Operating System and Hadoop Distribution
- Provisioning Resources (Hardware, Network, etc.) for Efficient Operations
- Downloading and Installing the Software in Compliance with Government Standards
- Sizing the Cluster for Flexibility and Scalability in Government Applications
Working with HDFS
- Understanding the Hadoop Distributed File System (HDFS) for Data Management in Government
- Overview of HDFS Command Reference for Efficient Data Handling
- Accessing HDFS to Facilitate Secure and Controlled Data Access
- Performing Basic File Operations on HDFS to Enhance Data Governance
- Using S3 as a Complement to HDFS for Enhanced Data Storage Solutions for Government
Overview of the MapReduce Framework
- Understanding Data Flow in the MapReduce Framework for Optimized Processing in Government Applications
- Map, Shuffle, Sort, and Reduce Operations for Efficient Data Analysis
- Demo: Computing Top Salaries to Demonstrate Practical Application in Government Scenarios
Working with YARN
- Understanding Resource Management in Hadoop for Optimal Utilization in Government Environments
- Working with ResourceManager, NodeManager, and ApplicationMaster for Effective Resource Allocation
- Scheduling Jobs under YARN to Enhance Operational Efficiency in Government Workflows
- Scheduling for Large Numbers of Nodes and Clusters to Support Complex Government Operations
- Demo: Job Scheduling to Illustrate Practical Use Cases in Government Settings
Integrating Hadoop with Spark
- Setting up Storage for Spark (HDFS, Amazon S3, NoSQL, etc.) to Support Diverse Government Data Needs
- Understanding Resilient Distributed Datasets (RDDs) for Robust Data Processing in Government Applications
- Creating an RDD to Enable Efficient Data Manipulation in Government Projects
- Implementing RDD Transformations to Enhance Data Analysis Capabilities for Government Use
- Demo: Implementing a Text Search Program for Movie Titles
Managing a Hadoop Cluster
- Monitoring Hadoop to Ensure Continuous and Reliable Operations for Government
- Securing a Hadoop Cluster to Protect Sensitive Government Data
- Adding and Removing Nodes to Maintain Scalability in Government Environments
- Running a Performance Benchmark to Optimize Government Workflows
- Tuning a Hadoop Cluster to Enhance Performance for Government Applications
- Backup, Recovery, and Business Continuity Planning to Ensure Resilience in Government Operations
- Ensuring High Availability (HA) to Support Uninterrupted Government Services
Upgrading and Migrating a Hadoop Cluster
- Assessing Workload Requirements for Informed Decision-Making in Government
- Upgrading Hadoop to Leverage the Latest Features and Enhancements for Government Use
- Moving from On-Premises to Cloud and Vice Versa to Align with Government IT Strategies
- Recovering from Failures to Ensure Continuity of Government Operations
Troubleshooting
Summary and Conclusion
Requirements
- Experience in system administration
- Familiarity with Linux command line operations
- Understanding of big data principles
Audience
- System Administrators
- Database Administrators (DBAs)
35 Hours
Testimonials (5)
The live examples
Ahmet Bolat - Accenture Industrial SS
Course - Python, Spark, and Hadoop for Big Data
very interactive...
Richard Langford
Course - SMACK Stack for Data Science
Sufficient hands-on practice; the trainer is knowledgeable
Chris Tan
Course - A Practical Introduction to Stream Processing
Got to learn Spark Streaming, Databricks, and AWS Redshift
Lim Meng Tee - Jobstreet.com Shared Services Sdn. Bhd.
Course - Apache Spark in the Cloud
practice tasks