Course Outline
Big Data Overview:
- What is Big Data
- Why Big Data is gaining popularity
- Big Data Case Studies
- Big Data Characteristics
- Solutions to work on Big Data for government
Hadoop & Its Components:
- What is Hadoop and what are its components
- Hadoop architecture and the characteristics of the data it can handle and process
- A brief history of Hadoop, the companies using it, and why they adopted it
- Hadoop framework and its components, explained in detail
- What is HDFS, and how reads and writes to the Hadoop Distributed File System work
- How to set up a Hadoop cluster in different modes: stand-alone, pseudo-distributed, and multi-node
(This includes setting up a Hadoop cluster in VirtualBox/KVM/VMware, the network configuration that needs careful attention, running the Hadoop daemons, and testing the cluster.)
- What is the MapReduce framework and how it works
- Running MapReduce jobs on a Hadoop cluster
- Understanding replication, mirroring, and rack awareness in the context of Hadoop clusters
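To give a flavour of the MapReduce topic listed above, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps for a word count. It runs on a single machine and does not use Hadoop; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

On a real cluster the map and reduce steps run in parallel across nodes and the framework handles the shuffle; this sketch only shows the data flow between the phases.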
Hadoop Cluster Planning:
- How to plan your Hadoop cluster
- Understanding hardware and software to plan your Hadoop cluster
- Understanding workloads and planning the cluster to avoid failures and perform optimally
What is MapR and Why MapR:
- Overview of MapR and its architecture
- Understanding how the MapR Control System, MapR volumes, snapshots, and mirrors work
- Planning a cluster in the context of MapR
- Comparison of MapR with other distributions and Apache Hadoop
- MapR installation and cluster deployment
Cluster Setup & Administration:
- Managing services, nodes, snapshots, mirror volumes, and remote clusters
- Understanding and managing nodes
- Understanding Hadoop components and installing them alongside MapR services
- Accessing data on the cluster, including via NFS; managing services and nodes
- Managing data by using volumes
- Managing users and groups
- Managing and assigning roles to nodes
- Commissioning and decommissioning nodes
- Cluster administration and performance monitoring
- Configuring, analyzing, and monitoring metrics to track performance
- Configuring and administering MapR security
- Understanding and working with M7—native storage for MapR tables
- Cluster configuration and tuning for optimal performance
Cluster Upgrade and Integration with Other Setups:
- Upgrading the MapR software version, and the types of upgrade available
- Configuring a MapR cluster to access an HDFS cluster
- Setting up a MapR cluster on Amazon Elastic MapReduce
All the above topics include demonstrations and practice sessions for learners to have hands-on experience with the technology.
Requirements
- Fundamental understanding of the Linux file system
- Basic proficiency in Java
- Familiarity with Apache Hadoop (highly recommended for government)
28 Hours
Testimonials (1)
practical things of doing, also theory was served good by Ajay