We provide a 24/7 support help desk where you can reach dedicated engineers by phone, live chat, email and video call.
You will get access to 120 byte-sized lessons featuring detailed, interactive explanations of Big Data & Hadoop.
You will get unlimited access to our real-time, multi-cluster, cloud-based labs to carry out your practicals & projects.
Every week, you will get a live instructor masterclass featuring either a concept discussion or a live project implementation.
We have a repository of 7+ projects featuring domains such as Retail, Finance, Healthcare, Banking and Entertainment.
The certification you get at the end of the course is recognized by all our 50+ corporate partners. The course is also aligned with the Cloudera & Hortonworks certifications.
What is the course curriculum?
You will get the entire Hadoop 2.x ecosystem broken down into step-by-step lessons, making it very easy for you to grasp all the concepts & components.
Introduction to Big Data
• Big Data – Use Cases, Challenges & Apache Hadoop Versions, Clusters, Vendors
• HDFS & MapReduce
• Initial Configuration
• Single-Node & Multi-Node Setup
• JobTracker & TaskTracker Setup
• HDFS Read/Write, HDFS Replication
• NameNode & DataNode Communications
• Data Loading Workflows
• Logging, Configuring Rack Awareness, Hadoop Balancer
• Storage, User Setup & Quotas
• Planning Hadoop Cluster: Sizing, Hardware, Network & Software Considerations
Distribution and Scheduler
• FIFO, Fair Scheduler
• Configuring the Schedulers and Running MapReduce Jobs
Backup and Recovery
• Backup of Metadata, Configuration Files
• Copy Data across clusters using distcp, Fsimage Viewer
• Setup Trash for Hadoop
Upgrade and Security
• Configuring iptables, Kerberos, Snapshots, WebHDFS & Proxies, Central Repository
• Using Clush to Execute Commands and Alternatives to HDFS
Sqoop and Flume
• Importing Data from MySQL into Hadoop using Sqoop
• Installing and Configuring Flume
• Data Streaming
Pig and Hive
• Introduction to Hive & Pig – Setup & Scripting
• Job Scheduling Protocols
• Streaming Operations
• YARN Framework: Execution & Workflow
• Configuring High Availability Using QJM
• HDFS Federation – Introduction, Activation and Applications
• Utilizing HDFS Federation
NoSQL in Big Data
• CAP Theorem, NoSQL – Flavors & Types
• HBase – Introduction, Architecture & Use Cases
• Introduction to CDH
• Open Source Projects & Integrations
• Complete CDH Project Lifecycle
• Introduction to MapR
• Open Source Projects & Integrations
• Complete MapR Project Lifecycle
Performance Tuning and Benchmarking
• Introduction to Performance & Reporting. Success Benchmarks
• Troubleshooting. Standard Protocols
• Hadoop Deployment Pipeline
• Checklist & Considerations
• Complete Deployment Lifecycle
• Service Oriented Monitoring
• HDFS Metrics, MapReduce Counters, YARN Metrics, ZooKeeper Metrics