This Hadoop training course is designed to provide the knowledge and skills needed to become a successful Hadoop developer. The course covers in-depth concepts such as the Hadoop Distributed File System (HDFS), single- and multi-node Hadoop clusters, Hadoop 2.0, Flume, Sqoop, MapReduce, Pig, Hive, HBase, ZooKeeper, and Oozie.
Hadoop Course Modules
INTRODUCTION
- What is Hadoop?
- History of Hadoop
- Who is behind Hadoop?
- What Hadoop is good for, and why
- Building Blocks – the Hadoop Ecosystem
HDFS
- HDFS Overview and Architecture
- HDFS Installation
- Configuring HDFS
- Hadoop File System Shell
- Interacting with HDFS
- File System Java API
- HDFS Permissions and Security
- Additional HDFS Tasks
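As a preview of the "Interacting with HDFS" topic: besides the `hdfs dfs` shell, HDFS exposes a REST gateway called WebHDFS. The sketch below only builds the request URL a client would use; the hostname, port, and user name are placeholder assumptions (the NameNode HTTP port is typically 50070 on Hadoop 2.x and 9870 on 3.x).

```python
# Minimal sketch of how a client addresses HDFS files over WebHDFS,
# Hadoop's REST gateway. Host, port, and user are placeholders.

def webhdfs_url(host, port, path, op, user="hadoop"):
    """Build the WebHDFS v1 URL for an HDFS path and operation."""
    return f"http://{host}:{port}/webhdfs/v1{path}?op={op}&user.name={user}"

# OPEN reads a file; LISTSTATUS lists a directory; MKDIRS creates one.
print(webhdfs_url("namenode.example.com", 50070,
                  "/user/hadoop/input.txt", "OPEN"))
```

Fetching such a URL with any HTTP client performs the operation, which is what makes HDFS scriptable from outside the Java API.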
MAPREDUCE
- MapReduce Overview and Architecture
- Installation
- Job Configuration
- Developing MapReduce Jobs
- Input and Output Formats
- Job Submission
- Practicing MapReduce Programs (at least 10 MapReduce algorithms)
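As a cluster-free preview of the programming model covered in this module, here is a sketch of the map → shuffle → reduce flow using the classic word count. It is plain Python written for illustration, not Hadoop API code.

```python
# Conceptual sketch of the three MapReduce phases: map, shuffle/sort,
# and reduce, applied to word counting.
from collections import defaultdict

def map_phase(line):
    # Emit (word, 1) pairs, like a WordCount Mapper.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Sum the counts for one key, like a WordCount Reducer.
    return key, sum(values)

lines = ["hadoop stores data", "hadoop processes data"]
pairs = [kv for line in lines for kv in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

In real Hadoop, the map and reduce functions run in parallel across the cluster and the shuffle happens over the network, but the data flow is exactly this.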
Getting Started With Eclipse IDE
- Configuring the Hadoop API in the Eclipse IDE
- Connecting the Eclipse IDE to HDFS
Advanced MapReduce Features
- Custom Data Types
- Input Formats
- Output Formats
- Partitioning Data
- Reporting Custom Metrics
- Distributing Auxiliary Job Data
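To illustrate the "Partitioning Data" topic: between map and reduce, a partitioner decides which reducer receives each key. The sketch below mimics the idea behind Hadoop's default HashPartitioner (partition = hash of key, masked non-negative, modulo the number of reduce tasks); the hash function here is a Python analogue of Java's `String.hashCode()`, used only for illustration.

```python
# Sketch of hash partitioning: route each key to one of N reducers
# deterministically, so all values for a key meet at the same reducer.

def get_partition(key: str, num_reduce_tasks: int) -> int:
    # Deterministic hash (analogue of Java's String.hashCode):
    # h = c0*31^(n-1) + c1*31^(n-2) + ... + c(n-1)
    h = sum(ord(c) * 31 ** i for i, c in enumerate(reversed(key)))
    # Mask to non-negative, then take the partition index.
    return (h & 0x7FFFFFFF) % num_reduce_tasks

for k in ["hadoop", "pig", "hive", "hbase"]:
    print(k, "-> reducer", get_partition(k, 3))
```

Writing a custom partitioner is how you control data skew, for example by routing related keys to the same reducer for a join.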
PIG
- Pig Overview
- Pig with HDFS
Hive
- Hive Overview
- Installation
- Analyzing Unstructured Data with Hive
- Analyzing Semi-structured Data with Hive
HBase
- HBase Overview and Architecture
- HBase Installation
- CRUD Operations
- Scanning and Batching
- HBase Key Design
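A taste of the "HBase Key Design" topic: because HBase stores rows sorted by key, monotonically increasing keys (such as timestamps) send all writes to a single region server. A common design is to prefix ("salt") the key with a bucket derived from the key itself. This is a plain-Python sketch of the idea; the bucket count of 4 is an arbitrary assumption.

```python
# Sketch of row-key salting: spread sequential keys across buckets so
# writes do not hotspot one region server.

NUM_BUCKETS = 4  # arbitrary choice for illustration

def salted_key(row_key: str) -> str:
    # Derive a stable bucket from the key so readers can recompute it.
    bucket = sum(row_key.encode()) % NUM_BUCKETS
    return f"{bucket:02d}-{row_key}"

for ts in ["20240101-0001", "20240101-0002", "20240101-0003"]:
    print(salted_key(ts))
```

The trade-off, covered in this module, is that range scans must now fan out across all buckets and merge the results.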
ZooKeeper
Sqoop
CONFIGURATION
- Basic Setup
- Important Directories
- Selecting Machines
- Cluster Configurations
- Small Clusters: 2-10 Nodes
- Medium Clusters: 10-40 Nodes
- Large Clusters: Multiple Racks
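The "Basic Setup" topic centers on a handful of XML configuration files. As a minimal illustration, a `core-site.xml` entry like the following tells every node and client where the NameNode lives; the hostname and port are placeholders for your own cluster.

```xml
<!-- core-site.xml fragment: the default file system for the cluster.
     namenode.example.com:9000 is a placeholder. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>
```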
Putting it all together
- Distributed installations