Learn Big Data - Course Introduction
About The Course
About Hadoop Developer Course
Course Curriculum
Introduction to Big Data and Hadoop
Unit 1: Introduction to Big Data
Challenges associated with Big Data
Requirement of Big Data
Introduction to Hadoop
Utility of Hadoop
Basic Components of Hadoop
Introduction to Hadoop Ecosystems
The Five Daemons of Hadoop
Overview of Hadoop Implementation
Unit 2: Hadoop Distributed File Systems
Understanding HDFS Architecture
Understand how the NameNode maintains the file system metadata
Understand how data is stored in HDFS
Understand the relationship between the NameNode and DataNodes
Understand replication factor, under-replication, and over-replication
NameNode High Availability
Rack Awareness
Introduction and hands-on to Hadoop fs commands
File Reading and Writing over HDFS
Unit 3: MapReduce
Overview of MapReduce
Understand the architecture of MapReduce
Understand the various phases of MapReduce through Word Count Paradigm
Programming algorithms
Combiners and Counters
MapReduce Types and Formats
Understand the architecture of YARN
Understand the components of the YARN ResourceManager
Demonstrate the relationship between NodeManagers and ApplicationMasters
Demonstrate the relationship between the ResourceManager and ApplicationMasters
Unit 4: Hadoop Installation
Understand minimum hardware and software requirements
CDH4 Hadoop Installation
Understand complete deployment layout
Understand how to configure and manage different services
Commands Practice
Understand different configuration parameters
Practicing HDFS Commands
Introduction to Hue (GUI Console)
Unit 5: Pig
Introduction to Pig
Pig Components
Pig Data Types
Functions and Macros
Writing Scripts in Pig
Pig Execution
Unit 6: Hive
Introduction to Hive
Hive Architecture
HiveQL
Partitioning and Bucketing
Managed tables and External tables
Joining
Functions
SerDes
Unit 7: HBase
Introduction to NoSQL Databases
Why Use NoSQL Databases
CAP Theorem
HBase Concept
HBase Architecture
HBase Data Model, Bloom Filter, Block Cache, Schema Design
Unit 8: Impala
Real Time Analysis
Impala Overview
Impala Architecture
Unit 9: Basic Administrative Activities
Service Management
Cluster Installation
Commissioning and Decommissioning Nodes
Adding more clusters
Checking Alerts and Events
Unit 10: Sqoop
Introduction to Sqoop
Sqoop Tools and Commands
Data Importing to HDFS, Hive, HBase, etc.
Data Exporting and Connectors
Unit 11: Flume and Chukwa
Introduction to Flume
Source, Channel, and Sink
Data Analysis using Chukwa
Practical Use Cases
Unit 12: Oozie
Data Level Scheduling
Time Bound Scheduling