Hadoop Admin Online Training Course

£81.33

Price is per online user of this course.

EU buyers pay £97.60 including VAT.

This online course prepares you for the Cloudera Certified Administrator for Apache Hadoop (CCAH) exam.

Hadoop Administration training for System Administrators is designed for technical operations personnel who install and maintain production Hadoop clusters. It covers Hadoop architecture and its components, the installation process, and the monitoring and troubleshooting of complex Hadoop issues. The training focuses on practical hands-on exercises and encourages open discussion of how enterprises use Hadoop to handle large data sets.

Who should take this course?

System Administrators and Support Engineers who will maintain and troubleshoot Hadoop clusters in production or development environments.

See below for further information.

 

Sold By: Intellipaat Software

Description

Key Highlights!

  1. 26 hours of video, covering every topic in detail
  2. 24/7 access to the training material and videos
  3. 3 months' access to the enrolled courses
  4. Designed to prepare you for the Cloudera certification exam
  5. 200-question quiz simulator for interview preparation and Cloudera certification
  6. Project work at the end of the training that shows how Hadoop is used in industry, covering design, architecture and data movement

Course Objectives

  1. Understand Hadoop's main components and architecture
  2. Be comfortable working with the Hadoop Distributed File System
  3. Understand the MapReduce abstraction and how it works
  4. Plan your Hadoop cluster
  5. Deploy and administer a Hadoop cluster
  6. Optimise a Hadoop cluster for the best performance based on specific job requirements
  7. Monitor a Hadoop cluster and execute routine administration procedures
  8. Deal with Hadoop component failures and recoveries
  9. Get familiar with related Hadoop projects: HBase, Hive and Pig
  10. Know the best practices for using Hadoop in an enterprise

Course Outline

Introduction to Hadoop

  • The volume of data being processed today
  • What Hadoop is and why it is important
  • Hadoop comparison with traditional systems
  • Hadoop history
  • Hadoop main components and architecture

Hadoop Distributed File System (HDFS)

  • HDFS overview and design
  • HDFS architecture
  • HDFS file storage
  • Component failures and recoveries
  • Block placement
  • Balancing the Hadoop cluster

Planning your Hadoop cluster

  • Planning a Hadoop cluster and its capacity
  • Hadoop software and hardware configuration
  • HDFS block replication and rack awareness
  • Network topology for Hadoop cluster
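As a rough illustration of the capacity-planning arithmetic this module deals with, the sketch below estimates block counts, raw storage and node counts. It assumes a replication factor of 3 and a 128 MiB block size (the HDFS defaults in Hadoop 2 and later; older releases defaulted to 64 MiB). The 25% headroom for temporary data and the 48 TB of disk per node are hypothetical planning figures, not values taken from the course.

```python
import math

MIB = 1024 ** 2


def blocks_for_file(file_size_bytes: int, block_size_bytes: int = 128 * MIB) -> int:
    """Number of HDFS blocks a file occupies (the last block may be partial)."""
    # Integer ceiling division; an empty file occupies no blocks.
    return -(-file_size_bytes // block_size_bytes)


def raw_storage_tb(data_tb: float, replication: int = 3, headroom: float = 0.25) -> float:
    """Raw disk needed for data_tb of user data.

    Every block is stored `replication` times. `headroom` is an assumed
    fraction of disk kept free for temporary and intermediate data.
    """
    return data_tb * replication / (1 - headroom)


def datanodes_needed(raw_tb: float, disk_per_node_tb: float) -> int:
    """Smallest number of DataNodes whose combined disks hold raw_tb."""
    return math.ceil(raw_tb / disk_per_node_tb)


if __name__ == "__main__":
    print(blocks_for_file(1024 ** 3))   # blocks in a 1 GiB file
    raw = raw_storage_tb(100)           # 100 TB of user data
    print(raw, datanodes_needed(raw, disk_per_node_tb=48))
```

Real sizing also depends on data growth, compression and workload, so treat these numbers as a starting point only.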

Hadoop Deployment

  • Different Hadoop deployment types
  • Hadoop distribution options
  • Hadoop competitors
  • Hadoop installation procedure
  • Distributed cluster architecture

Lab: Hadoop Installation

Working with HDFS

  • Ways of accessing data in HDFS
  • Common HDFS operations and commands
  • Different HDFS commands
  • Internals of a file read in HDFS
  • Data copying with ‘distcp’

Lab: Working with HDFS

MapReduce Abstraction

  • What MapReduce is and why it is popular
  • The big picture of MapReduce
  • MapReduce process and terminology
  • MapReduce component failures and recoveries
  • Working with MapReduce

Hadoop Cluster Configuration

  • Hadoop configuration overview and important configuration files
  • Configuration parameters and values
  • HDFS parameters
  • MapReduce parameters
  • Hadoop environment setup
  • ‘Include’ and ‘Exclude’ configuration files

Lab: MapReduce Performance Tuning

Hadoop Administration and Maintenance

  • Namenode/Datanode directory structures and files
  • File system image and Edit log
  • The Checkpoint Procedure
  • Namenode failure and recovery procedure
  • Safe Mode
  • Metadata and Data backup
  • Potential problems and solutions / what to look for
  • Adding and removing nodes

Lab: MapReduce File system Recovery

Hadoop Monitoring and Troubleshooting

  • Best practices of monitoring a Hadoop cluster
  • Using logs and stack traces for monitoring and troubleshooting
  • Using open-source tools to monitor Hadoop cluster

Job Scheduling

  • How to schedule Hadoop Jobs on the same cluster
  • Default Hadoop FIFO Scheduler
  • Fair Scheduler and its configuration

Hadoop Multi-Node Cluster Setup and Running MapReduce Jobs on Amazon EC2

  • Hadoop multi-node cluster setup using Amazon EC2: creating a 4-node cluster

Testimonial

“I found the tutorial very useful, with step-by-step explanations. The course material manages to cover a large spectrum of aspects, both architectural and operational, in a dense short lesson.” Jack Mike

Discounts

A 10% discount applies to purchases of 10 to 19 users, 15% to purchases of 20 to 29 users, and 20% to purchases of 30 or more users. For larger purchases, or to find out whether larger discounts are available on mixed course purchases, please phone a customer advisor on 0844 854 9218.

Discounts are calculated during the checkout process.
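As a rough guide to what the checkout might show, the sketch below applies the discount tiers above to the £81.33 per-user price. It assumes the percentage applies to the whole order, that prices exclude VAT, and that totals are rounded to the nearest penny. The function names are illustrative, and the actual checkout calculation may differ.

```python
from decimal import Decimal, ROUND_HALF_UP

UNIT_PRICE = Decimal("81.33")  # per-user price quoted on this page, excluding VAT


def discount_rate(users: int) -> Decimal:
    """Discount tier for an order of `users` licences."""
    if users >= 30:
        return Decimal("0.20")
    if users >= 20:
        return Decimal("0.15")
    if users >= 10:
        return Decimal("0.10")
    return Decimal("0")


def order_total(users: int) -> Decimal:
    """Estimated order total in pounds, assuming the discount covers the whole order."""
    if users < 1:
        raise ValueError("an order needs at least one user")
    total = UNIT_PRICE * users * (1 - discount_rate(users))
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for users in (1, 10, 20, 30):
        print(users, order_total(users))
```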

Payment Options

The most straightforward method of payment is to select the number of users you require and add the product to your shopping cart by selecting Add to Cart. You will then be able to pay using most credit and debit cards or a PayPal account. If you would like to pay by BACS transfer or by invoice, please contact a customer advisor on 0844 854 9218 or email [email protected]

 
