
Hadoop for Systems Administrators

At a glance
Course number: H6C60S
Length: 3 days
Delivery method:
  • Remotely assisted instructional learning (RAIL)
  • Instructor-led training (ILT)
  • Onsite dedicated training (OST)
Price: USD $2,400 / CAD $2,400

Course overview

This course covers the essentials of deploying and managing an Apache™ Hadoop® cluster. The course is lab-intensive: each participant creates their own Hadoop cluster using either the CDH (Cloudera's Distribution, including Apache Hadoop) or the Hortonworks Data Platform stack. Core Hadoop services are explored in depth, with emphasis on troubleshooting and recovering from common cluster failures. The fundamentals of related services such as Ambari, ZooKeeper, Pig, Hive, HBase, Sqoop, Flume, and Oozie are also covered. The course is approximately 60% lecture and 40% labs.


Prerequisites

  • Participants should be comfortable with Linux commands and have some systems administration experience; no previous Hadoop experience is required

Audience

  • Systems Administrators who will be responsible for managing and administering Hadoop clusters

Benefits to you

  • Hands-on coverage of Hadoop gives systems administrators the skills they need to properly deploy, manage, and maintain Hadoop clusters

Course outline

"Big Data", the big picture

  • Distributed processing and data locality
  • Hadoop core architecture:
    • HDFS
    • MapReduce
  • Hadoop distributions:
    • Cloudera, MapR, Hortonworks
  • Hadoop ecosystem:
    • Ambari, Pig, Hive, ZooKeeper, HBase, Sqoop, Flume, Oozie

HDFS

  • Design and operation:
    • NameNode and Secondary NameNode
    • Metadata storage and updates
    • Data storage and flows
  • Planning and creation:
    • Performance considerations
    • Loading and managing data files
    • Tuning and maintenance

MapReduce

  • History and theory of operation
  • Apache Hadoop implementation:
    • JobTracker
    • TaskTrackers
    • DataNodes

Authentication and Authorization

  • Hadoop users
  • HDFS:
    • File ownership and permissions
    • Quotas
    • Kerberos

MapReduce schedulers

  • FIFO
  • Fair
  • Capacity

Cluster monitoring and maintenance

  • Adding and removing DataNodes
  • Monitoring and balancing HDFS storage
  • JobTracker and TaskTracker status

Troubleshooting

  • Slow or long-running jobs
  • Location and use of Hadoop job and log files
  • NameNode failure and recovery
  • Cluster re-balancing
  • Others

Appendix


H6C60S - A.00
 
© 2014 Hewlett-Packard Development Company, L.P.