Beginning Hadoop: Understanding Hadoop Scalability and Performance of Clusters

Author: Gurmukh Singh
English, Paperback – 7 Apr 2016

There are many challenges in setting up and scaling distributed frameworks like Hadoop.

Although Hadoop is an open source product with plenty of good documentation and books, it is difficult for an individual or an enterprise to define use cases or working models with a clear understanding of how Hadoop works and how to tune it for optimal performance.

Pro Hadoop Administration by Gurmukh Singh, a Hadoop specialist and infrastructure architect, takes a deep dive into configuring Hadoop services and integrating them with various tools and frameworks. The book covers the process from scratch through to building a production-level Hadoop cluster, with best practices and optimal performance.

You will learn:

  • Use cases and a set of recipes for the Hadoop production environment.
  • Everything from compiling Hadoop to setting up a cluster with highly available services.
  • Its integration with various tools like Sqoop, Flume, HBase, Hive and many more.
  • Performance tuning and cluster planning.
  • Hadoop security, such as Kerberos and encryption, along with security at the OS and network level.


Price: 18960 lei

New

Express points: 284

Estimated price in foreign currency:
3629 3769$ 3014£

Book not yet published


Orders by phone: 021 569.72.76

Specifications

ISBN-13: 9781484213544
ISBN-10: 1484213548
Pages: 250
Illustrations: Bibliography
Dimensions: 178 x 254 mm
Edition: 1st ed. 2016
Publisher: Apress
Series: Apress
Place of publication: Berkeley, CA, United States

Target audience

Popular/general

Contents

Chapter 1: Introduction to Distributed Computing and Hadoop
Chapter Goal: Discuss distributed computing, its challenges, and some of the existing platforms on the market.
Sub-Topics:
  1. Introduction to distributed computing.
  2. Introduction to Hadoop and its history.
  3. Current Hadoop distributions and their market.
  4. Problem statement: why Hadoop is needed, and its use cases.
Chapter 2: Hadoop as a Platform
Chapter Goal: Install and configure basic Hadoop services.
Sub-Topics:
  1. Hadoop Compilation.
  2. Hadoop Installation and its various modes.
  3. Hadoop Daemons Configuration.
  4. Basic Hadoop Configuration Parameters.
Chapter 3: Hadoop Daemons and Services
Chapter Goal: Set up the Hadoop Secondary NameNode and understand its purpose.
Sub-Topics:
  1. Secondary NameNode Setup.
  2. NameNode Metadata Concepts.
  3. Recovery from Secondary NameNode.
  4. Failover to Secondary.
Chapter 4: Concepts of Redundancy and Data Access
Chapter Goal: Understand how replication works and set up rack awareness.
Sub-Topics:
  1. Configure Hadoop Clients.
  2. Multi-A record Clients.
  4. Disk Storage Concepts.
Chapter 4: Hadoop Administration Tasks
Chapter Goal: Learn about the day-to-day activities performed by Hadoop admins, such as cluster balancing, handling disk space issues, etc.
  1. Hadoop Cluster balancing.
  2. Cluster Membership.
  3. Adding Disks to Data Nodes
  4. NameNode Metadata Operations
  5. Trash Space Configuration
Chapter 5: User Quota Management and Schedulers
Chapter Goal: Learn about user management, space quotas, etc.
  1. User Management.
  2. Space Quota Management.
  3. Job Schedulers
  4. Queue setup and management.
  5. ACLs for Queues.
Chapter 6: Hadoop 2.x and YARN Configuration
Chapter Goal: Learn about Hadoop 2.x features and the YARN framework.
  1. Introduction to Hadoop 2.x.
  2. Hadoop 2.x features.
  3. Introduction to YARN and its components.
  4. Installation and Configuration of YARN.
  5. Setup Job Queues.
Chapter 7: Making Services Highly Available
Chapter Goal: Learn about high availability for the NameNode and Resource Manager.
  1. NameNode HA using Shared Storage.
  2. NameNode HA using QJM.
  3. Resource Manager HA.
Chapter 8: Data Ingestion using Hive, Pig, Sqoop, Flume
Chapter Goal: Learn about Hive, Sqoop, and Flume for data ingestion.
  1. Introduction to Data Ingestion.
  2. Introduction to Pig and its installation.
  3. Introduction to Hive and its installation.
  4. Introduction to Sqoop and its installation.
  5. Introduction to Flume and its installation.
  6. Examples for Data Ingestion.
Chapter 9: Database for the Hadoop Platform
Chapter Goal: Learn about HBase and its integration with other Hadoop tools.
  1. Introduction to HBase.
  2. HBase Installation.
  3. HBase with Hive.
  4. Importing Data from HBase.
  5. Phoenix with HBase.
Chapter 10: Hadoop Security
Chapter Goal: Learn about securing Hadoop with Kerberos and other tools.
  1. Introduction to Kerberos.
  2. Installing and Configuring Kerberos.
  3. Hadoop with Kerberos.
  4. Securing Hadoop at the OS level.
Chapter 11: Hadoop Cluster Planning and Performance
Chapter Goal: Learn about cluster planning, performance tuning, and other tools.
  1. Hadoop Cluster Planning.
  2. MapReduce Phases.
  3. Performance tuning.
  4. Hadoop Benchmarking.
Chapter 12: Hadoop Advanced Features
Chapter Goal: Learn about Federation, NFS, and WebHDFS.
  1. Introduction to Hadoop Federation.
  2. Set up Hadoop Federation.
  3. Introduction to Snapshots and their configuration.
  4. NFSv3 configuration for Hadoop.
  5. WebHDFS for REST API calls.

Biographical note

Gurmukh has over 12 years of experience in infrastructure design, scalability, performance tuning and distributed computing. He recently co-founded "Netxillon Technologies", which provides Big Data consultancy services and training. Prior to starting his venture, he worked with companies like Yahoo, HP and JP Morgan on various technologies, including OpenVMS, the Yahoo Web Analytics platform and many network and security appliances. His areas of expertise include scalability and performance engineering, databases, optimising Hadoop infrastructure, proxy appliances and automation. In addition, he mentors and trains engineers on the latest technologies and market trends.

Features

  • Practical use cases from the Hadoop production environment.
  • Easy to follow, with examples and code snippets.
  • Hands-on manual with the right mix of concepts.
  • Best practices for production clusters.