The Next Generation of
 Hadoop Map-Reduce
        Sharad Agarwal
     sharadag@yahoo-inc.com
        sharad@apache.org
About Me

   Hadoop Committer and PMC member
   Architect at Yahoo!
Hadoop Map-Reduce Today
   JobTracker
    - Manages cluster resources
      and job scheduling
   TaskTracker
    - Per-node agent
    - Manages tasks
Current Limitations
   Scalability
    - Maximum Cluster size – 4,000 nodes
    - Maximum concurrent tasks – 40,000
    - Coarse synchronization in JobTracker
   Single point of failure
    - Failure kills all queued and running jobs
    - Jobs need to be re-submitted by users
   Restart is very tricky due to complex state
   Hard partition of resources into map and reduce
    slots
Current Limitations

   Lacks support for alternate paradigms
    - Iterative applications implemented using Map-Reduce
      are 10x slower.
    - Example: K-Means, PageRank
   Lack of wire-compatible protocols
    - Client and cluster must be of same version
    - Applications and workflows cannot migrate to
      different clusters
Next Generation Map-Reduce Requirements
   Reliability
   Availability
   Scalability - Clusters of 6,000 machines
    - Each machine with 16 cores, 48G RAM, 24TB disks
    - 100,000 concurrent tasks
    - 10,000 concurrent jobs
   Wire Compatibility
   Agility & Evolution – Ability for customers to
    control upgrades to the grid software stack.
Next Generation Map-Reduce – Design Centre

   Split up the two major functions of JobTracker
    - Cluster resource management
    - Application life-cycle management
   Map-Reduce becomes user-land library
Architecture
   Resource Manager
    - Global resource scheduler
    - Hierarchical queues
   Node Manager
    - Per-machine agent
    - Manages the life-cycle of containers
    - Container resource monitoring
   Application Master
    - Per-application
    - Manages application scheduling and task execution
    - E.g. Map-Reduce Application Master
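The division of labour among the three components can be illustrated with a minimal Python sketch. Every class and method name here is hypothetical and chosen for illustration; none of it is the Hadoop API. The first-fit placement and the memory-only accounting are simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine agent: tracks free memory and the containers it hosts."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id: str, mb: int) -> str:
        self.free_mb -= mb
        container_id = f"{self.name}/c{len(self.containers)}"
        self.containers.append((container_id, app_id, mb))
        return container_id


class ResourceManager:
    """Global scheduler: decides which node serves each container request."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, app_id: str, mb: int):
        # First-fit placement (an assumption); queues are not modelled here.
        for node in self.nodes:
            if node.free_mb >= mb:
                return node.launch(app_id, mb)
        return None  # no node has enough free memory


class ApplicationMaster:
    """Per-application: requests containers and tracks its own tasks."""

    def __init__(self, app_id: str, rm: ResourceManager):
        self.app_id = app_id
        self.rm = rm
        self.running = {}

    def run_tasks(self, tasks, mb_per_task: int):
        pending = []
        for task in tasks:
            container = self.rm.allocate(self.app_id, mb_per_task)
            if container is None:
                pending.append(task)
            else:
                self.running[task] = container
        return pending


nodes = [NodeManager("node1", 2048), NodeManager("node2", 1024)]
rm = ResourceManager(nodes)
am = ApplicationMaster("job-1", rm)
pending = am.run_tasks(["map-0", "map-1", "map-2", "reduce-0"], mb_per_task=1024)
print(am.running)
print(pending)
```

In this sketch the ResourceManager knows nothing about maps or reduces. Task-level bookkeeping lives entirely in the per-application master, which is the split the slide describes.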
Improvements vis-à-vis current Map-Reduce
     Scalability
      - Application life-cycle management is very
        expensive
      - Partition resource management and application
        life-cycle management
      - Application management is distributed
      - Hardware trends - Clusters currently run at 4,000
        machines
          • 6,000 machines of 2012 vintage > 12,000 machines of 2009 vintage
          • 2009: <8 cores, 16G, 4TB> vs. 2012: <16+ cores, 48/96G, 24TB>
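The hardware comparison can be checked with simple arithmetic. The sketch below multiplies the per-machine figures from the slide by the machine counts. It assumes the lower bound of the 2012 range (16 cores, 48G); the function and constant names are illustrative.

```python
# Per-machine specs taken from the slide: (cores, RAM in GB, disk in TB).
SPEC_2009 = (8, 16, 4)
SPEC_2012 = (16, 48, 24)  # assumption: lower bound of "16+ cores, 48/96G"


def cluster_totals(machines, spec):
    """Multiply per-machine resources by the machine count."""
    cores, ram_gb, disk_tb = spec
    return {
        "cores": machines * cores,
        "ram_gb": machines * ram_gb,
        "disk_tb": machines * disk_tb,
    }


old = cluster_totals(12_000, SPEC_2009)
new = cluster_totals(6_000, SPEC_2012)
print(old)  # {'cores': 96000, 'ram_gb': 192000, 'disk_tb': 48000}
print(new)  # {'cores': 96000, 'ram_gb': 288000, 'disk_tb': 144000}
```

Even at the lower bound, half as many 2012 machines match the core count of the 2009 cluster while providing 1.5x the memory and 3x the disk.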
Improvements vis-à-vis current Map-Reduce
     Availability
      - Application Master
          • Optional failover via application-specific checkpoint
          • Map-Reduce applications pick up where they left off
      - Resource Manager
          • No single point of failure - failover via ZooKeeper
          • Application Masters are restarted automatically
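Checkpoint-based failover for an Application Master might look roughly like the following Python sketch. The slides do not describe the checkpoint mechanism, so the class name, the JSON file format, and the `max_tasks` parameter (used only to simulate a crash) are all assumptions made for illustration.

```python
import json
import os
import tempfile


class CheckpointingMaster:
    """Hypothetical application master that records finished tasks on disk."""

    def __init__(self, checkpoint_path, tasks):
        self.checkpoint_path = checkpoint_path
        self.tasks = list(tasks)
        self.done = self._load()

    def _load(self):
        # A restarted master recovers the set of completed tasks, if any.
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path) as f:
                return set(json.load(f))
        return set()

    def _save(self):
        with open(self.checkpoint_path, "w") as f:
            json.dump(sorted(self.done), f)

    def run(self, max_tasks=None):
        """Run remaining tasks; max_tasks simulates a crash part-way through."""
        executed = []
        for task in self.tasks:
            if task in self.done:
                continue
            if max_tasks is not None and len(executed) >= max_tasks:
                break
            executed.append(task)  # stand-in for real task execution
            self.done.add(task)
            self._save()
        return executed


path = os.path.join(tempfile.mkdtemp(), "am-checkpoint.json")
tasks = ["map-0", "map-1", "reduce-0"]

first = CheckpointingMaster(path, tasks).run(max_tasks=2)  # "crashes" after two tasks
second = CheckpointingMaster(path, tasks).run()            # restarted master
print(first)
print(second)
```

The restarted master skips the work recorded in the checkpoint and runs only what remains, which is the "pick up where they left off" behaviour the slide refers to.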
Improvements vis-à-vis current Map-Reduce
     Wire Compatibility
      - Protocols are wire-compatible
      - Old clients can talk to new servers
      - Rolling upgrades
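One common way to let old clients talk to new servers is to ignore unknown fields and default missing ones. The Python sketch below illustrates that idea only. The slides do not name a serialization format, so JSON is used here as a stand-in, and all field and function names are hypothetical.

```python
import json

# Fields a hypothetical version-1 client knows about.
V1_FIELDS = {"job_id", "state"}


def v2_server_response():
    """A newer server adds a field ('progress') that v1 clients never saw."""
    return json.dumps({"job_id": "job-1", "state": "RUNNING", "progress": 0.5})


def v1_client_decode(payload):
    """Keep known fields and skip unknown ones instead of failing."""
    message = json.loads(payload)
    return {k: v for k, v in message.items() if k in V1_FIELDS}


def v2_server_decode_request(payload):
    """Fill in a default for a field that older clients do not send."""
    request = json.loads(payload)
    request.setdefault("priority", "NORMAL")
    return request


status = v1_client_decode(v2_server_response())
request = v2_server_decode_request(json.dumps({"job_id": "job-1"}))
print(status)
print(request)
```

With both sides tolerant in this way, servers can be upgraded one at a time while older clients keep working, which is what makes rolling upgrades possible.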
Improvements vis-à-vis current Map-Reduce
     Agility / Evolution
      - Map-Reduce now becomes a user-land library
      - Multiple versions of Map-Reduce can run in the
        same cluster (à la Apache Pig)
          • Faster deployment cycles for improvements
      - Customers upgrade Map-Reduce versions on their
        schedule
Improvements vis-à-vis current Map-Reduce
     Utilization
      - Generic resource model
          •   Memory
          •   CPU
          •   Disk b/w
          •   Network b/w
      - Remove fixed partition of map and reduce slots
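The effect of removing the fixed map/reduce partition can be shown with a small Python sketch. The slot counts, memory sizes, and function names are made up for illustration, and only memory is modelled from the generic resource list.

```python
def runnable_with_slots(map_tasks, reduce_tasks, map_slots, reduce_slots):
    """Fixed partition: maps may only use map slots, reduces only reduce slots."""
    return min(map_tasks, map_slots) + min(reduce_tasks, reduce_slots)


def runnable_with_resources(task_mb, node_mb):
    """Generic model: any task fits as long as enough memory remains."""
    running, free = 0, node_mb
    for mb in task_mb:
        if mb <= free:
            free -= mb
            running += 1
    return running


# A map-only workload on one node: 4 map tasks, no reduce tasks.
slots = runnable_with_slots(map_tasks=4, reduce_tasks=0, map_slots=2, reduce_slots=2)
generic = runnable_with_resources(task_mb=[1024] * 4, node_mb=4096)
print(slots, generic)  # 2 4
```

Under the slot model the two reduce slots sit idle during a map-only phase. Under the generic model the same node runs all four tasks.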
Improvements vis-à-vis current Map-Reduce
     Support for programming paradigms other
      than Map-Reduce
      - MPI
      - Master-Worker
      - Machine Learning
      - Iterative processing
      - Enabled by allowing use of paradigm-specific
        Application Master
      - Run all on the same Hadoop cluster
Summary
   The next generation of Map-Reduce takes
    Hadoop to the next level
    -   Scale-out even further
    -   High availability
    -   Cluster Utilization
    -   Support for paradigms other than Map-Reduce
Questions?

YARN Hadoop Summit Bangalore 2011
