Hadoop YARN - Under the Hood

Sharad Agarwal
sharad@apache.org
About me
Recap: Hadoop 1.0 Map-Reduce
JobTracker
  Manages cluster resources and job scheduling
TaskTracker
  Per-node agent
  Manages tasks
YARN Architecture
[Figure: YARN architecture diagram. Components: Client, Resource Manager, Node Manager, App Mstr (Application Master), Container. Arrow legend: MapReduce Status, Job Submission, Node Status, Resource Request.]
What does the new architecture get us?
• Scale
• Compute Platform
Scale for a compute platform
• Application size
  • Number of sub-tasks
  • Application-level state
    • e.g. counters
• Number of concurrent tasks in a single cluster
Application size scaling in Hadoop 1.0
JTHeap ∝ TotalTasks, Nodes, JobCounters
Application size scaling in YARN is by architecture
Why a limitation on cluster size?
[Plot: Cluster Utilization (y-axis) versus Cluster Size (x-axis), one curve labeled Hadoop 1.0]
[Sequence diagram. Columns: JobTracker, JIP, TIP, Scheduler. A Heartbeat Request comes in and a Heartbeat Response goes out.]
• Synchronous heartbeat processing
• JobTracker global lock
JT transaction rate limit: 200 heartbeats/sec
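
As a rough illustration of why a fixed heartbeat rate caps cluster size: if the master handles heartbeats one at a time, each of N nodes can be served at most once every N / rate seconds, so the heartbeat interval (and with it the scheduling latency) grows with the cluster. The sketch below only does that arithmetic. The function name and the node counts are illustrative assumptions; the 200 heartbeats/sec figure is the one from the slide.

```python
def min_heartbeat_interval_s(nodes: int, heartbeats_per_sec: float = 200.0) -> float:
    """Smallest per-node heartbeat interval a serial master can sustain.

    Assumes every node heartbeats at the same rate and the master
    processes exactly `heartbeats_per_sec` heartbeats per second.
    """
    if nodes < 0 or heartbeats_per_sec <= 0:
        raise ValueError("nodes must be >= 0 and heartbeats_per_sec must be > 0")
    return nodes / heartbeats_per_sec


# Hypothetical cluster sizes, chosen only to show the trend.
for nodes in (1000, 2000, 4000, 8000):
    interval = min_heartbeat_interval_s(nodes)
    print(f"{nodes:>5} nodes: one heartbeat per node every {interval:.0f} s at best")
```

Under these assumptions a 4,000-node cluster cannot heartbeat more often than once every 20 seconds per node, and a free slot may sit idle for that long before the scheduler hears about it.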
Highly Concurrent Systems
  • scale much better (if done right)
  • make effective use of multi-core hardware
  • managing eventual consistency of state is hard
  • need a systematic framework to manage this
Event Model
[Diagram: an Event Queue feeds an Event Dispatcher, which sits above Component A, Component B, ... Component N]
• Mutations only via events
• Components only expose read APIs
• Use re-entrant locks
• Components follow a clear lifecycle
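
A minimal sketch of this event model, assuming a single dispatcher thread and a toy component. The class names (`Dispatcher`, `NodeCounter`) and event names are hypothetical and are not taken from the YARN code base; the sketch only shows the pattern of mutating state through queued events while exposing a read-only API.

```python
import queue
import threading
from collections import defaultdict


class Dispatcher:
    """Single-threaded event loop: pulls events off a queue and calls handlers."""

    def __init__(self):
        self._events = queue.Queue()
        self._handlers = defaultdict(list)
        self._thread = threading.Thread(target=self._run, daemon=True)

    def register(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def post(self, event_type, payload=None):
        # Callers never touch component state; they only enqueue events.
        self._events.put((event_type, payload))

    def start(self):
        self._thread.start()

    def stop(self):
        # The None sentinel is handled after all previously posted events.
        self._events.put(None)
        self._thread.join()

    def _run(self):
        while True:
            item = self._events.get()
            if item is None:
                return
            event_type, payload = item
            for handler in self._handlers[event_type]:
                handler(payload)


class NodeCounter:
    """Component: state changes only in event handlers; reads use a read API."""

    def __init__(self, dispatcher):
        self._lock = threading.RLock()  # re-entrant lock, as on the slide
        self._nodes = set()
        dispatcher.register("NODE_ADDED", self._on_node_added)
        dispatcher.register("NODE_LOST", self._on_node_lost)

    def _on_node_added(self, node_id):
        with self._lock:
            self._nodes.add(node_id)

    def _on_node_lost(self, node_id):
        with self._lock:
            self._nodes.discard(node_id)

    def active_nodes(self):
        # Read API: returns an immutable copy of the current state.
        with self._lock:
            return frozenset(self._nodes)


dispatcher = Dispatcher()
counter = NodeCounter(dispatcher)
dispatcher.start()
for event in [("NODE_ADDED", "n1"), ("NODE_ADDED", "n2"), ("NODE_LOST", "n1")]:
    dispatcher.post(*event)
dispatcher.stop()  # returns once the queued events have been handled
print(sorted(counter.active_nodes()))  # ['n2']
```

Because all writes happen on the dispatcher thread, writers never contend with each other; the lock only protects readers on other threads from seeing a half-applied update.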
Asynchronous Heartbeat Handling
[Sequence diagram. Column labels: Heartbeat Listener, Event Q, NodeManager Meta. A Heartbeat Request comes in, a "Get commands" step follows, and a Heartbeat Response goes out.]
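
One way to read this slide is that the listener does no scheduling work inside the heartbeat call: it queues the reported status for later processing and replies with whatever commands are already waiting for that node. The sketch below illustrates that reading. `HeartbeatListener`, `queue_command`, and the command strings are hypothetical names, and the demo is single-threaded; a real implementation would need to synchronize access to the pending-command table.

```python
import queue
from collections import defaultdict, deque


class HeartbeatListener:
    """Answers heartbeats without processing them inline."""

    def __init__(self):
        self.event_q = queue.Queue()        # node status events, processed later
        self._pending = defaultdict(deque)  # commands waiting, per node

    def queue_command(self, node_id, command):
        # Called by the scheduling side once it has decided something.
        self._pending[node_id].append(command)

    def heartbeat(self, node_id, status):
        # 1. Hand the status to the event queue; do not process it here.
        self.event_q.put((node_id, status))
        # 2. Reply with the commands already waiting for this node.
        commands = list(self._pending[node_id])
        self._pending[node_id].clear()
        return commands


listener = HeartbeatListener()
print(listener.heartbeat("node-1", {"free_mb": 4096}))  # []
listener.queue_command("node-1", "LAUNCH container_01")
print(listener.heartbeat("node-1", {"free_mb": 2048}))  # ['LAUNCH container_01']
print(listener.event_q.qsize())                         # 2
```

The cost of a heartbeat is then one enqueue and one table lookup, independent of how long scheduling takes.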
YARN: Better utilization, bigger cluster
[Plot: Cluster Utilization (y-axis) versus Cluster Size (x-axis), with curves labeled YARN and Hadoop 1.0]
State Management
State management in JT
• Very hard to maintain
• Debugging even harder
Complex State Management
• Lightweight state machine library
• Declarative way of specifying the state transitions
• Invalid transitions are handled automatically
• Fits nicely with the event model
• Debuggability is drastically improved: the lineage of object states can easily be determined
• Handy while recovering the state
Declarative State Machine
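
A minimal sketch of what a declarative state machine can look like, assuming a plain transition table keyed by (state, event). The state and event names are hypothetical and are not the ones used by YARN; the point is that every valid transition is declared in one place and anything else is rejected automatically.

```python
class InvalidTransition(Exception):
    """Raised when an event arrives in a state that does not declare it."""


class StateMachine:
    def __init__(self, initial, transitions):
        # transitions maps (current_state, event) -> next_state
        self.state = initial
        self._transitions = dict(transitions)
        self.history = [initial]  # lineage of states, useful when debugging

    def handle(self, event):
        key = (self.state, event)
        if key not in self._transitions:
            raise InvalidTransition(f"{event!r} is not valid in state {self.state!r}")
        self.state = self._transitions[key]
        self.history.append(self.state)
        return self.state


# All valid transitions are declared up front (hypothetical job lifecycle).
JOB_TRANSITIONS = {
    ("NEW", "INIT"): "INITED",
    ("INITED", "START"): "RUNNING",
    ("INITED", "KILL"): "KILLED",
    ("RUNNING", "TASK_COMPLETED"): "RUNNING",
    ("RUNNING", "ALL_TASKS_DONE"): "SUCCEEDED",
    ("RUNNING", "KILL"): "KILLED",
}

job = StateMachine("NEW", JOB_TRANSITIONS)
for event in ("INIT", "START", "TASK_COMPLETED", "ALL_TASKS_DONE"):
    job.handle(event)
print(job.history)  # ['NEW', 'INITED', 'RUNNING', 'RUNNING', 'SUCCEEDED']
```

Because the table is plain data, it can be reviewed or drawn as a graph without reading any handler code.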
High Availability
MR Application Master Recovery
• Hadoop 1.0
 • Application needs to resubmit the job
 • All completed tasks are lost
• YARN
 • Application execution state is checkpointed in HDFS
 • Rebuilds the state by replaying the events
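
The recovery idea can be sketched as an append-only event log plus a replay function. The example below is an assumption-laden illustration: a local temporary file stands in for HDFS, and the event types, file name, and task ids are hypothetical.

```python
import json
import os
import tempfile


def apply_event(state, event):
    """Fold one logged event into the in-memory job state."""
    if event["type"] == "TASK_STARTED":
        state["running"].add(event["task"])
    elif event["type"] == "TASK_COMPLETED":
        state["running"].discard(event["task"])
        state["completed"].add(event["task"])
    return state


def log_event(path, event):
    # Append-only log, one JSON object per line.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def recover(path):
    # Rebuild the state from scratch by replaying every logged event in order.
    state = {"running": set(), "completed": set()}
    with open(path, encoding="utf-8") as f:
        for line in f:
            apply_event(state, json.loads(line))
    return state


log_path = os.path.join(tempfile.mkdtemp(), "job_0001.events")
for ev in [
    {"type": "TASK_STARTED", "task": "m_0"},
    {"type": "TASK_STARTED", "task": "m_1"},
    {"type": "TASK_COMPLETED", "task": "m_0"},
]:
    log_event(log_path, ev)

# Simulate a restarted Application Master: all in-memory state is gone.
recovered = recover(log_path)
print(sorted(recovered["completed"]), sorted(recovered["running"]))  # ['m_0'] ['m_1']
```

After replay, the restarted master knows that `m_0` is done and need not be rerun, which is the contrast with Hadoop 1.0 drawn on the slide.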
Resource Manager HA
• Based on ZooKeeper
• Coming soon
 • YARN-128
YARN: New Possibilities
  • Open MPI - MR-2911
  • Master-Worker - MR-3315
  • Distributed Shell
  • Graph processing - Giraph-13
  • BSP - HAMA-431
  • CEP
    • S4 - S4-25
    • Storm - https://github.com/nathanmarz/storm/issues/74
  • Iterative processing - Spark - https://github.com/mesos/spark-yarn/
YARN - a solid foundation to take Hadoop to the next level on
Scale, High Availability, Utilization, and Alternate Compute Paradigms
Thank You
@twitter: sharad_ag


Apachecon Hadoop YARN - Under The Hood (at ApacheCon Europe)


Editor's notes

  1. We will talk about how YARN is built fundamentally differently from Hadoop 1.0. What is the motivation for doing so? What does it buy us? Hadoop 1.0 is classic MR; Hadoop 2.0 has MR on YARN.
  2. I work primarily on the Map-Reduce side and was part of the team when YARN was conceptualized. I work at InMobi, which is a mobile advertising company. I lead the development of big data platforms at InMobi, from data collection to data analytics systems. I don't see many folks from India. I am the organizer of the Hadoop meetup group.
  3. Quick primer on the Hadoop 1.0 architecture. There is a single master known as the JobTracker. Slave daemons are called TaskTrackers. Clients submit jobs to the JobTracker. Individual jobs contain map and reduce definitions. The JobTracker knows about the cluster resources and schedules the map and reduce tasks accordingly.
  4. There is a single master known as the ResourceManager (RM), which manages the resources of the cluster. Slave daemons known as NodeManagers manage the resources of individual nodes. Clients submit applications (jobs are now called applications in YARN) to the ResourceManager. Each application has its own master process, called the Application Master, which is spawned when the application starts running and is responsible for managing the lifecycle of the application. If the Application Master wants to spawn more processes in the cluster, it asks the ResourceManager for them. The resource definition of a process to be launched in the cluster is a container; it specifies things like RAM, disk, CPU, etc. Fundamentally, application state management is distributed, and the RM is responsible only for cluster management.
  5. What the new architecture gets us: scale and a general-purpose distributed compute platform, which I will discuss. First, let's understand what scale means in the Hadoop context. (TODO: put animation.)
  6. For a distributed compute platform, scalability is at two levels: how big a single application can be, and the number of concurrently running tasks in a single cluster. Application size is the number of sub-tasks plus the application-level state. The number of concurrent tasks is essentially the cluster size.
  7. In Hadoop 1.0, application size is constrained by the JobTracker. The JT is a huge monolithic master. It keeps cluster-level metadata, task-level metadata, and application-specific metadata. You see things like counter limits for the same reason. (TODO: put the formula.)
  8. Because application management is distributed. Let's look at the other dimension: the number of concurrent tasks. (TODO: put animation.)
  9. Why does nobody run more than, say, 3k or 4k nodes under a JT? Because as the cluster size grows, the utilization drops. The steeper the curve, the more you sacrifice on utilization. So we said that at 4k nodes the utilization is acceptable, and the cluster size should not grow beyond that. Let's see why it drops. (TODO: animation.)
  10. Task scheduling happens in the heartbeat. The JobTracker has a global lock, and heartbeats are processed synchronously. JT throughput is limited to, say, 200 heartbeats/sec. As the cluster size increases, the interval at which a TT sends a heartbeat increases. The JT is not very concurrent. We need to design for better concurrency.
  11. Same as in slides.
  12. Events are processed asynchronously. Each component encapsulates its state. Mutations happen only via events. Reads can happen directly.
  13. In the ResourceManager, heartbeat processing is asynchronous, so it can handle a large number of heartbeats/sec. So what is the impact of this on utilization?
  14. As the cluster size increases, the drop in utilization is much lower. A YARN cluster can have a large number of nodes.
  15. Let's look at the state management aspects. State management is crucial in distributed systems, where there are a lot of moving parts.
  16. This is the state transition picture for a job. There are similar ones for the different entities: job, task, attempt. Task and attempt have even more nodes.
  17. This is a very small snippet of JobTracker code. It is like this throughout. No one dares to touch it. It is very, very fragile.
  18. For this reason, YARN has a very lightweight state machine library.
  19. All valid state transitions are declared up front. Apart from the obvious benefits, one can now visualize, discuss, and argue about proposed changes to the state machine, which is not possible in Hadoop 1.0. I remember that we designed the first version of all the state transitions in a spreadsheet, in which we could see which valid transitions we were missing.
  20. Let's look at the HA story.
  21. Same as in slides.
  22. Since the work done by the ResourceManager is limited to cluster management and scheduling, it is now much simpler to build HA into the RM than into the JT, which has a huge state.
  23. Several compute paradigms are being built over YARN. This lists some of them; there are several others as well.
  24. Same as slide. YARN is a general-purpose distributed compute platform.