© Hortonworks Inc. 2015
Hadoop YARN Services
Xuan Gong
xgong@hortonworks.com
Steve Loughran
stevel@hortonworks.com
Apache Hadoop + YARN: An OS for data
An OS can do more than SQL statements
An OS can do more than run admin-installed apps
An OS lets you run whatever you want!
…which is important
[Diagram: data flow linking front ends (web, phones, devices, feeds), logs, stream processing, a database and analytics]
YARN Services: long-lived applications within a Hadoop cluster
Apache Slider (incubating)
(hosting: HBase, Accumulo, Storm…)
Samza
Hive LLAP Daemons
Kafka on YARN
Apache Flink
Background: YARN
[Diagram: three servers each running HDFS and a YARN Node Manager, and one server running HDFS and the YARN Resource Manager, “The RM”]
• Servers run YARN Node Managers (NMs)
• NMs heartbeat to the Resource Manager (RM)
• The RM schedules work over the cluster
• The RM allocates containers to apps
• NMs start containers
• NMs report container health
Client creates App Master
[Diagram: YARN Node Managers and the YARN Resource Manager (“The RM”) on HDFS servers, plus a Client and an Application Master]
“AM” requests containers
[Diagram: YARN Node Managers and the YARN Resource Manager on HDFS servers, with an Application Master and three Containers]
Short lived apps have it easy
• failure: clean restart
• logs: collect at end
• placement: by data
• security: Kerberos delegation tokens
• discovery: launcher app can track
Long-lived services don't
• failure: stay up
• logs: ongoing collection
• placement: availability, performance
• security: stay secure over time
• discovery: locatable by any client
YARN-896
Support for YARN services
Log aggregation
Service registration & discovery
Windowed failure tracking
Anti-affinity placement
Gang scheduling
Applications to continue over AM restart
Container resource flexing
Container reuse
Kerberos token renewal
Container signalling
Net & Disk resources
Labelled nodes & queues
YARN-896
REST
Log aggregation
Service registration & discovery
Windowed failure tracking
Anti-affinity placement
Gang scheduling
Applications to continue over AM restart
Container resource flexing
Container reuse
Kerberos token renewal
Container signalling
Net & Disk resources
Labelled nodes & queues
Hadoop 2.6
(Docker)
REST
Failures
[Diagram: YARN Node Managers and the YARN Resource Manager on HDFS servers, with an Application Master and three Containers]
Failures
[Diagram: two YARN Node Managers and the YARN Resource Manager remain, hosting the Application Master and two Containers; the labels read container 1, container 2, and lost: container 3]
Easy: enabling
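// Client: keep containers running across AM restarts and allow up to 8 AM attempts
amLauncher.setKeepContainersOverRestarts(true);
amLauncher.setMaxAppAttempts(8);
// Server (the restarted AM): retrieve the containers that survived earlier attempts
List<Container> liveContainers =
    amRegistrationData.getContainersFromPreviousAttempts();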
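The attempt limit set above pairs naturally with the windowed failure tracking listed under YARN-896: if only recent failures count against the limit, a service that fails occasionally over months is not shut down for it. The following is a minimal sketch of that idea, not YARN's implementation; the class `FailureWindow`, its method names and the one-hour window are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window failure counter; not part of the YARN API.
class FailureWindow {
    private final long windowMillis;
    private final int maxFailures;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    FailureWindow(long windowMillis, int maxFailures) {
        this.windowMillis = windowMillis;
        this.maxFailures = maxFailures;
    }

    // Record a failure at time `now` (milliseconds).
    // Returns true while the number of failures inside the window is below the limit.
    boolean recordFailure(long now) {
        failureTimes.addLast(now);
        // Forget failures that fell out of the window.
        while (!failureTimes.isEmpty() && failureTimes.peekFirst() <= now - windowMillis) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() < maxFailures;
    }

    // Number of failures currently inside the window.
    int recentFailures() {
        return failureTimes.size();
    }

    public static void main(String[] args) {
        FailureWindow window = new FailureWindow(3_600_000L, 3); // assumed: 1 hour, 3 failures
        System.out.println("restart allowed: " + window.recordFailure(0L));
        System.out.println("restart allowed: " + window.recordFailure(1_000L));
        System.out.println("restart allowed: " + window.recordFailure(2_000L));
    }
}
```

Three failures within two seconds exhaust the limit, whereas the same three failures spread over more than an hour would not.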
Harder: rebuilding state
Node Map
Placement History
Specification
Container Queues
Component Map
Event History
[Legend: Persisted, Rebuilt, Transient]
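After an AM restart, the list of containers from previous attempts is only raw material: the AM still has to rebuild its in-memory model from it and compare that with the persisted specification. A minimal sketch of that step, assuming each surviving container can be mapped back to a component name; `StateRebuilder`, `LiveContainer` and the component names are hypothetical stand-ins, not YARN or Slider classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of rebuilding a component map after an AM restart.
class StateRebuilder {
    // Stand-in for a container reported to the AM when it re-registers.
    static final class LiveContainer {
        final String id;
        final String host;
        final String component; // assumption: the component can be recovered per container

        LiveContainer(String id, String host, String component) {
            this.id = id;
            this.host = host;
            this.component = component;
        }
    }

    // Rebuilt state: group surviving container ids by component.
    static Map<String, List<String>> rebuildComponentMap(List<LiveContainer> live) {
        Map<String, List<String>> byComponent = new TreeMap<>();
        for (LiveContainer c : live) {
            byComponent.computeIfAbsent(c.component, k -> new ArrayList<>()).add(c.id);
        }
        return byComponent;
    }

    // Compare the persisted specification with what survived to see how many to request.
    static int missing(Map<String, Integer> desired,
                       Map<String, List<String>> running,
                       String component) {
        int want = desired.getOrDefault(component, 0);
        int have = running.getOrDefault(component, Collections.emptyList()).size();
        return Math.max(0, want - have);
    }

    public static void main(String[] args) {
        List<LiveContainer> live = Arrays.asList(
            new LiveContainer("container_1", "host1", "worker"),
            new LiveContainer("container_2", "host2", "worker"));
        Map<String, Integer> desired = new HashMap<>();
        desired.put("worker", 3);
        Map<String, List<String>> running = rebuildComponentMap(live);
        System.out.println("workers to request: " + missing(desired, running, "worker"));
    }
}
```

Transient state such as queued requests is simply recreated from the difference between the two maps.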
Log Aggregation
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
Labels
[Diagram: a Switch and three ToR Switches, with the nodes beneath them labelled general, HBase, or general, kafka]
Labels
$ yarn rmadmin
...
  -addToClusterNodeLabels [label1,label2,label3]
  -removeFromClusterNodeLabels [label1,label2,label3]
  -replaceLabelsOnNode [node1:port,label1,label2]
  -directlyAccessNodeLabelStore
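The effect of labels on placement can be sketched with a small model: a request that names a label is only eligible for nodes holding that label. This is an illustration only, since real matching is done by the YARN scheduler; the class `LabelPlacement`, its methods and the node names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of node labels; not the YARN scheduler.
class LabelPlacement {
    private final Map<String, Set<String>> labelsByNode = new TreeMap<>();

    // Mirrors the intent of -replaceLabelsOnNode: overwrite the labels of one node.
    void replaceLabelsOnNode(String node, String... labels) {
        labelsByNode.put(node, new TreeSet<>(Arrays.asList(labels)));
    }

    // Nodes on which a request for `label` could be placed, in name order.
    List<String> eligibleNodes(String label) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : labelsByNode.entrySet()) {
            if (entry.getValue().contains(label)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LabelPlacement cluster = new LabelPlacement();
        cluster.replaceLabelsOnNode("node1", "general");
        cluster.replaceLabelsOnNode("node2", "HBase");
        cluster.replaceLabelsOnNode("node3", "general", "kafka");
        System.out.println("kafka can run on: " + cluster.eligibleNodes("kafka"));
    }
}
```

Relabelling a node changes where later requests can land, which is how a set of machines can be reserved for a service such as HBase or Kafka.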
YARN-913: Service Registry
$ slider resolve --path ~/services/org-apache-slider/storm1
{
  "type" : "JSONServiceRecord",
  "external" : [ {
    "api" : "http://",
    "addressType" : "uri",
    "protocolType" : "webui",
    "addresses" : [ {
      "uri" : "http://nn.ex.net:4813"
    } ]
  }, {
    "api" : "classpath:org.apache.slider.publisher.configurations",
    "addressType" : "uri",
    "protocolType" : "REST",
    "addresses" : [ {
      "uri" : "http://nn.ex.net:4813/ws/v1/slider/publisher/slider"
    } ]
  } ]
}
Internal and external endpoints
"internal" : [ {
"api" : "classpath:org.apache.slider.agents.secure",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "https://nn.ex.net:4813/ws/v1/slider/agents"
} ]
} ]
Internal: for an application's own use.
External: for clients, Web UIs and other apps
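A client usually wants a single endpoint out of such a record, selected by its `api` value. A minimal sketch of that lookup over an already-parsed record follows; `ServiceRecordLookup`, `Endpoint` and `findUri` are hypothetical names, and JSON parsing is left out to keep the example free of dependencies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical lookup over the endpoints of a resolved service record.
class ServiceRecordLookup {
    // One entry of the "external" or "internal" list, reduced to three fields.
    static final class Endpoint {
        final String api;
        final String protocolType;
        final String uri;

        Endpoint(String api, String protocolType, String uri) {
            this.api = api;
            this.protocolType = protocolType;
            this.uri = uri;
        }
    }

    // Return the URI of the first endpoint whose api matches, if there is one.
    static Optional<String> findUri(List<Endpoint> endpoints, String api) {
        for (Endpoint endpoint : endpoints) {
            if (endpoint.api.equals(api)) {
                return Optional.of(endpoint.uri);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Endpoint> external = Arrays.asList(
            new Endpoint("http://", "webui", "http://nn.ex.net:4813"),
            new Endpoint("classpath:org.apache.slider.publisher.configurations", "REST",
                "http://nn.ex.net:4813/ws/v1/slider/publisher/slider"));
        System.out.println(
            findUri(external, "classpath:org.apache.slider.publisher.configurations")
                .orElse("not published"));
    }
}
```

Returning an `Optional` makes the caller handle a service that has not published the endpoint yet, which is common while a service is still starting.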
Security
• Token expiry a core Kerberos feature
• Token expiry inimical to service longevity
• Specifically: token delegation
• After 72h (default)
YARN updates the RM/AM tokens, but not the tokens for HDFS, ZK, …
How do apps cope?
• Do nothing: apps can run up to 72h
  – All
• Keytabs: apps can run forever; keytabs need to be managed (securely)
  – Slider
• Client push: running/scheduled client updates AM; AM forwards to containers
  – Twill
• AM keytab: containers ask for new tokens
  – Spark via SPARK-5342
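Whichever strategy is used, something has to notice that a token is approaching its deadline and act before it passes. A minimal sketch of that check, using the 72h default mentioned earlier; the class `TokenClock`, its methods and the choice to refresh at 80% of the lifetime are assumptions for illustration, not Hadoop's renewal logic.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper that decides when a token should be replaced.
class TokenClock {
    private final Instant issued;
    private final Duration lifetime;
    private final int refreshPercent; // assumption: refresh after this share of the lifetime

    TokenClock(Instant issued, Duration lifetime, int refreshPercent) {
        this.issued = issued;
        this.lifetime = lifetime;
        this.refreshPercent = refreshPercent;
    }

    // The moment the token stops being accepted.
    Instant expiresAt() {
        return issued.plus(lifetime);
    }

    // The moment a replacement should be obtained, ahead of expiry.
    Instant refreshAt() {
        return issued.plusMillis(lifetime.toMillis() * refreshPercent / 100);
    }

    boolean needsRefresh(Instant now) {
        return !now.isBefore(refreshAt());
    }

    boolean expired(Instant now) {
        return !now.isBefore(expiresAt());
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2015-01-01T00:00:00Z");
        TokenClock clock = new TokenClock(issued, Duration.ofHours(72), 80);
        System.out.println("refresh at " + clock.refreshAt());
        System.out.println("expires at " + clock.expiresAt());
    }
}
```

Refreshing well before expiry leaves room for retries if the first attempt to obtain or distribute a new token fails.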
…so you can now:
write long lived apps
…with failure resilience
…and centralised log viewing
…and labelled/isolated placement
…in secure clusters
Log aggregation
Service registration & discovery
Windowed failure tracking
Anti-affinity placement
Gang scheduling
Applications to continue over AM restart
Container resource flexing
Container reuse
Kerberos token renewal
Container signalling
Net & Disk resources
Labelled nodes & queues
TODO
REST
Questions?
For some code, see
http://slider.incubator.apache.org/
http://hadoop.apache.org

