SlideShare a Scribd company logo
1 of 12
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
YARN and the Docker
Container Runtime
Dublin, April 2016
Sidharta S
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Context
LinuxContainerExecutor
Why docker with YARN?
3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Container Runtimes in LinuxContainerExecutor
⬢ Support for Container Runtimes in YARN was added as part of – YARN-3611
(umbrella). Multiple container types are supported in the same executor.
⬢ The current mechanism of handling container lifecycle is moved into its own
runtime
⬢ A new docker container runtime is introduced that manages docker containers
⬢ LinuxContainerExecutor can delegate to either runtime on a per application basis
⬢ Clients specify which container type they want to use – currently via environment
variables but eventually through well-defined client APIs.
⬢ We could support more container types in the future.
4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime in LinuxContainerExecutor
⬢ Exposes a subset of Docker container lifecycle functionality
⬢ Docker v1.10.x required for some of the work being planned (e.g. user namespaces)
⬢ A recent linux kernel is required (3.10+) for basic functionality. Some features (e.g.
overlay fs) require an even more recent kernel.
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime : Resource Isolation
⬢ Support added in YARN-4553
⬢ LinuxContainerExecutor still manages resource isolation and enforcement .
⬢ Docker uses the cgroup specified by LCE ( --cgroup-parent introduced in docker v 1.6 ,
net_cls support added to libcontainer recently – support added in v1.9)
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime : Linux Capabilities
⬢ Support added in YARN-4258
⬢ Based on linux capabilities
⬢ Admin controlled – cluster administrator can control which capabilities docker
containers have on the cluster
(yarn.nodemanager.runtime.linux.docker.capabilities)
⬢ Default set is based on what docker uses by default.
7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime : Privileged Containers
⬢ Support added in YARN-4262
⬢ Allows certain applications to run in docker containers – e.g. oracle
⬢ This could be a security hazard so access to this needs to be controlled :
–Disabled by default (yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed)
–Admin controlled whitelist (yarn.nodemanager.runtime.linux.docker.privileged-
containers.acl)
–This whitelisted set of users allowed to launch privileged containers – but they must explicitly
request for it (YARN_CONTAINER_RUNTIME_DOCKER_RUN_PRIVILEGED_CONTAINER)
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime : Users
⬢ Docker runs the container as the specified user (-u)
⬢ This user needs to be available in the image being used.
⬢ Depending on capabilities (CAP_SETUID), privileged escalation could occur.
⬢ Docker v1.10 added support for user namespaces – requires daemon re-
configuration. Support for this in DockerContainerRuntime needs to be planned.
9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime : Networking
⬢ Defaults to --net=host in the docker container runtime.
–This is not secure – but this is the only way some applications can run.
–We need to switch the default to bridged mode or an admin specified network
plugin
⬢ Network plug-in support was added in v1.9 .
–This should allow for more sophisticated networking scenarios
–YARN doesn’t have to do anything except delegate to the specified network
plugin when launching the container.
–Support for this is a work in progress (YARN-4007)
1
0
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
DockerContainerRuntime : Images
⬢ Localization via HDFS? : We could localize images from HDFS and load them using
‘docker load’.
⬢ This approach has the advantage of using an existing HDFS instance for
storage/distribution at scale.
⬢ However :
–we lose some of the optimizations/functionality that using a full-fledged docker
registry might provide.
–We have to figure out security implications. What if users clobber each others
images when ‘loading’ ?
1
1
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Demos
Spark on Docker
YARN on YARN
1
2
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Q&A

More Related Content

What's hot

Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Big Data in Container; Hadoop Spark in Docker and Mesos
Big Data in Container; Hadoop Spark in Docker and MesosBig Data in Container; Hadoop Spark in Docker and Mesos
Big Data in Container; Hadoop Spark in Docker and MesosHeiko Loewe
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016alanfgates
 
One Click Hadoop Clusters - Anywhere (Using Docker)
One Click Hadoop Clusters - Anywhere (Using Docker)One Click Hadoop Clusters - Anywhere (Using Docker)
One Click Hadoop Clusters - Anywhere (Using Docker)DataWorks Summit
 
Moving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudMoving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudDataWorks Summit/Hadoop Summit
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Ozone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopOzone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopHortonworks
 
Hadoop 3.0 features
Hadoop 3.0 featuresHadoop 3.0 features
Hadoop 3.0 featuresanand murari
 
Storage and-compute-hdfs-map reduce
Storage and-compute-hdfs-map reduceStorage and-compute-hdfs-map reduce
Storage and-compute-hdfs-map reduceChris Nauroth
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksDataWorks Summit/Hadoop Summit
 
CBlocks - Posix compliant files systems for HDFS
CBlocks - Posix compliant files systems for HDFSCBlocks - Posix compliant files systems for HDFS
CBlocks - Posix compliant files systems for HDFSDataWorks Summit
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)Chris Nauroth
 
Hive - 1455: Cloud Storage
Hive - 1455: Cloud StorageHive - 1455: Cloud Storage
Hive - 1455: Cloud StorageHortonworks
 
Hive on spark berlin buzzwords
Hive on spark berlin buzzwordsHive on spark berlin buzzwords
Hive on spark berlin buzzwordsSzehon Ho
 

What's hot (20)

Effective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant ClustersEffective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant Clusters
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Big Data in Container; Hadoop Spark in Docker and Mesos
Big Data in Container; Hadoop Spark in Docker and MesosBig Data in Container; Hadoop Spark in Docker and Mesos
Big Data in Container; Hadoop Spark in Docker and Mesos
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
 
One Click Hadoop Clusters - Anywhere (Using Docker)
One Click Hadoop Clusters - Anywhere (Using Docker)One Click Hadoop Clusters - Anywhere (Using Docker)
One Click Hadoop Clusters - Anywhere (Using Docker)
 
Moving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudMoving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloud
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Kafka Security
Kafka SecurityKafka Security
Kafka Security
 
Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?
 
Ozone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopOzone- Object store for Apache Hadoop
Ozone- Object store for Apache Hadoop
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Hadoop 3.0 features
Hadoop 3.0 featuresHadoop 3.0 features
Hadoop 3.0 features
 
Storage and-compute-hdfs-map reduce
Storage and-compute-hdfs-map reduceStorage and-compute-hdfs-map reduce
Storage and-compute-hdfs-map reduce
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
 
CBlocks - Posix compliant files systems for HDFS
CBlocks - Posix compliant files systems for HDFSCBlocks - Posix compliant files systems for HDFS
CBlocks - Posix compliant files systems for HDFS
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
 
Hive - 1455: Cloud Storage
Hive - 1455: Cloud StorageHive - 1455: Cloud Storage
Hive - 1455: Cloud Storage
 
To The Cloud and Back: A Look At Hybrid Analytics
To The Cloud and Back: A Look At Hybrid AnalyticsTo The Cloud and Back: A Look At Hybrid Analytics
To The Cloud and Back: A Look At Hybrid Analytics
 
Hive on spark berlin buzzwords
Hive on spark berlin buzzwordsHive on spark berlin buzzwords
Hive on spark berlin buzzwords
 

Viewers also liked

Deploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderDeploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderHortonworks
 
How mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceHow mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceLuciano Resende
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende
 
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirLuciano Resende
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningLuciano Resende
 
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...DataWorks Summit/Hadoop Summit
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...DataWorks Summit/Hadoop Summit
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민NAVER D2
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...DataWorks Summit/Hadoop Summit
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Storm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataStorm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataDataWorks Summit
 
Document validation in MongoDB 3.2
Document validation in MongoDB 3.2Document validation in MongoDB 3.2
Document validation in MongoDB 3.2Andrew Morgan
 

Viewers also liked (20)

Deploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via SliderDeploying Docker applications on YARN via Slider
Deploying Docker applications on YARN via Slider
 
Comparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBaseComparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBase
 
Apache Slider
Apache SliderApache Slider
Apache Slider
 
How mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceHow mentoring can help you start contributing to open source
How mentoring can help you start contributing to open source
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conference
 
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine Learning
 
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
 
Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.
 
Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?
 
What's new in Hadoop Common and HDFS
What's new in Hadoop Common and HDFS What's new in Hadoop Common and HDFS
What's new in Hadoop Common and HDFS
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민
 
The truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on HadoopThe truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on Hadoop
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Flink vs. Spark
Flink vs. SparkFlink vs. Spark
Flink vs. Spark
 
Hadoop YARN Services
Hadoop YARN ServicesHadoop YARN Services
Hadoop YARN Services
 
Storm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataStorm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-Data
 
Document validation in MongoDB 3.2
Document validation in MongoDB 3.2Document validation in MongoDB 3.2
Document validation in MongoDB 3.2
 

Similar to YARN and the Docker container runtime

April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...Yahoo Developer Network
 
Lessons Learned Running a Container Cloud on Apache Hadoop YARN
Lessons Learned Running a Container Cloud on Apache Hadoop YARNLessons Learned Running a Container Cloud on Apache Hadoop YARN
Lessons Learned Running a Container Cloud on Apache Hadoop YARNBillie Rinaldi
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNDataWorks Summit
 
Practical Design Patterns in Docker Networking
Practical Design Patterns in Docker NetworkingPractical Design Patterns in Docker Networking
Practical Design Patterns in Docker NetworkingDocker, Inc.
 
Docker Networking Overview
Docker Networking OverviewDocker Networking Overview
Docker Networking OverviewSreenivas Makam
 
Introduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxDataIntroduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxDataInfluxData
 
컨테이너 기술 소개 - Warden, Garden, Docker
컨테이너 기술 소개 - Warden, Garden, Docker컨테이너 기술 소개 - Warden, Garden, Docker
컨테이너 기술 소개 - Warden, Garden, Dockerseungdon Choi
 
OpenStack with-docker-team-17
OpenStack with-docker-team-17OpenStack with-docker-team-17
OpenStack with-docker-team-17Jaspreet Singh
 
Characterizing and contrasting kuhn tey-ner awr-kuh-streyt-ors
Characterizing and contrasting kuhn tey-ner awr-kuh-streyt-orsCharacterizing and contrasting kuhn tey-ner awr-kuh-streyt-ors
Characterizing and contrasting kuhn tey-ner awr-kuh-streyt-orsLee Calcote
 
Docker introduction & benefits
Docker introduction & benefitsDocker introduction & benefits
Docker introduction & benefitsAmit Manwade
 
Docker container a-brief_introduction_2016-01-30
Docker container a-brief_introduction_2016-01-30Docker container a-brief_introduction_2016-01-30
Docker container a-brief_introduction_2016-01-30Khelender Sasan
 
Dev opsec dockerimage_patch_n_lifecyclemanagement_
Dev opsec dockerimage_patch_n_lifecyclemanagement_Dev opsec dockerimage_patch_n_lifecyclemanagement_
Dev opsec dockerimage_patch_n_lifecyclemanagement_kanedafromparis
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Yahoo Developer Network
 
Best Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with DockerBest Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with DockerEric Smalling
 
Introduction to containers a practical session using core os and docker
Introduction to containers  a practical session using core os and dockerIntroduction to containers  a practical session using core os and docker
Introduction to containers a practical session using core os and dockerAlessandro Martellone
 
MySQL docker with demo by Ramana Yeruva
MySQL docker with demo by Ramana YeruvaMySQL docker with demo by Ramana Yeruva
MySQL docker with demo by Ramana YeruvaMysql User Camp
 
Building a sdn solution for the deployment of web application stacks in docker
Building a sdn solution for the deployment of web application stacks in dockerBuilding a sdn solution for the deployment of web application stacks in docker
Building a sdn solution for the deployment of web application stacks in dockerJorge Juan Mendoza
 

Similar to YARN and the Docker container runtime (20)

April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
 
Lessons Learned Running a Container Cloud on Apache Hadoop YARN
Lessons Learned Running a Container Cloud on Apache Hadoop YARNLessons Learned Running a Container Cloud on Apache Hadoop YARN
Lessons Learned Running a Container Cloud on Apache Hadoop YARN
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARN
 
Practical Design Patterns in Docker Networking
Practical Design Patterns in Docker NetworkingPractical Design Patterns in Docker Networking
Practical Design Patterns in Docker Networking
 
Docker Networking Overview
Docker Networking OverviewDocker Networking Overview
Docker Networking Overview
 
Introduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxDataIntroduction to Docker and Monitoring with InfluxData
Introduction to Docker and Monitoring with InfluxData
 
컨테이너 기술 소개 - Warden, Garden, Docker
컨테이너 기술 소개 - Warden, Garden, Docker컨테이너 기술 소개 - Warden, Garden, Docker
컨테이너 기술 소개 - Warden, Garden, Docker
 
OpenStack with-docker-team-17
OpenStack with-docker-team-17OpenStack with-docker-team-17
OpenStack with-docker-team-17
 
Characterizing and contrasting kuhn tey-ner awr-kuh-streyt-ors
Characterizing and contrasting kuhn tey-ner awr-kuh-streyt-orsCharacterizing and contrasting kuhn tey-ner awr-kuh-streyt-ors
Characterizing and contrasting kuhn tey-ner awr-kuh-streyt-ors
 
Docker introduction & benefits
Docker introduction & benefitsDocker introduction & benefits
Docker introduction & benefits
 
Docker container a-brief_introduction_2016-01-30
Docker container a-brief_introduction_2016-01-30Docker container a-brief_introduction_2016-01-30
Docker container a-brief_introduction_2016-01-30
 
Dev opsec dockerimage_patch_n_lifecyclemanagement_
Dev opsec dockerimage_patch_n_lifecyclemanagement_Dev opsec dockerimage_patch_n_lifecyclemanagement_
Dev opsec dockerimage_patch_n_lifecyclemanagement_
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
 
Best Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with DockerBest Practices for Developing & Deploying Java Applications with Docker
Best Practices for Developing & Deploying Java Applications with Docker
 
Docker
DockerDocker
Docker
 
Introduction to containers a practical session using core os and docker
Introduction to containers  a practical session using core os and dockerIntroduction to containers  a practical session using core os and docker
Introduction to containers a practical session using core os and docker
 
MySQL docker with demo by Ramana Yeruva
MySQL docker with demo by Ramana YeruvaMySQL docker with demo by Ramana Yeruva
MySQL docker with demo by Ramana Yeruva
 
Axigen on docker
Axigen on dockerAxigen on docker
Axigen on docker
 
2015 05-06-elias weingaertner-docker-intro
2015 05-06-elias weingaertner-docker-intro2015 05-06-elias weingaertner-docker-intro
2015 05-06-elias weingaertner-docker-intro
 
Building a sdn solution for the deployment of web application stacks in docker
Building a sdn solution for the deployment of web application stacks in dockerBuilding a sdn solution for the deployment of web application stacks in docker
Building a sdn solution for the deployment of web application stacks in docker
 

More from DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Recently uploaded

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Recently uploaded (20)

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

YARN and the Docker container runtime

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved YARN and the Docker Container Runtime Dublin, April 2016 Sidharta S
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Context LinuxContainerExecutor Why docker with YARN?
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Container Runtimes in LinuxContainerExecutor ⬢ Support for Container Runtimes in YARN was added as part of – YARN-3611 (umbrella). Multiple container types are supported in the same executor. ⬢ The current mechanism of handling container lifecycle is moved into its own runtime ⬢ A new docker container runtime is introduced that manages docker containers ⬢ LinuxContainerExecutor can delegate to either runtime on a per application basis ⬢ Clients specify which container type they want to use – currently via environment variables but eventually through well-defined client APIs. ⬢ We could support more container types in the future.
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime in LinuxContainerExecutor ⬢ Exposes a subset of Docker container lifecycle functionality ⬢ Docker v1.10.x required for some of the work being planned (e.g. user namespaces) ⬢ A recent linux kernel is required (3.10+) for basic functionality. Some features (e.g. overlay fs) require an even more recent kernel.
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime : Resource Isolation ⬢ Support added in YARN-4553 ⬢ LinuxContainerExecutor still manages resource isolation and enforcement . ⬢ Docker uses the cgroup specified by LCE ( --cgroup-parent introduced in docker v 1.6 , net_cls support added to libcontainer recently – support added in v1.9)
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime : Linux Capabilities ⬢ Support added in YARN-4258 ⬢ Based on linux capabilities ⬢ Admin controlled – cluster administrator can control which capabilities docker containers have on the cluster (yarn.nodemanager.runtime.linux.docker.capabilities) ⬢ Default set is based on what docker uses by default.
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime : Privileged Containers ⬢ Support added in YARN-4262 ⬢ Allows certain applications to run in docker containers – e.g. oracle ⬢ This could be a security hazard so access to this needs to be controlled : –Disabled by default (yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed) –Admin controlled whitelist (yarn.nodemanager.runtime.linux.docker.privileged- containers.acl) –This whitelisted set of users allowed to launch privileged containers – but they must explicitly request for it (YARN_CONTAINER_RUNTIME_DOCKER_RUN_PRIVILEGED_CONTAINER)
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime : Users ⬢ Docker runs the container as the specified user (-u) ⬢ This user needs to be available in the image being used. ⬢ Depending on capabilities (CAP_SETUID), privileged escalation could occur. ⬢ Docker v1.10 added support for user namespaces – requires daemon re- configuration. Support for this in DockerContainerRuntime needs to be planned.
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime : Networking ⬢ Defaults to --net=host in the docker container runtime. –This is not secure – but this is the only way some applications can run. –We need to switch the default to bridged mode or an admin specified network plugin ⬢ Network plug-in support was added in v1.9 . –This should allow for more sophisticated networking scenarios –YARN doesn’t have to do anything except delegate to the specified network plugin when launching the container. –Support for this is a work in progress (YARN-4007)
  • 10. 1 0 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DockerContainerRuntime : Images ⬢ Localization via HDFS? : We could localize images from HDFS and load them using ‘docker load’. ⬢ This approach has the advantage of using an existing HDFS instance for storage/distribution at scale. ⬢ However : –we lose some of the optimizations/functionality that using a full-fledged docker registry might provide. –We have to figure out security implications. What if users clobber each others images when ‘loading’ ?
  • 11. 1 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Demos Spark on Docker YARN on YARN
  • 12. 1 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Q&A