YUMING LIANG
VOIDBOX - DOCKER ON YARN
WHAT IS HULU
GAMING • CONNECTED TVS • MOBILE • COMPUTER
540+ CONTENT PARTNERS
75+ DISTRIBUTION PARTNERS
100+ HULU BEIJING
WHAT IS AUDIENCE PLATFORM
WORKFLOW ENGINE
HADOOP ECOSYSTEM: YARN + HDFS + MAP/REDUCE + HBASE + ZOOKEEPER
CONFIGURATION SERVICE
KILLUA / MATRIX
FIREWORK
SPARK
VOIDBOX
DOCKER ON YARN
ESTIMATION
SERVICE
NESTO
BATCH FLOW: ETL + PROCESSING + SERVING
REAL-TIME FLOW: ETL + PROCESSING + SERVING
Lambda Architecture
SEGMENTATION SYSTEM OLAP
AGENDA
•  Why Voidbox
•  What is Voidbox
•  Architecture
•  How to use Voidbox
•  Voidbox in Hulu
•  What’s next
•  Q&A
WHY VOIDBOX	
YARN: DATA OPERATION SYSTEM
CLUSTER
MAP/REDUCE TEZ SLIDER
HIVE PIG HBASE STORM
USER APPLICATION
Why only big data applications?
Can we run other applications on YARN?
What about task processing and web services?
WHY VOIDBOX	
YARN: DISTRIBUTED OPERATION SYSTEM
CLUSTER
MAP/REDUCE TEZ SLIDER
HIVE PIG HBASE STORM
USER APPLICATION
TASK SYSTEM
WEB SERVICE
OTHER
Data App Other App
Is there any demand?
VOIDBOX
WHY VOIDBOX	
20+ TASKS IN DIFFERENT LANGUAGES
20+ VIRTUAL MACHINES
LOW UTILIZATION
MACHINE FAILURE
VOIDBOX
WHAT IS VOIDBOX
LIGHTWEIGHT VM
RUNTIME ISOLATION
UNION FILE SYSTEM
RESOURCE MANAGEMENT
SCHEDULING
FAULT TOLERANCE
DISTRIBUTED COMPUTATION
ARBITRARY APPLICATION
FAULT TOLERANCE
VOIDBOX IS A PROGRAMMING FRAMEWORK
WHAT IS VOIDBOX
Core: VOIDBOX MASTER, VOIDBOX PROXY, RUNTIME MODE, REGISTRY / MGMT
API
Framework: DAG, SERVICE, TASK
ARCHITECTURE – DOCKER	
[Figure: virtual machines (Server, Host OS, Hypervisor, Guest OS, bins/libs, App A, App B) versus Docker (Server, Host OS, Docker Engine, bins/libs, App A, App B)]
ARCHITECTURE – DOCKER
[Figure: Docker Registry, Docker Image, Docker Engine, Docker Containers; runtime isolation between containers (App B, App B’)]
ARCHITECTURE – VOIDBOX	
[Figure: YARN modules: Client, Resource Manager, NodeManager with HDFS on each node. YARN application1: MR AppMaster and Containers (MapTask). YARN application2: Application Master and Container.]
ARCHITECTURE – VOIDBOX	
[Figure: Voidbox on YARN, with a legend distinguishing YARN modules, Voidbox modules, and Docker modules. Labels: Resource Manager, NodeManager, HDFS, Voidbox Client, Voidbox Driver, Voidbox Master, State Server, Container, Voidbox Proxy, Docker Engine, Docker Registry in GlusterFS (Pull), server.]
Voidbox is a standard application on top of YARN, just like MapReduce.
ARCHITECTURE – VOIDBOX YARN CLUSTER MODE	
[Figure: yarn-cluster mode. Labels: Resource Manager, Voidbox Client, NodeManager, Docker Engine, Voidbox Master, Voidbox Driver, Container, Voidbox Proxy, HDFS; legend: YARN modules, Voidbox modules, Docker modules.]
ARCHITECTURE – VOIDBOX YARN CLIENT MODE	
[Figure: yarn-client mode. Labels: Resource Manager, Voidbox Client, Voidbox Driver, NodeManager, Docker Engine, Voidbox Master, Container, Voidbox Proxy, HDFS; legend: YARN modules, Voidbox modules, Docker modules.]
ARCHITECTURE – VOIDBOX YARN LOCAL MODE	
[Figure: yarn-local mode. Labels: Resource Manager, Voidbox Client, Voidbox Driver, NodeManager, Docker Engine, Voidbox Master, Container, Voidbox Proxy, HDFS; legend: YARN modules, Voidbox modules, Docker modules.]
ARCHITECTURE – VOIDBOX FAULT TOLERANCE	
•  Work-preserving restart of the ResourceManager
–  https://issues.apache.org/jira/browse/YARN-556 (target version: 2.6.0)
–  Improvement of ResourceManager HA
•  Container preservation across NodeManager restarts
–  https://issues.apache.org/jira/browse/YARN-1336
–  Avoid multiple ApplicationMasters
–  Avoid out-of-control containers on a machine whose NodeManager has crashed
•  ApplicationMaster retry count reset window
–  https://issues.apache.org/jira/browse/YARN-611
•  YARN container -> Docker CLI -> Docker container
–  Add a shutdown hook to recycle the Docker container if the container exits unexpectedly
–  Handle the case where VoidboxProxy exits without running the shutdown hook (kill -9 proxy.pid)
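The shutdown-hook idea in the last bullet can be sketched in Java roughly as follows. This is only an illustration, not Voidbox's actual code: the class name `ContainerCleanup` and the container name are hypothetical, and it assumes the proxy started the container through the Docker CLI under a known name. `docker rm -f` is the standard CLI command for force-removing a container.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Voidbox: force-removes a Docker container
// when the JVM that launched it exits normally or receives SIGTERM.
final class ContainerCleanup {
    // Builds the Docker CLI command that force-removes the named container.
    static List<String> removeCommand(String containerName) {
        return Arrays.asList("docker", "rm", "-f", containerName);
    }

    // Registers a JVM shutdown hook that runs the removal command.
    // The hook does NOT run on SIGKILL (kill -9), so a separate garbage
    // collection path is still needed for that case.
    static Thread register(String containerName) {
        Thread hook = new Thread(() -> {
            try {
                new ProcessBuilder(removeCommand(containerName))
                        .inheritIO()
                        .start()
                        .waitFor();
            } catch (Exception e) {
                // Best effort: the JVM is already shutting down, so only report.
                System.err.println("container cleanup failed: " + e);
            }
        });
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A proxy would call something like `ContainerCleanup.register("voidbox-task-1")` right after starting the container. Because shutdown hooks are skipped on `kill -9`, leftover containers from that case have to be collected by some other component.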
ARCHITECTURE – VOIDBOX FAULT TOLERANCE	
yarn-site.xml
<property>
  <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>100</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
ApplicationSubmissionContext
public abstract void setMaxAppAttempts(int maxAppAttempts);
public abstract void setAttemptFailuresValidityInterval(long attemptFailuresValidityInterval);
public abstract void setKeepContainersAcrossApplicationAttempts(boolean keepContainers);
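To make the interplay between `setMaxAppAttempts` and `setAttemptFailuresValidityInterval` more concrete, here is a small self-contained model of a retry count reset window: only failures that happened within the validity interval count against the attempt limit. It is an illustrative sketch, not YARN's implementation, and the class `AttemptWindow` is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative model only; this is not how YARN implements the feature.
final class AttemptWindow {
    private final int maxAttempts;
    private final long validityIntervalMs;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    AttemptWindow(int maxAttempts, long validityIntervalMs) {
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    // Records a failure at nowMs and reports whether another attempt is allowed.
    boolean recordFailureAndCheck(long nowMs) {
        failureTimes.addLast(nowMs);
        // Forget failures that have fallen out of the validity window.
        while (!failureTimes.isEmpty()
                && nowMs - failureTimes.peekFirst() > validityIntervalMs) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() < maxAttempts;
    }
}
```

With a limit of 2 attempts and a 1000 ms window, two failures 500 ms apart exhaust the limit, while two failures 2000 ms apart do not, because the first one has expired by the time the second is recorded.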
ARCHITECTURE – VOIDBOX RESOURCE MANAGEMENT	
•  Management Center
–  Add/remove Docker engine in cluster
–  Garbage collection of out-of-control containers (after a NodeManager crash)
–  History Log Server
–  Service discovery
ARCHITECTURE – LOGS	
•  VoidboxMaster’s log
–  Specify log4j.properties when submitting an application to rotate the master’s log
•  Docker container’s log
–  Stdout and stderr log
•  Copy the file, then truncate(0), to rotate the Docker container’s log
•  Rotate the log file on the host (default: /var/lib/docker/containers/../containerid-json.log)
–  A log directory in the Docker image
•  Map a log volume; rotation is user-defined inside the Docker image
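The copy-then-truncate rotation mentioned for the stdout/stderr log can be sketched in Java as below. This is a minimal illustration using only the standard library; the class name `CopyTruncateRotator` and the file paths are hypothetical. The live file is truncated in place rather than renamed, so the process writing the log keeps its open file descriptor.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating copy-truncate log rotation.
final class CopyTruncateRotator {
    // Copies the live log to `archive`, then truncates the live log to zero
    // bytes in place.
    static void rotate(Path liveLog, Path archive) throws IOException {
        Files.copy(liveLog, archive, StandardCopyOption.REPLACE_EXISTING);
        try (FileChannel channel = FileChannel.open(liveLog, StandardOpenOption.WRITE)) {
            channel.truncate(0);
        }
    }
}
```

One known trade-off of this approach is that lines written between the copy and the truncate can be lost, which is usually accepted in exchange for not having to restart or signal the writer.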
ARCHITECTURE – KEY COMPONENTS
Layers: CORE, FRAMEWORK, APP
ARCHITECTURE – KEY COMPONENTS: CORE
•  Functionality
–  Allocate/release container
–  Launch/stop task in container
–  Multiple run modes: yarn-cluster, yarn-client, yarn-local
•  Component
–  VoidboxMaster
•  Container resource allocation/release/preserving
•  Container context management
–  VoidboxProxy
•  Docker container lifecycle management
–  Management Center
•  Docker engine management
•  Garbage collection
ARCHITECTURE – KEY COMPONENTS: FRAMEWORK
•  DAG framework
–  DAGDriver
•  Task scheduler
•  Task retry policy
–  DAGContext
•  Describe DAG docker job
•  Service framework
–  ServiceDriver
•  Replication controller
•  Health check, warm up
–  ServiceContext
•  Describe service docker job
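A task scheduler with a retry policy, as listed for the DAG framework, might look roughly like the following. This is a sketch under stated assumptions, not the Voidbox DAGDriver: the class `DagRunner` and its methods are hypothetical, tasks are plain in-process callbacks rather than Docker jobs, and tasks run one at a time in dependency order.

```java
import java.util.*;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: runs tasks in dependency order with a retry limit.
final class DagRunner {
    private final Map<String, BooleanSupplier> tasks = new LinkedHashMap<>();
    private final Map<String, List<String>> children = new HashMap<>();
    private final Map<String, Integer> parentCount = new HashMap<>();

    // Registers a task; the body returns true on success.
    void addTask(String name, BooleanSupplier body) {
        tasks.put(name, body);
        children.putIfAbsent(name, new ArrayList<>());
        parentCount.putIfAbsent(name, 0);
    }

    // Declares that `child` may start only after `parent` has succeeded.
    // Both tasks must have been added first.
    void addEdge(String parent, String child) {
        children.get(parent).add(child);
        parentCount.merge(child, 1, Integer::sum);
    }

    // Runs every task whose parents all succeeded, trying each task at most
    // maxAttempts times. Returns the names of succeeded tasks in run order.
    List<String> run(int maxAttempts) {
        Map<String, Integer> remaining = new HashMap<>(parentCount);
        Deque<String> ready = new ArrayDeque<>();
        for (String name : tasks.keySet()) {
            if (remaining.get(name) == 0) ready.add(name);
        }
        List<String> succeeded = new ArrayList<>();
        while (!ready.isEmpty()) {
            String name = ready.poll();
            boolean ok = false;
            for (int attempt = 0; attempt < maxAttempts && !ok; attempt++) {
                ok = tasks.get(name).getAsBoolean();
            }
            if (!ok) continue; // descendants of a failed task never become ready
            succeeded.add(name);
            for (String child : children.get(name)) {
                if (remaining.merge(child, -1, Integer::sum) == 0) ready.add(child);
            }
        }
        return succeeded;
    }
}
```

In a real deployment each task body would launch a Docker job on a container and report its exit status; here it is reduced to a boolean callback so the scheduling and retry logic stay visible.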
ARCHITECTURE – KEY COMPONENTS: APP
•  DAG APP
–  DAG programming based on Framework
•  Service APP
–  Service programming based on Framework
–  Register with the load balancer
HOW TO USE VOIDBOX	
Advanced API:
•  AMClient
–  requestResource(cpu, memory)
–  releaseResource(container)
•  DockerTaskRunner
–  run(task)
[Figure: producers P1, P2 and consumers C1, C2, C3]
How to increase or decrease the number of consumers automatically?
HOW TO USE VOIDBOX	
[Figure labels: P1, P2, Message Queue, Monitor, Consumer Manager, Docker Consumer (x3), AMClient, TaskRunner]
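One way the consumer manager could use the advanced API to scale consumers with the queue backlog is sketched below. The slides only give the method names `requestResource(cpu, memory)`, `releaseResource(container)` and `run(task)`; the interface signatures, the `ConsumerManager` class, the image name, and the resource sizes here are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Assumed signatures, modelled on the method names shown on the slide.
interface AMClient {
    String requestResource(int cpu, int memoryMb); // returns a container id
    void releaseResource(String container);
}

interface DockerTaskRunner {
    void run(String container, String image);
}

// Hypothetical consumer manager: keeps roughly one consumer per
// `messagesPerConsumer` queued messages, with at least one consumer running.
final class ConsumerManager {
    private final AMClient am;
    private final DockerTaskRunner runner;
    private final int messagesPerConsumer;
    private final Deque<String> containers = new ArrayDeque<>();

    ConsumerManager(AMClient am, DockerTaskRunner runner, int messagesPerConsumer) {
        this.am = am;
        this.runner = runner;
        this.messagesPerConsumer = messagesPerConsumer;
    }

    // Adjusts the consumer pool to the backlog reported by the monitor and
    // returns the resulting number of consumers.
    int reconcile(int backlog) {
        int desired = Math.max(1, (backlog + messagesPerConsumer - 1) / messagesPerConsumer);
        while (containers.size() < desired) {
            String container = am.requestResource(1, 1024); // assumed sizes
            runner.run(container, "consumer-image");        // assumed image name
            containers.push(container);
        }
        while (containers.size() > desired) {
            am.releaseResource(containers.pop());
        }
        return containers.size();
    }
}
```

A monitor would call `reconcile` periodically with the current queue length. The ceiling division keeps the pool proportional to the backlog, and the lower bound of one consumer avoids draining the pool completely when the queue is briefly empty.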
VOIDBOX IN HULU
RESOURCE BOUND
HIGHLY PARALLELIZABLE
COMPLEX ENVIRONMENT DEPENDENCIES
VOIDBOX APPLICATION
VOIDBOX BENEFITS	
•  Ease the creation of distributed applications
–  Built-in service discovery, fault tolerance, scheduling, communication
–  Support complex environment dependencies
•  Improve cluster efficiency
–  Share the cluster with other data applications
•  Simplify deployment
–  Package only
–  No deployment
•  Flexibility
–  Programming interface
WHAT’S NEXT	
[Figure labels: YARN, VOIDBOX, MESOS, MAP REDUCE / SPARK / AURORA / MARATHON / KUBERNETES, DOCKER, M/R, SPARK, HIVE, TEZ, HBASE, STORM, SLIDER, PAAS, SAAS, DISTRIBUTED APPLICATION]
QUESTIONS
Q & A
THANK YOU

Source: BDTC2015, Hulu, 梁宇明 (Yuming Liang), Voidbox - Docker on YARN

100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 

BDTC2015 Hulu - Yuming Liang (梁宇明) - Voidbox - Docker on YARN

  • 1. YUMING LIANG VOIDBOX - DOCKER ON YARN
  • 2. WHAT IS HULU GAMING • CONNECTED TVS • MOBILE • COMPUTER
  • 4. WHAT IS AUDIENCE PLATFORM WORKFLOW ENGINE HADOOP ECOSYSTEM: YARN + HDFS + MAP/REDUCE + HBASE + ZOOKEEPER CONFIGURATION SERVICE KILLUA / MATRIX FIREWORK SPARK VOIDBOX DOCKER ON YARN ESTIMATION SERVICE NESTO BATCH FLOW: ETL + PROCESSING + SERVING REAL-TIME FLOW: ETL + PROCESSING + SERVING Lambda Architecture SEGMENTATION SYSTEM OLAP
  • 5. AGENDA •  Why Voidbox •  What is Voidbox •  Architecture •  How to use Voidbox •  Voidbox in Hulu •  What’s next •  Q&A
  • 6. WHY VOIDBOX YARN: DATA OPERATION SYSTEM CLUSTER MAP / REDUCE TEZ SLIDER HIVE PIG HBASE STORM USER APPLICATION Why only big data applications? Can we run other applications on YARN? What about task processing and web services?
  • 7. WHY VOIDBOX YARN: DISTRIBUTED OPERATION SYSTEM CLUSTER MAP / REDUCE TEZ SLIDER HIVE PIG HBASE STORM USER APPLICATION TASK SYSTEM WEB SERVICE OTHER Data App Other App Is there any demand? VOIDBOX
  • 8. WHY VOIDBOX 20+ TASK IN DIFFERENT LANGUAGE 20+ VIRTUAL MACHINE LOW UTILIZATION MACHINE FAILURE VOIDBOX
  • 9. WHAT IS VOIDBOX LIGHTWEIGHT VM RUNTIME ISOLATION UNION FILE SYSTEM RESOURCE MANAGEMENT SCHEDULING FAULT TOLERANCE DISTRIBUTE COMPUTATION ARBITRARY APPLICATION FAULT TOLERANCE VOIDBOX IS A PROGRAMMING FRAMEWORK
  • 10. WHAT IS VOIDBOX Core VOIDBOX MASTER VOIDBOX PROXY RUNTIME MODE REGISTRY / MGMT API Framework DAG SERVICE TASK
  • 13. ARCHITECTURE – DOCKER Host OS Server bins/libs App A App B Hypervisor App A Guest OS Guest OS bins/libs bins/libs App B Docker Engine App A bins/libs bins/libs App B Docker Image Docker Registry DOCKER IMAGE DOCKER ENGINE DOCKER CONTAINER DOCKER CONTAINER DOCKER CONTAINER Runtime Isolation bins/libs App B’
  • 15. ARCHITECTURE – VOIDBOX YARN modules Resource Manager Client NodeManager HDFS NodeManager HDFS NodeManager HDFS YARN application1 Container (MapTask) MR AppMaster Container (MapTask) Client Application Master Container YARN application2
  • 16. ARCHITECTURE – VOIDBOX YARN modules Resource Manager Voidbox Client NodeManager NodeManager NodeManager Docker Engine Docker Engine Voidbox Master State Server Container Voidbox Proxy Container Voidbox Proxy Docker Registry in GlusterFS HDFS HDFS HDFS Voidbox Driver Pull Voidbox modules Docker modules Voidbox is a standard application on top of YARN, just like MapReduce server server server server
  • 17. ARCHITECTURE – VOIDBOX YARN CLUSTER MODE YARN modules Resource Manager Voidbox Client NodeManager NodeManager Docker Engine Voidbox Master Container Voidbox Proxy HDFS HDFS Voidbox Driver Voidbox modules Docker modules
  • 18. ARCHITECTURE – VOIDBOX YARN CLIENT MODE YARN modules Resource Manager Voidbox Client NodeManager NodeManager Docker Engine Voidbox Master Container Voidbox Proxy HDFS HDFS Voidbox modules Docker modules Voidbox Driver
  • 19. ARCHITECTURE – VOIDBOX YARN LOCAL MODE YARN modules Resource Manager Voidbox Client NodeManager NodeManager Docker Engine Voidbox Master Container Voidbox Proxy HDFS HDFS Voidbox modules Docker modules Voidbox Driver
  • 20. ARCHITECTURE – VOIDBOX FAULT TOLERANCE
      •  Work preserving in case the ResourceManager restarts
          –  https://issues.apache.org/jira/browse/YARN-556 (target version: 2.6.0)
          –  Improvement of ResourceManager HA
      •  Container preserving in case a NodeManager restarts
          –  https://issues.apache.org/jira/browse/YARN-1336
          –  Avoid multiple ApplicationMasters
          –  Avoid out-of-control containers on a machine whose NodeManager crashed
      •  ApplicationMaster retry count reset window
          –  https://issues.apache.org/jira/browse/YARN-611
      •  YARN container -> Docker CLI -> Docker container
          –  Add a shutdown hook to recycle the Docker container if the YARN container exits unexpectedly
          –  Handle the case where VoidboxProxy exits without running the shutdown hook (kill -9 proxy.pid)
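The shutdown-hook idea in the last bullet can be sketched with the JDK alone. This is a minimal illustration, not Voidbox's own code: the class name `DockerCleanupHook` and the choice of `docker rm -f` as the cleanup command are assumptions.

```java
import java.io.IOException;
import java.util.List;

/** Sketch: force-remove a Docker container when the launching JVM shuts down. */
final class DockerCleanupHook {
    /** Builds the CLI command that force-removes one container (assumed cleanup command). */
    static List<String> removeCommand(String containerId) {
        return List.of("docker", "rm", "-f", containerId);
    }

    /** Registers a JVM shutdown hook that runs the remove command; returns the hook thread. */
    static Thread register(String containerId) {
        Thread hook = new Thread(() -> {
            try {
                new ProcessBuilder(removeCommand(containerId))
                        .inheritIO()
                        .start()
                        .waitFor();
            } catch (IOException e) {
                System.err.println("docker rm failed: " + e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "docker-cleanup-" + containerId);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

JVM shutdown hooks do not run when the process is killed with `kill -9`, which is why the slide lists that case separately: something outside the proxy process has to clean up after it.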
  • 21. ARCHITECTURE – VOIDBOX FAULT TOLERANCE
      yarn-site.xml
          <property>
            <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
            <value>true</value>
          </property>
          <property>
            <name>yarn.resourcemanager.am.max-attempts</name>
            <value>100</value>
          </property>
          <property>
            <name>yarn.nodemanager.recovery.enabled</name>
            <value>true</value>
          </property>
      ApplicationSubmissionContext
          public abstract void setMaxAppAttempts(int maxAppAttempts);
          public abstract void setAttemptFailuresValidityInterval(long attemptFailuresValidityInterval);
          public abstract void setKeepContainersAcrossApplicationAttempts(boolean keepContainers);
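To make the interplay of `setMaxAppAttempts` and `setAttemptFailuresValidityInterval` concrete, here is a simplified model of a retry count reset window. It is a sketch of the idea only, not YARN's implementation; the class `AttemptWindow` and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified model of an attempt-failure validity window (not YARN's own code). */
final class AttemptWindow {
    private final int maxAttempts;
    private final long validityIntervalMs;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    AttemptWindow(int maxAttempts, long validityIntervalMs) {
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    /** Records a failure at nowMs; returns true if another attempt is still allowed. */
    boolean recordFailure(long nowMs) {
        failureTimes.addLast(nowMs);
        // Failures older than the validity interval no longer count against the limit.
        while (!failureTimes.isEmpty()
                && nowMs - failureTimes.peekFirst() > validityIntervalMs) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() < maxAttempts;
    }
}
```

With a window in place, a long-running master that fails occasionally is not killed for good just because its lifetime failure count eventually reaches the maximum; only failures clustered inside one interval exhaust the attempts.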
  • 22. ARCHITECTURE – VOIDBOX RESOURCE MANAGEMENT
      •  Management Center
          –  Add/remove Docker engines in the cluster
          –  Garbage collection of out-of-control containers (after a NodeManager crash)
          –  History log server
          –  Service discovery
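One way to picture the garbage-collection step is as a set difference: containers that a Docker engine reports as running but that no live YARN container accounts for. The slides do not say how Voidbox tracks either set, so both inputs and the class name `OrphanContainers` are assumptions.

```java
import java.util.Set;
import java.util.TreeSet;

/** Sketch: find Docker containers that are no longer owned by any YARN container. */
final class OrphanContainers {
    /** Returns IDs running in Docker that the (assumed) YARN-side bookkeeping does not list. */
    static Set<String> find(Set<String> runningInDocker, Set<String> ownedByYarn) {
        Set<String> orphans = new TreeSet<>(runningInDocker);
        orphans.removeAll(ownedByYarn);
        return orphans;
    }
}
```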
  • 23. ARCHITECTURE – LOGS
      •  VoidboxMaster’s log
          –  Specify a log4j.properties file when submitting an application to rotate the master’s log
      •  Docker container’s log
          –  Stdout and stderr log
              •  Copy the file, then truncate(0), to rotate the Docker container’s log
              •  Rotate the log file on the host (default: /var/lib/docker/containers/../containerid-json.log)
          –  A log directory in the Docker image
              •  Map a specified log volume; rotation is handled by a user-defined log rotation inside the Docker image
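The copy-then-truncate rotation mentioned for container logs can be sketched as follows. The class name `CopyTruncateRotator` and the file names in the usage are illustrative assumptions; the approach works because the writer keeps its file handle open while the file is emptied in place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Sketch: rotate a log that another process keeps open, without renaming it. */
final class CopyTruncateRotator {
    /** Copies the live log to rotated, then truncates the live log to zero bytes. */
    static void rotate(Path liveLog, Path rotated) {
        try {
            Files.copy(liveLog, rotated, StandardCopyOption.REPLACE_EXISTING);
            try (FileChannel ch = FileChannel.open(liveLog, StandardOpenOption.WRITE)) {
                ch.truncate(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Lines written between the copy and the truncate can be lost; that is the usual trade-off of copy-truncate rotation compared with rename-and-reopen.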
  • 24. CORE ARCHITECTURE – KEY COMPONENTS FRAMEWORK APP
  • 25. CORE ARCHITECTURE – KEY COMPONENTS
      •  Functionality
          •  Allocate/release container
          •  Launch/stop task in container
          •  Multi run mode: yarn-cluster, yarn-client, yarn-local
      •  Component
          •  VoidboxMaster
              •  Container resource allocation/release/preserving
              •  Container context management
          •  VoidboxProxy
              •  Docker container lifecycle management
          •  Management Center
              •  Docker engine management
              •  Garbage collection
      FRAMEWORK APP
  • 26. CORE ARCHITECTURE – KEY COMPONENTS
      •  DAG framework
          •  DAGDriver
              •  Task scheduler
              •  Task retry policy
          •  DAGContext
              •  Describe DAG docker job
      •  Service framework
          •  ServiceDriver
              •  Replication controller
              •  Health check, warm up
          •  ServiceContext
              •  Describe service docker job
      FRAMEWORK APP
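A task scheduler with a retry policy, as listed under DAGDriver, might look roughly like the sketch below. It runs tasks in dependency order and gives each task a bounded number of attempts. The class `DagRunner`, its signature, and the sequential execution are assumptions for illustration; the slides do not show the real DAGDriver API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Sketch: run tasks in dependency order with a per-task attempt limit. */
final class DagRunner {
    /**
     * deps maps every task to the tasks it depends on (each dependency must also be a key).
     * runTask returns true on success. Returns the completed tasks in execution order.
     */
    static List<String> run(Map<String, List<String>> deps,
                            Predicate<String> runTask, int maxAttempts) {
        Map<String, Integer> unmet = new LinkedHashMap<>();      // unmet dependency counts
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            unmet.put(e.getKey(), e.getValue().size());
            for (String d : e.getValue()) {
                dependents.computeIfAbsent(d, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unmet.entrySet()) {
            if (e.getValue() == 0) ready.addLast(e.getKey());
        }
        List<String> done = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.removeFirst();
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                ok = runTask.test(task);                         // retry until success or limit
            }
            if (!ok) continue;                                   // dependents never become ready
            done.add(task);
            for (String next : dependents.getOrDefault(task, List.of())) {
                if (unmet.merge(next, -1, Integer::sum) == 0) ready.addLast(next);
            }
        }
        return done;
    }
}
```

A task that exhausts its attempts simply blocks everything downstream of it; a real driver would also report the failure and release the containers it holds.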
  • 27. CORE ARCHITECTURE – KEY COMPONENTS
      •  DAG APP
          •  DAG programming based on the Framework
      •  Service APP
          •  Service programming based on the Framework
          •  Registers with a load balancer
      FRAMEWORK APP
  • 28. HOW TO USE VOIDBOX
      Advanced API:
      •  AMClient
          –  requestResource(cpu, memory)
          –  releaseResource(container)
      •  DockerTaskRunner
          –  run(task)
  • 29. HOW TO USE VOIDBOX P1 P2 C1 C2 C3 How to increase / decrease the number of consumers automatically?
  • 30. HOW TO USE VOIDBOX P1 P2 Message Queue Monitor Consumer Manager Docker Consumer Docker Consumer Docker Consumer AMClient TaskRunner
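The consumer manager in this diagram can be sketched as a small reconcile loop over the advanced API. Only the names `AMClient`, `requestResource`, `releaseResource`, `DockerTaskRunner` and `run` come from the slides; every parameter type, return type, the `DockerTask` class, the 1 CPU / 1024 MB request and the sizing rule are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical shape of the slide's AMClient; all types are assumptions. */
interface AMClient {
    String requestResource(int cpu, int memoryMb);   // assumed to return a container id
    void releaseResource(String container);
}

/** Hypothetical task description: which image to run in which container. */
final class DockerTask {
    final String container;
    final String image;
    DockerTask(String container, String image) { this.container = container; this.image = image; }
}

/** Hypothetical shape of the slide's DockerTaskRunner. */
interface DockerTaskRunner {
    void run(DockerTask task);
}

/** Sketch: grow or shrink the consumer pool to follow the queue depth reported by a monitor. */
final class ConsumerManager {
    private final AMClient am;
    private final DockerTaskRunner runner;
    private final String image;
    private final int messagesPerConsumer;
    private final Deque<String> containers = new ArrayDeque<>();

    ConsumerManager(AMClient am, DockerTaskRunner runner, String image, int messagesPerConsumer) {
        this.am = am;
        this.runner = runner;
        this.image = image;
        this.messagesPerConsumer = messagesPerConsumer;
    }

    /** Desired consumers = ceil(queueDepth / perConsumer), clamped to [min, max]. */
    static int desired(long queueDepth, int perConsumer, int min, int max) {
        long want = (queueDepth + perConsumer - 1) / perConsumer;
        return (int) Math.max(min, Math.min(max, want));
    }

    /** Moves the pool toward the desired size; returns the new number of consumers. */
    int reconcile(long queueDepth, int min, int max) {
        int target = desired(queueDepth, messagesPerConsumer, min, max);
        while (containers.size() < target) {
            String c = am.requestResource(1, 1024);      // assumed sizing: 1 CPU, 1024 MB
            runner.run(new DockerTask(c, image));
            containers.addLast(c);
        }
        while (containers.size() > target) {
            am.releaseResource(containers.removeLast()); // newest consumers go first
        }
        return containers.size();
    }
}
```

Calling `reconcile` on each monitor tick keeps the logic idempotent: the manager only acts on the difference between the current and the desired pool size.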
  • 32. VOIDBOX IN HULU RESOURCE BOUND HIGHLY PARALLELIZABLE COMPLEX ENVIRONMENT DEPENDENCIES VOIDBOX APPLICATION
  • 33. VOIDBOX BENEFITS
      •  Ease the creation of distributed applications
          –  Built-in service discovery, fault tolerance, scheduling, communication
          –  Support for complex environment dependencies
      •  Improve cluster efficiency
          –  Share the cluster with other data applications
      •  Simplify deployment
          –  Package only
          –  No deployment
      •  Flexibility
          –  Programming interface
  • 34. WHAT’S NEXT YARN VOIDBOX MESOS MAP REDUCE / SPARK / AURORA / MARATHON / KUBERNETES DOCKER M / R SPARK HIVE TEZ HBASE STORM SLIDER PAAS SAAS DISTRIBUTED APPLICATION