From http://www.csdn.net/article/2015-12-17/2826501
《Hulu 资深研发主管梁宇明 :Voidbox - Docker On YARN在Hulu的实践》
Docker 技术越来越得到了很多开发者的青睐,而YARN对于多数爱好者来说还是一个比较新的产品平台。如果两者放在一起融化会发生什么事情呢?来自Hulu公司的资深研发主管梁宇明为大家讲解了这一神奇的经历。他的演讲题目是《Voidbox - Docker On YARN在Hulu的实践》。因为基于YARN的大数据计算平台使得不同的计算框架可以在同一集群中混合部署,进而提升了集群资源利用率。
4. WHAT IS AUDIENCE PLATFORM
WORKFLOW ENGINE
HADOOP ECOSYSTEM: YARN + HDFS + MAP/REDUCE + HBASE + ZOOKEEPER
CONFIGURATION SERVICE
KILLUA / MATRIX
FIREWORK
SPARK
VOIDBOX
DOCKER ON YARN
ESTIMATION
SERVICE
NESTOBATCH FLOW: ETL + PROCESSING + SERVING
REAL-TIME FLOW: ETL + PROCESSING + SERVING
Lambda Architecture
SEGMENTATION SYSTEM OLAP
5. AGENDA
• Why Voidbox"
• What is Voidbox"
• Architecture"
• How to use Voidbox"
• Voidbox in Hulu"
• What’s next"
• Q&A"
6. WHY VOIDBOX
YARN: DATA OPERATION SYSTEM
CLUSTER
MAP /
REDUCE
TEZ SLIDER
HIVE PIG HBASESTORM
USER APPLICATION
Why only big data
application?
Can we run other
application on yarn?
What about task
processing, web
service?
7. WHY VOIDBOX
YARN: DISTRIBUTED OPERATION SYSTEM
CLUSTER
MAP /
REDUCE
TEZ SLIDER
HIVE PIG HBASESTORM
USER APPLICATION
TASK
SYSTEM
WEB
SERVICE
OTHER
Data App Other App
Is there any demand?
VOIDBOX
8. WHY VOIDBOX
20+ TASK IN
DIFFERENT
LANGUAGE
20+ VIRTUAL
MACHINE
LOW UTILIZATION
MACHINE FAILURE
VOIDBOX
13. ARCHITECTURE – DOCKER
Host OS
Server
bins/libs
App
A
App
B
Hypervisor
App
A
Guest OS Guest OS
bins/libs bins/libs
App
B
Docker Engine
App
A
bins/libs bins/libs
App
B
Docker Image
Docker
Registry
DOCKER
IMAGE
DOCKER
ENGINE
DOCKER
CONTAINERDOCKER
CONTAINERDOCKER
CONTAINER
Runtime
Isolation
bins/libs
App
B’
16. ARCHITECTURE – VOIDBOX
YARN modules
Resource
Manager
Voidbox
Client
NodeManager
NodeManager
NodeManager
Docker Engine
Docker Engine
Voidbox
Master
State Server
Container
Voidbox
Proxy
Container
Voidbox
Proxy
Docker
Registry
In GlusterFs
HDFS
HDFS
HDFS
Voidbox
Driver
Pull
Voidbox modules
Docker modules
Voidbox is a standard appliction on
top of YARN, just like MapReduce
server
server
server
server
22. ARCHITECTURE – VOIDBOX RESOURCE MANAGEMENT
• Management Center
– Add/remove Docker engine in cluster
– Garbage collection of out-of-control Container(NodeManage crashes)
– History Log Server
– Service discovery
23. ARCHITECTURE – LOGS
• VoidboxMaster’s log
– Specify log4j.properties to rotate master’s log when submit an application
• Docker container’s log
– Stdout, stderr log
• Copyfile, truncate(0) to rotate docker container’s log
• Rotate log file in host(default:/var/lib/docker/containers/../containerid-json.log)
– A log dir in Docker image
• Map specify log volume, by user-defined log rotate in Docker image
26. CORE
ARCHITECTURE – KEY COMPONENTS
• DAG framework
• DAGDriver
• Task scheduler
• Task retry policy
• DAGContext
• Describe DAG docker job
• Service framework
• ServiceDriver
• Replication controller
• Health check, warm up
• ServiceContext
• Describe service docker job
FRAMEWORK
APP
27. CORE
ARCHITECTURE – KEY COMPONENTS
• DAG APP
• DAG programming based on Framework
• Service APP
• Service programming based on Framework
• Register to load balance
FRAMEWORK
APP
28. HOW TO USE VOIDBOX
Advanced API:
• AMClient
– requestResource(cpu, memory)
– releaseResource(container)
• DockerTaskRunner
– run(task)