Big Data in Cloud
堵俊平
Apache Hadoop Committer
Staff Engineer, VMware
Bio: 堵俊平 (Junping Du)
- Joined VMware in 2008, working first on cloud products
- Initiated VMware's earliest big data effort in 2010
- Automated Hadoop deployment on vSphere, which later became the open source project Serengeti
- Has contributed to the Apache Hadoop community since 2012
- Recently became an Apache Hadoop committer, currently the only one in the UTC+8 time zone
Agenda
- Virtualization, SDDC and Cloud
- Trends I have observed in Big Data
- YARN: resource hub for Big Data applications
- YARN in the Cloud
What is Virtualization?
- See VMware’s vSphere
[Figure: vSphere architecture. Labels: Guest (TCP/IP, File System); Monitor (Virtual NIC, Virtual SCSI); VMkernel (Scheduler, Memory Manager, Virtual Switch, File System, NIC Drivers, I/O Drivers); Physical Hardware]
- The monitor emulates physical devices: CPU, memory, I/O
- The CPU is controlled by the scheduler and virtualized by the monitor
- Memory is allocated by the VMkernel and virtualized by the monitor
- Network and I/O devices are emulated and proxied through native device drivers
Server Virtualization Adoption on Path to 80% Over Next 5 Years
[Figure: charts of total x86 workloads (millions, axis 0 to 200) and % of x86 workloads virtualized (axis 0% to 100%), with year axes 2010 to 2018 and 2009 to 2016. Annotations: IDC, 2012 to 2016, change = +12 pts; Gartner, 2012 to 2016, change = +22 pts; IDC + VMW estimate, workloads (1), 2012 to 2016, CAGR = 21%; x86 % physical servers unvirtualized]
Source(s): IDC: Annual Virtualization Forecast, Feb-13; Gartner: x86 Server Virtualization, Worldwide, 3Q12 Update; Gartner: Forecast x86 Server Virtualization, Worldwide, 2008-2018, Jul-11; VMware estimates.
Note: Server workloads only. (1) Installed base totals assume a 5-year refresh.
Apps on Traditional Infrastructure
[Figure: application stacks on traditional infrastructure: Windows, Linux, Databases, Mission Critical, HPC, Big Data]
Apps on Software-Defined Data Center
[Figure: Windows, Linux, Mission Critical, Databases, HPC and Big Data apps running in VDCs on the Software-Defined Data Center]
Software-Defined Data Center Services: Abstract, Pool, Automate
Infrastructure for Traditional Apps
Infrastructure for Traditional Enterprise Apps: existing applications bound to vendor-specific hardware
Traditional applications: 83M (2012) to 141M (2016), +70%
- Hardware-based resiliency
- Hardware-based QoS
- Hard to automate
- Complex to scale
Infrastructure for New Apps
Infrastructure for New/Cloud/Data Apps: application-specific network and storage
Next-gen cloud applications: 6M (2012) to 48M (2016), +700%
- Software-based infrastructure
- Transformational economics
- Automation and agility
- Designed for scale
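The growth figures on the two infrastructure slides can be checked against the workload counts they show (83M to 141M, and 6M to 48M, between 2012 and 2016). A quick sketch, assuming the percentages are total growth over the period rather than annual rates, and that "M" means millions of workloads:

```python
def total_growth_pct(start: float, end: float) -> float:
    """Percentage growth from start to end over the whole period."""
    return (end - start) / start * 100.0

# Counts taken from the slides; units assumed to be millions of workloads.
traditional = total_growth_pct(83, 141)
next_gen = total_growth_pct(6, 48)

print(f"traditional apps: {traditional:.1f}%")
print(f"next-gen apps:    {next_gen:.1f}%")
```

Under these assumptions the traditional figure rounds to the 70% on the slide, and the next-gen figure matches 700%.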
SDDC Delivers Single Architecture for New and Existing Apps
Infrastructure for New/Cloud/Data Apps
Application Specific Network and Storage

Any Application

Infrastructure for Existing Enterprise Apps
Existing Application bound to vendor specific HW

Any Hardware
Let’s get back to Big Data …
New Trends in Big Data (from my observation)
- Hadoop 2.0: YARN acts as the key resource hub in the big data ecosystem
- MapReduce is not fast enough; faster engines such as Tez and Spark are needed
- HDFS is expanding to support more scenarios, e.g. caching for low-latency apps, snapshots for disaster recovery, storage tier awareness
- More Hadoop-based SQL engines: Apache Drill, Impala, Stinger, HAWQ, etc.
- For enterprise readiness, more effort is going into security, HA, QoS, monitoring and management
Hadoop MapReduce v1 (Classic)
• JobTracker
– Manages cluster resources and job scheduling

• TaskTracker
– Per-node agent
– Manages tasks
MapReduce v1 Limitations
• Scalability
– A single JobTracker manages both cluster resources and job scheduling

• SPOF (Single Point Of Failure)
– A JobTracker failure causes all queued and running jobs to fail
– Restart is very tricky due to complex state

• Hard partition of resources into map and reduce slots
– Low resource utilization

• Lacks support for alternate paradigms
• Lacks wire-compatible protocols
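To illustrate why a hard partition into map and reduce slots lowers utilization, here is a minimal sketch with made-up numbers. The slot counts and task counts are hypothetical, and the functions are illustrative only, not Hadoop code.

```python
def slot_utilization(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Fraction of slots busy when map and reduce slots are not interchangeable."""
    busy = min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)
    return busy / (map_slots + reduce_slots)

def pooled_utilization(total_units, map_tasks, reduce_tasks):
    """Fraction of capacity busy when any task may use any free unit."""
    busy = min(total_units, map_tasks + reduce_tasks)
    return busy / total_units

# Hypothetical node: 8 map slots + 4 reduce slots, during a map-heavy phase
# with 12 map tasks pending and no reduce tasks yet.
static = slot_utilization(8, 4, map_tasks=12, reduce_tasks=0)
pooled = pooled_utilization(12, map_tasks=12, reduce_tasks=0)
print(f"static slots: {static:.0%}, shared pool: {pooled:.0%}")
```

In this example the four reduce slots sit idle while map tasks wait, which is the situation a shared resource pool avoids.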
YARN Architecture
• Splits up the two major functions of the JobTracker
– ResourceManager (RM): cluster resource management
– ApplicationMaster (AM): task scheduling and monitoring

• NodeManager (NM): a new per-node slave
– Launches the applications’ containers
– Monitors their resource usage (CPU, memory) and reports it to the ResourceManager

• YARN maintains compatibility with existing MapReduce applications and supports other applications
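The split of responsibilities can be sketched as a toy model. This is not the YARN API: the class and method names below are hypothetical, memory is the only resource, and placement is simple first-fit.

```python
from dataclasses import dataclass, field

@dataclass
class NodeManager:
    """Per-node agent: launches containers and tracks local usage."""
    name: str
    memory_mb: int
    used_mb: int = 0
    containers: list = field(default_factory=list)

    def launch(self, app_id: str, memory_mb: int) -> None:
        self.used_mb += memory_mb
        self.containers.append((app_id, memory_mb))

class ResourceManager:
    """Cluster-wide resource bookkeeping only; knows nothing about task logic."""
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def allocate(self, app_id, memory_mb):
        # First-fit placement, a simplification for this sketch.
        for node in self.nodes:
            if node.memory_mb - node.used_mb >= memory_mb:
                node.launch(app_id, memory_mb)
                return node.name
        return None  # no capacity right now

class ApplicationMaster:
    """Per-application: decides what to run and tracks its own tasks."""
    def __init__(self, app_id, rm):
        self.app_id, self.rm = app_id, rm
        self.placements = {}

    def run_tasks(self, tasks):
        for task, memory_mb in tasks.items():
            self.placements[task] = self.rm.allocate(self.app_id, memory_mb)
        return self.placements

rm = ResourceManager([NodeManager("nm1", 4096), NodeManager("nm2", 4096)])
am = ApplicationMaster("app_1", rm)
placements = am.run_tasks({"map_0": 2048, "map_1": 2048, "reduce_0": 3072})
print(placements)
```

The point of the sketch is that the ResourceManager never sees "map" or "reduce"; any framework that supplies its own ApplicationMaster can request containers the same way.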
YARN – Hub for Big Data Applications
[Figure: OpenMPI, Impala, HBase, Distributed Shell, Spark, MapReduce, Tez and Storm running on YARN, on top of HDFS]


• App-specific AM
• HOYA (Hbase On YArn)
– Long running services (YARN-896)

• LLAMA (Low Latency Application MAster)
– Gang Scheduler (YARN-624)
YARN and Cloud
• Two different perspectives:
– YARN-centric perspective
• YARN is the key platform for apps
• YARN is independent of infrastructure; running on top of Cloud shows YARN’s generality

– Cloud-centric perspective
• YARN is an umbrella for one kind of application
• Supporting YARN shows Cloud’s generality
YARN and Cloud: YARN-centric Perspective
• YARN is the “OS”
• Infrastructure (whether physical or cloud) is the “hardware”
[Figure: Big Data apps (HBase, Open MPI, Distributed Shell, Spark, Impala, MapReduce, Tez, Storm, …) on YARN, on infrastructure: bare-metal machines or cloud infrastructure (VMware, OpenStack, …)]
YARN and Cloud: Cloud-centric Perspective
• Cloud Infrastructure is the “OS”
• YARN is a group of “processes”
[Figure: legacy apps, other Big Data apps and YARN apps (Open MPI, Distributed Shell, Spark, Impala, HBase, MapReduce, Tez, Storm, … on YARN), all running on Cloud Infrastructure (VMware, OpenStack, etc.)]
YARN vs. Cloud
• Similarity
– Both aim to share resources across applications
– Both provide global resource management

• YARN vs. Cloud
– YARN manages resources at the OS layer; Cloud manages resources at the hypervisor layer (not directly comparable, but a hypervisor provides stronger isolation than an OS)
– Apps managed by YARN need a specific AppMaster; apps managed by Cloud run exactly as they would on physical machines (Cloud +1)
– The YARN layer is closer to the big data app, so it can better understand and estimate the app’s requirements (YARN +1)
– The Cloud layer is closer to hardware resources, so it can more easily track real-time, global resource utilization (Cloud +1)
YARN + Cloud
• Why YARN + Cloud?
– Leverage virtualization for strong isolation, fine-grained resource sharing and other benefits
– Uniform infrastructure to simplify enterprise IT

• What does it look like?
– Run YARN NMs inside VMs managed by the Cloud Infrastructure
– Build a communication channel between the YARN RM and the Cloud Resource Manager for coordination

• How do we do it?
– The first item is easy and works smoothly
– The second can be achieved in two ways
• YARN becomes aware of, and can manipulate, Cloud resource changes
• YARN provides a generic resource notification mechanism that the Cloud Manager can use when resources change
Elastic YARN Node in the Cloud
[Figure: a virtual YARN node (VM with NodeManager, DataNode and containers, backed by a VMDK) on a virtualization host alongside other workloads. Questions raised: add/remove resources? Grow/shrink the resources of a VM, e.g. by tens of GB of memory?]
Elastic YARN Node in the Cloud
• A VM’s resource boundary can be elastic
– CPU is easy: time slicing (with constraints)
– Memory is harder: page sharing and memory ballooning
– In case of contention, enforce limits and proportional sharing
– “Stealing” resources behind apps’ backs can cause bad performance (paging)
– App-aware resource management could address these issues

• Hadoop YARN resource model
– Dynamic at cluster level: nodes can be added and removed
– But static per node

• Given this, shall we enable resource elasticity on VMs?
– If yes, performance is poor when resource contention happens
– If no, utilization is as low as on physical boxes, because free resources cannot be used by other, busy VMs

• We need a better answer.
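The dilemma can be made concrete with a small calculation. The host size and per-VM memory demands below are hypothetical, and the model is a simplification: static partitioning gives each VM an equal fixed share, while elastic sharing lets demand exceed host memory.

```python
def static_partition_utilization(host_gb, demands_gb):
    """Host utilization when each VM is capped at an equal, fixed share."""
    share = host_gb / len(demands_gb)
    used = sum(min(demand, share) for demand in demands_gb)
    return used / host_gb

def elastic_overcommit_gb(host_gb, demands_gb):
    """Demand beyond host memory; the hypervisor must reclaim it (ballooning, paging)."""
    return max(0, sum(demands_gb) - host_gb)

HOST_GB = 64  # hypothetical virtualization host

# One busy VM wants 48 GB, the other is nearly idle at 8 GB.
quiet_util = static_partition_utilization(HOST_GB, [48, 8])
quiet_over = elastic_overcommit_gb(HOST_GB, [48, 8])

# The second VM becomes busy too and wants 32 GB.
busy_over = elastic_overcommit_gb(HOST_GB, [48, 32])

print(quiet_util, quiet_over, busy_over)
```

With static shares the busy VM is held at 32 GB while 24 GB sits unused; with elasticity the idle period is fine, but once both VMs are busy 16 GB of demand has nowhere to go unless the applications are told to shrink.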
HVE provides the answer!
• Hadoop Virtualization Extensions
– A project initiated by VMware to enhance Hadoop running on virtualization
– A “driver” for the Hadoop “OS” running on cloud “hardware”

• Goal: make Hadoop cloud-ready
– Provide virtualization awareness to Hadoop, e.g. virtual topology, virtual resources
– Deliver generic utilities that any virtualized platform can leverage

• Independent of virtualization platform and cloud infrastructure
• 100% contributed to the Apache Hadoop community
HVE
• Philosophy
– Make infrastructure-related components abstract
– Deliver different implementations that can be configured as appropriate

• E.g. BlockPlacementPolicy
[Figure: BlockPlacementPolicy (abstract) with two implementations: BlockPlacementPolicyDefault and BlockPlacementPolicy for virtualization]
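The "abstract component plus configurable implementations" idea can be sketched as follows. This is not HDFS code: the class names, the `placement.policy` key and the selection logic are hypothetical, and `DefaultPolicy` is a trivial stand-in rather than HDFS's actual rack-aware algorithm.

```python
from abc import ABC, abstractmethod
from typing import NamedTuple

class Node(NamedTuple):
    name: str
    host: str  # physical host the VM (or bare-metal node) runs on

class PlacementPolicy(ABC):
    @abstractmethod
    def choose_targets(self, nodes, replicas):
        """Return the nodes that should hold the replicas."""

class DefaultPolicy(PlacementPolicy):
    """Stand-in default: takes nodes in order, unaware of physical hosts."""
    def choose_targets(self, nodes, replicas):
        return list(nodes)[:replicas]

class VirtualizationAwarePolicy(PlacementPolicy):
    """Avoids placing two replicas on VMs that share a physical host."""
    def choose_targets(self, nodes, replicas):
        chosen, hosts = [], set()
        for node in nodes:
            if node.host not in hosts:
                chosen.append(node)
                hosts.add(node.host)
            if len(chosen) == replicas:
                break
        return chosen

POLICIES = {"default": DefaultPolicy, "virtualization": VirtualizationAwarePolicy}

def load_policy(config):
    """Pick the implementation named in the (hypothetical) configuration."""
    return POLICIES[config.get("placement.policy", "default")]()

nodes = [Node("vm1", "hostA"), Node("vm2", "hostA"), Node("vm3", "hostB")]
default_names = [n.name for n in load_policy({}).choose_targets(nodes, 2)]
virt_names = [n.name for n in
              load_policy({"placement.policy": "virtualization"}).choose_targets(nodes, 2)]
print(default_names, virt_names)
```

The caller only depends on the abstract interface, so a virtualization-aware variant can be switched in through configuration without touching the rest of the system.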
Elastic YARN Node in the Cloud
• Given this, shall we enable resource elasticity on VMs?
• Yes, and we try to eliminate resource contention
– Notify YARN when a node’s resources change
– The YARN RM scheduler won’t schedule new tasks on congested nodes
– The YARN scheduler preempts low-priority tasks if necessary
– This work is tracked in YARN-291
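A minimal sketch of the behavior described above, under stated assumptions: memory is the only resource, a larger priority number means lower priority (a convention chosen for this sketch), and the class is illustrative rather than YARN's actual SchedulerNode.

```python
class SchedulerNode:
    """Toy view of one node as seen by the scheduler."""

    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.running = []  # (priority, memory_mb); larger number = lower priority

    @property
    def available_mb(self):
        # May go negative after the node's total has been reduced.
        return self.total_mb - sum(mb for _, mb in self.running)

    def can_schedule(self, memory_mb):
        return self.available_mb >= memory_mb

    def update_resource(self, new_total_mb, preempt=False):
        """Apply a resource change; optionally preempt until the node fits again."""
        self.total_mb = new_total_mb
        preempted = []
        if preempt:
            self.running.sort(key=lambda c: c[0])
            while self.available_mb < 0 and self.running:
                preempted.append(self.running.pop())  # lowest priority first
        return preempted

node = SchedulerNode(8192)
node.running = [(1, 4096), (5, 2048), (9, 2048)]

# The cloud layer shrinks the VM: the node is now over-committed.
node.update_resource(6144)
blocked = not node.can_schedule(1024)   # no new tasks land on this node

# If waiting is not acceptable, preempt low-priority work.
victims = node.update_resource(6144, preempt=True)
print(blocked, victims, node.available_mb)
```

Without preemption the node simply drains as running tasks finish; with preemption the lowest-priority container is removed so the remaining work fits the new size.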
Implementation – YARN-291 (umbrella)
• YARN-311
– Core scheduler changes

• YARN-312
– AdminProtocol changes

• YARN-313
– CLI

• REST API, JMX, etc.

[Figure: update path. Components shown: Cloud Resource Manager; Admin CLI; Resource Manager containing AdminService, RMContext (RMNode), Scheduler (SchedulerNode, Cluster Resource, UpdateNodeResource()) and Resource Tracker Service; Node Manager, connected by heartbeat]
Admin CLI: yarn rmadmin -updateNodeResource <NodeId> <Resource>
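The update path in the diagram can be sketched as a chain of calls from the admin entry point to the scheduler. The component names follow the slide, but the classes, method names and the memory-only resource below are assumptions for illustration, not the ResourceManager's real interfaces.

```python
class Scheduler:
    def __init__(self):
        self.node_capacity = {}  # SchedulerNode view: node id -> memory MB

    def update_node_resource(self, node_id, memory_mb):
        self.node_capacity[node_id] = memory_mb

    @property
    def cluster_resource(self):
        return sum(self.node_capacity.values())

class RMContext:
    def __init__(self):
        self.rm_nodes = {}  # RMNode view: node id -> memory MB

class AdminService:
    """Entry point used by the admin CLI (and, in this sketch, a cloud manager)."""

    def __init__(self, context, scheduler):
        self.context, self.scheduler = context, scheduler

    def update_node_resource(self, node_id, memory_mb):
        if node_id not in self.context.rm_nodes:
            raise KeyError(f"unknown node: {node_id}")
        self.context.rm_nodes[node_id] = memory_mb                 # update RMNode
        self.scheduler.update_node_resource(node_id, memory_mb)   # update SchedulerNode

context, scheduler = RMContext(), Scheduler()
for node_id in ("nm1:45454", "nm2:45454"):  # hypothetical node ids
    context.rm_nodes[node_id] = 8192
    scheduler.update_node_resource(node_id, 8192)

admin = AdminService(context, scheduler)
admin.update_node_resource("nm1:45454", 4096)
print(scheduler.cluster_resource)
```

Keeping the update behind one admin entry point means the CLI, a REST endpoint or a cloud resource manager can all drive the same change.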
Contributions to Apache Hadoop are welcome!
• Hadoop is the key platform
– For architecting Big Data
– Even a small contribution can change the world!

• An open source project is a great platform
– For people from different organizations to share great ideas and work
– The community is a great workplace

• Companies and individuals get credit
– For the work and resources they put in
– It is also an easy way to build an ecosystem and show expertise

• Big Data has many challenges, like building the Tower of Babel
– Open source is the common language that ensures we can work together
Key messages in today’s talk
• SDDC and Cloud are the future of enterprise IT architecture
• New trend in big data: YARN acts as an “OS” for big data apps
• At VMware, we try to support any “OS”, including YARN
• HVE acts as a “driver” that enables Hadoop on virtualization/cloud
• Contribute to Apache Hadoop
References
• YARN (MapReduce 2.0)
– https://issues.apache.org/jira/browse/MAPREDUCE-279

• HVE topology extension
– https://issues.apache.org/jira/browse/HADOOP-8468

• HVE topology extension for YARN
– https://issues.apache.org/jira/browse/YARN-18

• HVE elastic resource configuration
– https://issues.apache.org/jira/browse/YARN-291

• Gang Scheduling
– https://issues.apache.org/jira/browse/YARN-624

• Long-lived services in YARN
– https://issues.apache.org/jira/browse/YARN-896
堵俊平:Hadoop virtualization extensions

Weitere ähnliche Inhalte

Was ist angesagt?

Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Hortonworks
 
YARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerYARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerVertiCloud Inc
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopHortonworks
 
Provisioning Big Data Platform using Cloudbreak & Ambari
Provisioning Big Data Platform using Cloudbreak & AmbariProvisioning Big Data Platform using Cloudbreak & Ambari
Provisioning Big Data Platform using Cloudbreak & AmbariDataWorks Summit/Hadoop Summit
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN ApplicationsHortonworks
 
Cloudera Cares + DataKind | 7 May 2015 | London, UK
Cloudera Cares + DataKind | 7 May 2015 | London, UKCloudera Cares + DataKind | 7 May 2015 | London, UK
Cloudera Cares + DataKind | 7 May 2015 | London, UKCloudera, Inc.
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3Hortonworks
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...StampedeCon
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureDataWorks Summit
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Intel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessIntel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessCloudera, Inc.
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARNDataWorks Summit
 
Hadoop for the Data Scientist: Spark in Cloudera 5.5
Hadoop for the Data Scientist: Spark in Cloudera 5.5Hadoop for the Data Scientist: Spark in Cloudera 5.5
Hadoop for the Data Scientist: Spark in Cloudera 5.5Cloudera, Inc.
 
Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNDataWorks Summit
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformBikas Saha
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNHortonworks
 
Debugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in ProductionDebugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in ProductionXuan Gong
 

Was ist angesagt? (20)

Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014
 
YARN - Hadoop's Resource Manager
YARN - Hadoop's Resource ManagerYARN - Hadoop's Resource Manager
YARN - Hadoop's Resource Manager
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
 
Provisioning Big Data Platform using Cloudbreak & Ambari
Provisioning Big Data Platform using Cloudbreak & AmbariProvisioning Big Data Platform using Cloudbreak & Ambari
Provisioning Big Data Platform using Cloudbreak & Ambari
 
Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN Applications
 
Cloudera Cares + DataKind | 7 May 2015 | London, UK
Cloudera Cares + DataKind | 7 May 2015 | London, UKCloudera Cares + DataKind | 7 May 2015 | London, UK
Cloudera Cares + DataKind | 7 May 2015 | London, UK
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - Stamped...
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Intel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessIntel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data Success
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARN
 
Hadoop for the Data Scientist: Spark in Cloudera 5.5
Hadoop for the Data Scientist: Spark in Cloudera 5.5Hadoop for the Data Scientist: Spark in Cloudera 5.5
Hadoop for the Data Scientist: Spark in Cloudera 5.5
 
Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARN
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute Platform
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARN
 
Debugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in ProductionDebugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in Production
 

Andere mochten auch

Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)
Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)
Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)csharney
 
V Mworld 2010 Lab Cloud
V Mworld 2010 Lab CloudV Mworld 2010 Lab Cloud
V Mworld 2010 Lab Cloudcsharney
 
توسعه پایدار در صنعت ساخت
توسعه پایدار در صنعت ساختتوسعه پایدار در صنعت ساخت
توسعه پایدار در صنعت ساختamri k
 
V mware v sphere 5 fundamentals services kit
V mware v sphere 5 fundamentals services kitV mware v sphere 5 fundamentals services kit
V mware v sphere 5 fundamentals services kitsolarisyougood
 
V mware workbench_eclipse_con2011_talk
V mware workbench_eclipse_con2011_talkV mware workbench_eclipse_con2011_talk
V mware workbench_eclipse_con2011_talkalantztan
 
Cloud infrastructure licensing and pricing customer presentation
Cloud infrastructure licensing and pricing customer presentationCloud infrastructure licensing and pricing customer presentation
Cloud infrastructure licensing and pricing customer presentationxKinAnx
 
IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...
IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...
IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...Cliff Kinard
 
Covmug v sphere 4.1 what's new
Covmug v sphere 4.1 what's newCovmug v sphere 4.1 what's new
Covmug v sphere 4.1 what's newesarakaitis
 
Vsicm51 m02 virtualization_intro_
Vsicm51 m02 virtualization_intro_Vsicm51 m02 virtualization_intro_
Vsicm51 m02 virtualization_intro_VCAP5_wordpress
 
VMware_Snapshot sessions_Horizon vision and strategy
VMware_Snapshot sessions_Horizon vision and strategyVMware_Snapshot sessions_Horizon vision and strategy
VMware_Snapshot sessions_Horizon vision and strategyAnnSteyaert_vmware
 
Corporate Overview Presentation
Corporate Overview PresentationCorporate Overview Presentation
Corporate Overview Presentationepenedos
 
Presentation v mware virtualization & cloud vision 2010
Presentation   v mware virtualization & cloud vision 2010Presentation   v mware virtualization & cloud vision 2010
Presentation v mware virtualization & cloud vision 2010solarisyourep
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld
 
VMworld2008
VMworld2008VMworld2008
VMworld2008Nishka
 
Virtualization – A Year in Review with Eric Siebert
Virtualization – A Year in Review with Eric SiebertVirtualization – A Year in Review with Eric Siebert
Virtualization – A Year in Review with Eric SiebertSolarWinds
 
VMworld 2014: Virtualization 101
VMworld 2014: Virtualization 101VMworld 2014: Virtualization 101
VMworld 2014: Virtualization 101VMworld
 
Todd Muirhead (@virtualTodd) - VMware vSA
Todd Muirhead (@virtualTodd) - VMware vSATodd Muirhead (@virtualTodd) - VMware vSA
Todd Muirhead (@virtualTodd) - VMware vSADell TechCenter
 
VMware Primer
VMware PrimerVMware Primer
VMware Primermiker71
 
Lengow - International presentation
Lengow - International presentationLengow - International presentation
Lengow - International presentationnenadc
 

Andere mochten auch (20)

Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)
Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)
Vtug spring ahead Microsoft Storage Spaces by dan stolts (it pro-guru)
 
V Mworld 2010 Lab Cloud
V Mworld 2010 Lab CloudV Mworld 2010 Lab Cloud
V Mworld 2010 Lab Cloud
 
توسعه پایدار در صنعت ساخت
توسعه پایدار در صنعت ساختتوسعه پایدار در صنعت ساخت
توسعه پایدار در صنعت ساخت
 
V mware v sphere 5 fundamentals services kit
V mware v sphere 5 fundamentals services kitV mware v sphere 5 fundamentals services kit
V mware v sphere 5 fundamentals services kit
 
V mware workbench_eclipse_con2011_talk
V mware workbench_eclipse_con2011_talkV mware workbench_eclipse_con2011_talk
V mware workbench_eclipse_con2011_talk
 
Cloud infrastructure licensing and pricing customer presentation
Cloud infrastructure licensing and pricing customer presentationCloud infrastructure licensing and pricing customer presentation
Cloud infrastructure licensing and pricing customer presentation
 
IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...
IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...
IBM NYSE event - 1-16 Sanjay Katyal, VP of Global Alliances at VMware on VMwa...
 
Covmug v sphere 4.1 what's new
Covmug v sphere 4.1 what's newCovmug v sphere 4.1 what's new
Covmug v sphere 4.1 what's new
 
Vsicm51 m02 virtualization_intro_
Vsicm51 m02 virtualization_intro_Vsicm51 m02 virtualization_intro_
Vsicm51 m02 virtualization_intro_
 
VMware_Snapshot sessions_Horizon vision and strategy
VMware_Snapshot sessions_Horizon vision and strategyVMware_Snapshot sessions_Horizon vision and strategy
VMware_Snapshot sessions_Horizon vision and strategy
 
VMWARE
VMWAREVMWARE
VMWARE
 
Corporate Overview Presentation
Corporate Overview PresentationCorporate Overview Presentation
Corporate Overview Presentation
 
Presentation v mware virtualization & cloud vision 2010
Presentation   v mware virtualization & cloud vision 2010Presentation   v mware virtualization & cloud vision 2010
Presentation v mware virtualization & cloud vision 2010
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
 
VMworld2008
VMworld2008VMworld2008
VMworld2008
 
Virtualization – A Year in Review with Eric Siebert
Virtualization – A Year in Review with Eric SiebertVirtualization – A Year in Review with Eric Siebert
Virtualization – A Year in Review with Eric Siebert
 
VMworld 2014: Virtualization 101
VMworld 2014: Virtualization 101VMworld 2014: Virtualization 101
VMworld 2014: Virtualization 101
 
Todd Muirhead (@virtualTodd) - VMware vSA
Todd Muirhead (@virtualTodd) - VMware vSATodd Muirhead (@virtualTodd) - VMware vSA
Todd Muirhead (@virtualTodd) - VMware vSA
 
VMware Primer
VMware PrimerVMware Primer
VMware Primer
 
Lengow - International presentation
Lengow - International presentationLengow - International presentation
Lengow - International presentation
 

Ähnlich wie 堵俊平:Hadoop virtualization extensions

Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopHortonworks
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesDataWorks Summit
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionWangda Tan
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioningStanley Wang
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : BeginnersShweta Patnaik
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : BeginnersShweta Patnaik
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : BeginnersShweta Patnaik
 
YARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupYARN - Presented At Dallas Hadoop User Group
YARN - Presented At Dallas Hadoop User GroupRommel Garcia
 
RightScale Webinar: Key Considerations For Cloud Migration and Portability
RightScale Webinar:  Key Considerations For Cloud Migration and PortabilityRightScale Webinar:  Key Considerations For Cloud Migration and Portability
RightScale Webinar: Key Considerations For Cloud Migration and PortabilityRightScale
 
2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0
2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.02013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0
2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0Adam Muise
 
How YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopHow YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopPOSSCON
 
Hadoop in the Clouds, Virtualization and Virtual Machines
Hadoop in the Clouds, Virtualization and Virtual MachinesHadoop in the Clouds, Virtualization and Virtual Machines
Hadoop in the Clouds, Virtualization and Virtual MachinesDataWorks Summit
 
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformInMobi Technology
 
Hadoop and NoSQL joining forces by Dale Kim of MapR
Hadoop and NoSQL joining forces by Dale Kim of MapRHadoop and NoSQL joining forces by Dale Kim of MapR
Hadoop and NoSQL joining forces by Dale Kim of MapRData Con LA
 

Ähnlich wie 堵俊平:Hadoop virtualization extensions (20)

Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
 
Containerized Hadoop beyond Kubernetes
Junping Du (堵俊平): Hadoop Virtualization Extensions

  • 1. Big Data in Cloud. 堵俊平 (Junping Du), Apache Hadoop Committer; Staff Engineer, VMware
  • 2. Bio: 堵俊平 (Junping Du) - Joined VMware in 2008, working first on cloud products - Initiated the earliest big data effort within VMware in 2010 - Automated Hadoop deployment on vSphere, which later became the open source project Serengeti - Has contributed to the Apache Hadoop community since 2012 - Recently became an Apache Hadoop committer, currently the only one in the UTC+8 time zone
  • 3. Agenda - Virtualization, SDDC and Cloud - Big Data trends from my observation - YARN: resource hub for Big Data applications - YARN in the Cloud
  • 4. What is Virtualization? (See VMware's vSphere.) [Diagram labels: Guest, TCP/IP, File System, Monitor, Virtual NIC, Virtual Switch, Virtual SCSI, Scheduler, Memory Manager, NIC Drivers, I/O Drivers, VMkernel, Physical Hardware] - The monitor emulates physical devices: CPU, memory, I/O - The CPU is controlled by the scheduler and virtualized by the monitor - Memory is allocated by the VMkernel and virtualized by the monitor - Network and I/O devices are emulated and proxied through native device drivers
  • 5. Server Virtualization Adoption on Path to 80% Over Next 5 Years. [Chart: % of x86 workloads virtualized (2009 to 2016) and total x86 workloads in millions (2010 to 2018). Annotations: IDC, 2012 to 2016 change = +12 pts; Gartner, 2012 to 2016 change = +22 pts; IDC + VMware estimate of workloads (note 1), 2012 to 2016 CAGR = 21%.] Source(s): IDC: Annual Virtualization Forecast, Feb-13; Gartner: x86 Server Virtualization, Worldwide, 3Q12 Update; Gartner: Forecast x86 Server Virtualization, Worldwide, 2008-2018, Jul-11; VMware estimates. Note: Server workloads only. (1) Installed base totals assume a 5-year refresh.
  • 6. Apps on Traditional Infrastructure: Windows, Linux, Databases, Mission Critical, HPC, Big Data
  • 7. Apps on Software-Defined Data Center: Windows, Linux, Mission Critical, Databases, HPC, Big Data. [Diagram: the apps run in VDCs on top of the Software-Defined Data Center.] Software-Defined Data Center services: Abstract, Pool, Automate
  • 8. Infrastructure for Traditional Apps. Traditional applications: 83M in 2012, 141M in 2016 (70%). Infrastructure for traditional enterprise apps: - Existing applications bound to vendor-specific hardware - Hardware-based resiliency - Hardware-based QoS - Hard to automate - Complex to scale
  • 9. Infrastructure for New Apps. Next-generation cloud applications: 6M in 2012, 48M in 2016 (700%). Infrastructure for new/cloud/data apps: - Application-specific network and storage - Software-based infrastructure - Transformational economics - Automation and agility - Designed for scale
  • 10. SDDC Delivers a Single Architecture for New and Existing Apps. Any application: - Infrastructure for new/cloud/data apps: application-specific network and storage - Infrastructure for existing enterprise apps: existing applications bound to vendor-specific hardware. Any hardware.
  • 11. Let’s get back to Big Data … New trends in Big Data, from my observation: - In Hadoop 2.0, YARN acts as the key resource hub of the big data ecosystem - MapReduce is not fast enough; we need faster engines such as Tez and Spark - HDFS is adding support for more scenarios, e.g. caching for low-latency apps, snapshots for disaster recovery, and storage-tier awareness - More Hadoop-based SQL engines: Apache Drill, Impala, Stinger, HAWQ, etc. - To become enterprise-ready, more effort is going into security, HA, QoS, and monitoring and management
  • 12. Hadoop MapReduce v1 (Classic) • JobTracker – Manages cluster resources and job scheduling • TaskTracker – Per-node agent – Manages tasks
  • 13. MapReduce v1 Limitations • Scalability – A single JobTracker manages both cluster resources and job scheduling • SPOF (Single Point Of Failure) – A JobTracker failure causes all queued and running jobs to fail – Restart is very tricky due to complex state • Hard partition of resources into map and reduce slots – Low resource utilization • Lacks support for alternate paradigms • Lacks wire-compatible protocols
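The slot limitation above can be illustrated with a small sketch. It is not Hadoop code: the class names, the slot counts, and the assumption that every task needs exactly one slot or container are all hypothetical, chosen only to show why a hard map/reduce split leaves capacity idle during a map-heavy phase.

```python
from dataclasses import dataclass


@dataclass
class SlotNode:
    """Hypothetical MRv1-style node: capacity is pre-split into typed slots."""
    map_slots: int
    reduce_slots: int

    def runnable(self, pending_maps: int, pending_reduces: int) -> int:
        # A map task can only use a map slot, a reduce task only a reduce slot.
        return min(pending_maps, self.map_slots) + min(pending_reduces, self.reduce_slots)


@dataclass
class ContainerNode:
    """Hypothetical YARN-style node: one pool of generic containers."""
    containers: int

    def runnable(self, pending_maps: int, pending_reduces: int) -> int:
        # Any pending task may take any free container.
        return min(pending_maps + pending_reduces, self.containers)


# Both nodes have the same total capacity of 10 task units (an assumption).
slot_node = SlotNode(map_slots=6, reduce_slots=4)
pool_node = ContainerNode(containers=10)

# Map-heavy phase: 10 map tasks are waiting and no reduce task is ready yet.
slot_running = slot_node.runnable(pending_maps=10, pending_reduces=0)
pool_running = pool_node.runnable(pending_maps=10, pending_reduces=0)

print(f"slot-based utilization: {slot_running / 10:.0%}")
print(f"container-based utilization: {pool_running / 10:.0%}")
```

In this toy setup the four reduce slots stay idle although map tasks are queued, which is the low utilization the slide refers to.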
  • 14. YARN Architecture • Splits up the two major functions of the JobTracker – Resource Manager (RM): cluster resource management – Application Master (AM): task scheduling and monitoring • NodeManager (NM): a new per-node slave – Launches the applications’ containers – Monitors their resource usage (CPU, memory) and reports it to the Resource Manager • YARN maintains compatibility with existing MapReduce applications and supports other applications
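A minimal sketch of the division of labor described above, assuming a memory-only resource model and a first-fit placement rule. None of the classes or methods below is the real YARN API; they are hypothetical stand-ins for the three roles (RM, AM, NM).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class NodeManager:
    """Per-node agent: launches containers and tracks their memory usage."""
    name: str
    memory_mb: int
    used_mb: int = 0
    containers: List[Tuple[str, int]] = field(default_factory=list)

    def free_mb(self) -> int:
        return self.memory_mb - self.used_mb

    def launch(self, app_id: str, mb: int) -> None:
        self.used_mb += mb
        self.containers.append((app_id, mb))


class ResourceManager:
    """Cluster-wide resource management only; it knows nothing about tasks."""

    def __init__(self, nodes: List[NodeManager]) -> None:
        self.nodes = nodes

    def allocate(self, app_id: str, mb: int) -> Optional[str]:
        # First-fit placement (an assumption; real schedulers are far richer).
        for node in self.nodes:
            if node.free_mb() >= mb:
                node.launch(app_id, mb)
                return node.name
        return None  # the cluster has no node with enough free memory


class ApplicationMaster:
    """Per-application task scheduling: asks the RM for one container per task."""

    def __init__(self, app_id: str, rm: ResourceManager) -> None:
        self.app_id = app_id
        self.rm = rm

    def run_tasks(self, count: int, mb_per_task: int) -> List[Optional[str]]:
        return [self.rm.allocate(self.app_id, mb_per_task) for _ in range(count)]


rm = ResourceManager([NodeManager("nm1", 4096), NodeManager("nm2", 2048)])
placements = ApplicationMaster("app-1", rm).run_tasks(count=4, mb_per_task=2048)
print(placements)
```

The fourth request comes back as `None` because both nodes are full; in this sketch the AM, not the RM, would decide whether to retry or wait.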
  • 15. YARN – Hub for Big Data Applications. [Diagram: OpenMPI, Impala, HBase, Distributed Shell, Spark, MapReduce, Tez, and Storm on top of YARN, which sits on top of HDFS] • App-specific AM • HOYA (HBase On YArn) – Long-running services (YARN-896) • LLAMA (Low Latency Application MAster) – Gang scheduler (YARN-624)
  • 16. YARN and Cloud • Two different perspectives: – YARN-centric perspective • YARN is the key platform for apps • YARN is independent of infrastructure; running on top of the cloud shows YARN’s generality – Cloud-centric perspective • YARN is an umbrella for one kind of application • Supporting YARN shows the cloud’s generality
  • 17. YARN and Cloud: YARN-centric Perspective • YARN is the “OS” for big data apps • Infrastructure (whether physical or cloud) is the “hardware”. [Diagram: HBase, Open MPI, Distributed Shell, Spark, Impala, MapReduce, Tez, Storm, … on top of YARN, which runs on infrastructure: bare-metal machines or cloud infrastructure (VMware, OpenStack, …)]
  • 18. YARN and Cloud: Cloud-centric Perspective • Cloud infrastructure is the “OS” • YARN is a group of “processes”. [Diagram: legacy apps, other big data apps, and YARN apps (Open MPI, Distributed Shell, Spark, Impala, HBase, MapReduce, Tez, Storm, … on YARN) all run on cloud infrastructure (VMware, OpenStack, etc.)]
  • 19. YARN vs. Cloud • Similarity – Both aim to share resources across applications – Both provide global resource management • YARN vs. Cloud – YARN manages resources at the OS layer, the cloud manages resources at the hypervisor layer (not directly comparable, but a hypervisor provides stronger isolation than an OS) – Apps managed by YARN need a specific AppMaster; apps managed by the cloud run exactly as they do on physical machines (Cloud +1) – The YARN layer is close to the big data app and can better understand and estimate the app’s requirements (YARN +1) – The cloud layer is close to the hardware resources and can more easily track real-time, global resource utilization (Cloud +1)
  • 20. YARN + Cloud • Why YARN + Cloud? – Leverage virtualization for strong isolation, fine-grained resource sharing, and other benefits – A uniform infrastructure simplifies enterprise IT • What does it look like? – Run YARN NMs inside VMs managed by the cloud infrastructure – Build a communication channel between the YARN RM and the Cloud Resource Manager for coordination • How do we do it? – The first item above is easy and straightforward – The second can be achieved in two ways • YARN becomes aware of, or manipulates, cloud resource changes • YARN provides a generic resource notification mechanism that the Cloud Manager can use when resources change
  • 21. Elastic YARN Node in the Cloud. [Diagram: a virtualization host runs a virtual YARN node (NodeManager and DataNode, with containers, backed by a VMDK) next to other workloads. Questions posed: Add/remove resources for containers? Grow/shrink the resources of a VM? Grow/shrink memory by tens of GB?]
  • 22. Elastic YARN Node in the Cloud • A VM’s resource boundary can be elastic – CPU is easy: time slicing (with constraints) – Memory is harder: page sharing and memory ballooning – In case of contention, enforce limits and proportional sharing – “Stealing” resources behind the apps’ backs can cause bad performance (paging) – App-aware resource management could address these issues • Hadoop YARN resource model – Dynamic at the cluster level, by adding/removing nodes – But static per node • In this case, should we enable resource elasticity on the VM? – If yes, performance is poor when resource contention happens – If no, utilization is as low as on physical boxes, because free resources cannot be used by other busy VMs • We need a better answer.
  • 23. HVE provides the answer! • Hadoop Virtualization Extensions – A project initiated by VMware to enhance Hadoop running on virtualization – A “driver” for the Hadoop “OS” running on cloud “hardware” • Goal: make Hadoop cloud-ready – Provide virtualization awareness to Hadoop, i.e. virtual topology, virtual resources, etc. – Deliver generic utilities that can be leveraged by any virtualized platform • Independent of virtualization platform and cloud infrastructure • 100% contributed to the Apache Hadoop community
  • 24. HVE • Philosophy – Make infrastructure-related components abstract – Deliver different implementations that can be selected through configuration • E.g. BlockPlacementPolicy (abstract), implemented by BlockPlacementPolicy Default and BlockPlacementPolicy For Virtualization
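The philosophy above (an abstract component plus implementations chosen by configuration) can be sketched as follows. This is an illustration only, not Hadoop's `BlockPlacementPolicy` code: the class names, the `host` field, the policy registry, and the placement rules themselves are hypothetical simplifications. The assumed rule for the virtualization-aware variant is that no two replicas land on VMs sharing one physical host.

```python
from abc import ABC, abstractmethod
from typing import List, NamedTuple


class Node(NamedTuple):
    name: str  # the (virtual) data node
    host: str  # physical host the node runs on (hypothetical field)


class PlacementPolicy(ABC):
    """Abstract, infrastructure-related component."""

    @abstractmethod
    def choose_targets(self, nodes: List[Node], replicas: int) -> List[Node]:
        ...


class DefaultPlacement(PlacementPolicy):
    """Simplified default: ignores which physical host a node runs on."""

    def choose_targets(self, nodes: List[Node], replicas: int) -> List[Node]:
        return list(nodes[:replicas])


class VirtualizationAwarePlacement(PlacementPolicy):
    """Never puts two replicas on VMs that share one physical host."""

    def choose_targets(self, nodes: List[Node], replicas: int) -> List[Node]:
        chosen: List[Node] = []
        used_hosts = set()
        for node in nodes:
            if len(chosen) == replicas:
                break
            if node.host not in used_hosts:
                chosen.append(node)
                used_hosts.add(node.host)
        return chosen


# Hypothetical registry standing in for a configuration-driven class lookup.
POLICIES = {"default": DefaultPlacement, "virtualization": VirtualizationAwarePlacement}


def load_policy(configured_name: str) -> PlacementPolicy:
    return POLICIES[configured_name]()


nodes = [Node("vm1", "hostA"), Node("vm2", "hostA"), Node("vm3", "hostB"), Node("vm4", "hostC")]
default_targets = [n.name for n in load_policy("default").choose_targets(nodes, 3)]
virt_targets = [n.name for n in load_policy("virtualization").choose_targets(nodes, 3)]
print(default_targets, virt_targets)
```

With the simplified default policy, two replicas end up on `hostA`, so one physical failure would take out both; the virtualization-aware policy skips `vm2` for that reason.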
  • 25. Elastic YARN Node in the Cloud • In this case, should we enable resource elasticity on the VM? • Yes, and we try to get rid of resource contention – Notify YARN that a node’s resources have changed – The YARN RM scheduler won’t schedule new tasks on congested nodes – The YARN scheduler preempts low-priority tasks if necessary – The work is tracked in YARN-291
  • 26. Implementation – YARN-291 (umbrella) • YARN-311 – Core scheduler changes • YARN-312 – AdminProtocol changes • YARN-313 – CLI • REST API, JMX, etc. [Diagram components: Cloud Resource Manager; Admin CLI (yarn rmadmin -updateNodeResource <NodeId> <Resource>); AdminService; UpdateNodeResource(); Resource Manager; Scheduler; SchedulerNode; Cluster Resource; RMContext; RMNode; Resource Tracker Service; Heartbeat; Node Manager]
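The behavior described for YARN-291 (update a node's resources at runtime, stop placing new tasks on a congested node, preempt low-priority tasks if necessary) can be sketched as below. This is a toy model under stated assumptions, not the actual patch: memory is the only resource, a larger `priority` value means more important, and `Scheduler`, `SchedulerNode`, and `update_node_resource` are hypothetical Python stand-ins for the components named on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    name: str
    memory_mb: int
    priority: int  # assumption: larger value means more important


@dataclass
class SchedulerNode:
    node_id: str
    total_mb: int
    tasks: List[Task] = field(default_factory=list)

    def available_mb(self) -> int:
        # Can go negative right after the node has been shrunk.
        return self.total_mb - sum(t.memory_mb for t in self.tasks)


class Scheduler:
    def __init__(self, nodes: List[SchedulerNode]) -> None:
        self.nodes: Dict[str, SchedulerNode] = {n.node_id: n for n in nodes}

    def schedule(self, node_id: str, task: Task) -> bool:
        node = self.nodes[node_id]
        if node.available_mb() < task.memory_mb:
            return False  # congested node: no new tasks are placed here
        node.tasks.append(task)
        return True

    def update_node_resource(self, node_id: str, new_total_mb: int,
                             preempt: bool = False) -> List[str]:
        """Apply a resource change reported from outside (e.g. by a cloud manager)."""
        node = self.nodes[node_id]
        node.total_mb = new_total_mb
        preempted: List[str] = []
        if preempt:
            # Evict lowest-priority tasks first until the node fits again.
            for task in sorted(node.tasks, key=lambda t: t.priority):
                if node.available_mb() >= 0:
                    break
                node.tasks.remove(task)
                preempted.append(task.name)
        return preempted


scheduler = Scheduler([SchedulerNode("vm-node-1", total_mb=8192)])
scheduler.schedule("vm-node-1", Task("etl", 4096, priority=1))
scheduler.schedule("vm-node-1", Task("serving", 3072, priority=5))

# The VM is shrunk to 4 GB: running tasks continue, but nothing new is placed.
scheduler.update_node_resource("vm-node-1", 4096)
accepted_while_congested = scheduler.schedule("vm-node-1", Task("new", 1024, priority=3))

# With preemption enabled, the low-priority task is evicted to relieve the node.
preempted = scheduler.update_node_resource("vm-node-1", 4096, preempt=True)
accepted_after_preemption = scheduler.schedule("vm-node-1", Task("new", 1024, priority=3))
print(accepted_while_congested, preempted, accepted_after_preemption)
```

Keeping preemption optional mirrors the "if necessary" wording on the previous slide: shrinking a node first only blocks new work, and eviction is a separate, explicit step.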
  • 27. Contributions to Apache Hadoop are welcome! • Hadoop is the key platform – For architecting Big Data – Contributing a little can change the world! • An open source project is a great platform – For people from different organizations to share great ideas and work – The community is a great workplace • Companies and individuals get credit – For the work and resources they put in – It also makes it easy to build an ecosystem and show expertise • Big Data has many challenges, like building the Tower of Babel – Open source is the common language that makes sure we can work together
  • 28. Key messages in today’s talk • SDDC and Cloud are the future for architecting enterprise IT • New trend in big data: YARN acts as an “OS” for big data apps • At VMware, we try to support any “OS”, including YARN • HVE acts as a “driver” that enables Hadoop on virtualization/cloud • Contribute to Apache Hadoop
  • 29. References • YARN (MapReduce 2.0) – https://issues.apache.org/jira/browse/MAPREDUCE-279 • HVE topology extension – https://issues.apache.org/jira/browse/HADOOP-8468 • HVE topology extension for YARN – https://issues.apache.org/jira/browse/YARN-18 • HVE elastic resource configuration – https://issues.apache.org/jira/browse/YARN-291 • Gang scheduling – https://issues.apache.org/jira/browse/YARN-624 • Long-lived services in YARN – https://issues.apache.org/jira/browse/YARN-896