HUAWEI TECHNOLOGIES CO., LTD. www.huawei.com
Energy Efficient
VM Placement
Ulrich Kleber <ulrich.kleber@huawei.com>
Kurt Garloff <huawei@garloff.de>
Radu Tudoran <radu.tudoran@huawei.com>
OpenStack Summit Vancouver 2015
The Energy Ceiling
Source: - Ian Bitterlin and Jon Summers, UoL, UK, Jul 2013
- Alexandru Iosup, Delft University, The Netherlands, Jan 2015
Over 500 YouTube videos have at least 100,000,000 viewers each
If you want to help kill the planet:
https://www.youtube.com/watch?v=9bZkp7q19f0
PSY's Gangnam Style consumed >300 GWh
- More than some countries in a year
- Over 35 MW of 24/7/365 diesel, 100M liters of oil
- 80,000 cars running for a year
Motivating questions
• How much energy is wasted by idle resources?
• How much energy can be saved by re-scheduling the execution of VMs?
• What is the relation between energy consumption and load?
• How should VMs be rescheduled to save energy?
Roadmap
1. Evaluate overall cluster energy consumption
2. Zoom in on the node energy consumption
3. Evaluate the node performance-energy ratio
4. Energy comparison of VM scheduling strategies
The hardware setup
E9000 blades:
- CH222: 2x Xeon E5-2680 (8-core SB), 256 GB RAM, 15x 900 GB SAS disks, 800 GB SSD (cache), 2x 10GigE
- CH121: 2x Xeon E5-2680 (8-core SB), 128 GB RAM, 2x 900 GB SAS disks, 2x 10GigE
- Overall: 40 CPUs, 240 cores, 3.5 TB RAM
FusionSphere system (diagram: two E9000 blade chassis, CE12804 + CE12804, UDS sub-system):
- E9000 blade chassis: 4x CH222, 8x CH121
- E9000 blade chassis: 4x CH222, 4x CH121
UDS:
- 3x A-Node + 2x UDSN
- 150 disks, 4 TB each
- Total: 600 TB raw
Block storage: FusionStorage/DSware (distributed replicated storage on the CH222s)
Methodology
The E9000 has BMC capabilities that allow real-time power consumption to be measured.
Power can be read from the web interface at both chassis (HMM) and blade (iMana) level (also for PSUs).
It can also be accessed via the command line of the embedded ARM/MIPS Linux system:
    smmget -l shelf -d realtimepower
    ipmcget -t sensor -d list
Power and consumed energy are measured at both node and cluster level.
Experiment 1: Methodology
• 4 vCPUs and 8 GB memory per VM
• 2 clusters with a FusionManager and OpenStack Havana (FS5)
• Some nodes reserved (idling/switched off)
• Warm data center (~35°C)
• Induce load and measure the energy consumption
  - using the Linux stress tool
  - using a synthetic benchmark
• 5-10 samples collected ~1 minute apart and averaged
  - measurements are performed after the cluster's energy consumption has stabilized (~1 minute after the operation is started)
Diagram: software stack on each node (OpenStack-based hypervisor, virtual hardware, OS, application)
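The sampling procedure on this slide can be sketched in a few lines. This is only an illustration, not the authors' tooling: the function name `steady_state_power`, the 60-second settle time, and the readings are assumptions made for the example. In practice the readings would come from the chassis or blade BMC.

```python
from statistics import mean

def steady_state_power(samples, settle_s=60.0):
    """Average power (W) over readings taken after the settle time.

    samples: list of (seconds_since_start, watts) tuples.
    settle_s: readings earlier than this are discarded (assumed ~1 minute).
    """
    stable = [watts for t, watts in samples if t >= settle_s]
    if not stable:
        raise ValueError("no samples after the settle time")
    return mean(stable)

# Hypothetical readings, one per minute; the first is taken during ramp-up.
readings = [(0, 2100.0), (60, 2950.0), (120, 3010.0), (180, 2990.0),
            (240, 3000.0), (300, 3050.0)]
print(f"{steady_state_power(readings):.1f} W")
```

Discarding the ramp-up reading keeps the average from being pulled down by the period before the load has settled.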
Experiment 1: Cluster energy consumption
Determine cluster energy consumption based on load
• Scale the cluster occupancy:
  - in steps of 10 VMs, each step ~9% of the compute capacity
• Use the stress tool to induce constant load in VMs
  - CPU consumption: 3 threads spinning over sqrt
  - Memory consumption: 3 threads spinning over alloc/dealloc
• Compare with the idle cluster as baseline, when:
  - VMs hibernate
  - VMs run but are idle
Diagram: two hypervisor nodes, each hosting two VMs (virtual hardware + OS)
Measurements (1)
60% difference between working and idle cluster
Experiment 2: Node energy consumption
Determine the node energy consumption based on load
• Fully occupy a node: 8 VMs to occupy the 32 CPU threads
• Fully use the VM compute power: 6 threads per VM (4 vCPUs)
• Use the stress tool to induce different loads in VMs
  - CPU load: spinning over sqrt
  - Memory load: spinning over alloc/dealloc
  - IO load: spinning over sync
  - HDD load: spinning over write/unlink
• Compare with the idle node and the powered-off node
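The four load types above match the worker types of the stress tool. A minimal sketch of how the per-VM commands might be assembled follows. The slides do not show the exact invocation, so the helper name `stress_command` and the 600-second timeout are assumptions. The flags themselves (`--cpu`, `--vm`, `--io`, `--hdd`, `--timeout`) are standard stress options.

```python
import shlex

# Load type -> stress worker flag (as documented in the stress man page).
STRESS_FLAGS = {
    "cpu": "--cpu",     # workers spinning on sqrt()
    "memory": "--vm",   # workers spinning on malloc()/free()
    "io": "--io",       # workers spinning on sync()
    "hdd": "--hdd",     # workers spinning on write()/unlink()
}

def stress_command(load, workers=6, timeout_s=600):
    """Return the argv list for one stress run inside a VM.

    workers=6 follows the "6 threads per VM" setting on the slide;
    timeout_s is an assumed run length, not taken from the talk.
    """
    if load not in STRESS_FLAGS:
        raise ValueError(f"unknown load type: {load}")
    return ["stress", STRESS_FLAGS[load], str(workers),
            "--timeout", f"{timeout_s}s"]

for load in STRESS_FLAGS:
    print(shlex.join(stress_command(load)))
```

Building an argv list rather than a shell string makes it straightforward to pass the command to whatever remote-execution mechanism runs it in each VM.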
Measurements (2)
Saving ~100 W per switched-off idle node
Hard disk load causes the storage cluster to consume power
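To put the ~100 W figure in perspective, a back-of-the-envelope conversion to yearly energy is sketched below. Only the 100 W default comes from the slide. The function name and the `off_fraction` parameter (the share of the year a node stays off) are assumptions for the example.

```python
HOURS_PER_YEAR = 24 * 365

def annual_savings_kwh(nodes_off, watts_saved_per_node=100.0, off_fraction=1.0):
    """Energy saved per year (kWh) by powering off idle nodes.

    off_fraction: share of the year the nodes stay off (assumption).
    """
    return nodes_off * watts_saved_per_node * HOURS_PER_YEAR * off_fraction / 1000.0

print(annual_savings_kwh(1))                    # one node off all year
print(annual_savings_kwh(4, off_fraction=0.5))  # four nodes off half the time
```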
Preliminary conclusions
• CPU + memory intensive patterns seem to be the most energy consuming per node
• External storage increases total energy consumption
• Significant energy difference per node between powered-off and idle states
  - Significant energy savings for mostly idle clusters (50+%)
  - Reschedule VMs to empty some nodes?
    - But how does energy relate to performance?
    - Does lower average power consumption mean lower energy for a fixed workload?
Reschedule in order to empty nodes or to distribute the load?
3 possible scheduling strategies
Diagram: hypervisor nodes hosting VMs (virtual hardware, OS, application) in each of the three scenarios
Scenario 1: VMs are running across multiple nodes
Scenario 2: VMs are grouped on the minimal number of nodes
Scenario 3: VMs are grouped on the minimal number of nodes and the others are powered off
Focus on Scenarios 1 & 2 to understand the best options when nodes are kept on
Scenario 3 is not used in practice by telcos
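The three scenarios can be illustrated with a toy placement sketch. It is not how nova or FusionSphere schedules VMs. The function names, node names, and the fixed `slots_per_node` capacity are all assumptions made for the example.

```python
def spread(vms, nodes):
    """Scenario 1: distribute VMs round-robin across all nodes."""
    placement = {node: [] for node in nodes}
    for i, vm in enumerate(vms):
        placement[nodes[i % len(nodes)]].append(vm)
    return placement

def pack(vms, nodes, slots_per_node):
    """Scenario 2: fill each node before using the next one."""
    if len(vms) > len(nodes) * slots_per_node:
        raise ValueError("not enough capacity")
    placement = {node: [] for node in nodes}
    for i, vm in enumerate(vms):
        placement[nodes[i // slots_per_node]].append(vm)
    return placement

def empty_nodes(placement):
    """Nodes left without VMs; Scenario 3 would power these off."""
    return [node for node, hosted in placement.items() if not hosted]

vms = [f"vm{i}" for i in range(8)]
nodes = ["node-a", "node-b"]
print(spread(vms, nodes))
print(pack(vms, nodes, slots_per_node=8))
print(empty_nodes(pack(vms, nodes, slots_per_node=8)))
```

With 8 VMs and two 8-slot nodes, spreading puts 4 VMs on each node, while packing leaves the second node empty and therefore a candidate for power-off.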
Experiment 3: Workload energy consumption
Determine the energy-performance relation
• Fully occupy 1 node: 8 VMs
• Balance the load between 2 nodes: 4 VMs per node
• Use a synthetic benchmark with a fixed computation workload
  - Compute the first N digits of pi in each VM:
    echo "scale=15000; 4*a(1)" | time bc -l
• Compare the energy consumption of the 2 placement strategies and the performance (timespan) to execute the workload
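For a fixed workload, the quantity to compare is energy (average power multiplied by run time), not average power alone. The sketch below shows the calculation. The power and duration figures are invented purely to illustrate the arithmetic and are not measurements from the talk.

```python
def energy_wh(avg_power_w, duration_s):
    """Energy in watt-hours for a run at a given average power."""
    return avg_power_w * duration_s / 3600.0

# Hypothetical measurements, NOT taken from the talk.
placements = {
    "8 VMs on 1 node (second node idle)": {"avg_power_w": 520.0, "duration_s": 900.0},
    "4 VMs on each of 2 nodes": {"avg_power_w": 560.0, "duration_s": 720.0},
}

for name, m in placements.items():
    print(f"{name}: {energy_wh(**m):.1f} Wh in {m['duration_s']:.0f} s")
```

In this made-up case the placement with the higher average power still uses less energy because it finishes sooner, which is why the experiment records the timespan alongside the power readings.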
Measurements (3)
Discussion
• Measurements are hard to get right
  - Good sensors and a well-controlled environment are necessary
  - Constant load vs. workload: how to account for idle machines? Can they be assumed to do something useful?
• If switching off hosts is an option, cluster VMs and do it!
  - nova support, orchestrator?
• Distributing VMs can reduce the energy consumption per workload!
  - Good for performance as well: avoids resource sharing and Turbo de-boost
  - This can be understood from the non-linear power curve of CPUs (P ~ U²)
• If there's nothing useful to be done afterwards, grouping VMs is good for energy consumption due to high idle power (but better on newer CPUs).
• Related VMs may want to be grouped or ungrouped (affinity/anti-affinity)
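The trade-off between grouping and distributing can be made concrete with a toy power model. The sketch assumes each node draws P(u) = p_idle + k * u² at utilization u, which is one reading of the P ~ U² remark with an idle offset added. The parameter values are hypothetical, and performance effects such as contention are ignored.

```python
def node_power(u, p_idle, k):
    """Assumed node power model: P(u) = p_idle + k * u**2, with 0 <= u <= 1."""
    return p_idle + k * u ** 2

def cluster_power(utilizations, p_idle, k, power_off_idle=False):
    """Total power of the nodes; idle nodes draw nothing if powered off."""
    total = 0.0
    for u in utilizations:
        if u == 0 and power_off_idle:
            continue
        total += node_power(u, p_idle, k)
    return total

P_IDLE, K = 100.0, 150.0   # hypothetical parameters in watts

grouped = [1.0, 0.0]       # all load on one node, the other idle
distributed = [0.5, 0.5]   # same total load split evenly

print(cluster_power(grouped, P_IDLE, K))                       # idle node kept on
print(cluster_power(distributed, P_IDLE, K))                   # both nodes half loaded
print(cluster_power(grouped, P_IDLE, K, power_off_idle=True))  # idle node switched off
```

Under these assumptions, distributing beats grouping while both nodes stay powered, and grouping wins once the emptied node can be switched off. That ordering matches the bullets above, but it depends entirely on the assumed ratio of idle power to the quadratic term.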
Towards energy aware scheduling
• A simple model (3 parameters to describe a quadratic curve) would help a lot
  - Ideally use sensors if available
  - Ideally understand hardware details (e.g. AVX downclock on Haswell-EP/EX)
  - Ideally understand workloads (communication b/w instances -> affinity)
• Enables various policies to be implemented
  - Minimal energy consumption vs. balanced vs. maximum performance
  - Thermal management: avoid hot spots
• Advanced ideas (thanks, Adam! http://blog.adamspiers.org/2015/05/17/cloud-rearrangement/)
  - Do (live) migrations to achieve a better cloud state?
  - Advanced optimizations for e.g. page sharing (KSM)
  - Scalability: hierarchical scheduler?
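A three-parameter quadratic model of the kind mentioned in the first bullet could be obtained from as few as three power readings per node type. The sketch below fits P(u) = a + b*u + c*u² exactly through three (utilization, watts) points. The function names and sample readings are assumptions, and a real scheduler would more likely fit many sensor samples by least squares.

```python
def fit_quadratic(points):
    """Fit P(u) = a + b*u + c*u**2 exactly through three (u, watts) points."""
    (u0, p0), (u1, p1), (u2, p2) = points
    # Newton divided differences give the quadratic coefficient directly.
    d01 = (p1 - p0) / (u1 - u0)
    d12 = (p2 - p1) / (u2 - u1)
    c = (d12 - d01) / (u2 - u0)
    b = d01 - c * (u0 + u1)
    a = p0 - b * u0 - c * u0 ** 2
    return a, b, c

def predict(params, u):
    """Predicted node power in watts at utilization u."""
    a, b, c = params
    return a + b * u + c * u ** 2

# Hypothetical readings at idle, half load, and full load.
measured = [(0.0, 100.0), (0.5, 137.5), (1.0, 250.0)]
params = fit_quadratic(measured)
print(params)
print(predict(params, 0.75))
```

A scheduler policy (minimal energy, balanced, or maximum performance) could then score candidate hosts by the predicted power change of adding a VM, though how that is weighted is a design decision the slides leave open.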
Conclusions and Future
• Observations:
  - Significant room for improvement in cluster energy management
  - Resource and compute pattern awareness are key milestones towards decreasing energy consumption
We're looking for help:
• Discussions with the scheduler community
• Huawei is looking for cloud engineers in Europe (Munich) and elsewhere
• Looking for other companies to work on this with us

Energy efficient VM placement - OpenStack Summit Vancouver May 2015

  • 1. HUAWEI TECHNOLOGIES CO., LTD. www.huawei.com Energy Efficient VM Placement Ulrich Kleber <ulrich.kleber@huawei.com> Kurt Garloff <huawei@garloff.de> Radu Tudoran <radu.tudoran@huawei.com> OpenStack Summit Vancouver 2015
  • 2. The Energy Ceiling
      - Over 500 YouTube videos have at least 100,000,000 views each
      - If you want to help kill the planet: https://www.youtube.com/watch?v=9bZkp7q19f0
      - PSY's "Gangnam Style" consumed >300 GWh:
          - more than some countries in a year
          - over 35 MW of 24/7/365 diesel, 100M liters of oil
          - 80,000 cars running for a year
      - Source: Ian Bitterlin and Jon Summers, UoL, UK, Jul 2013; Alexandru Iosup, Delft University, The Netherlands, Jan 2015
  • 3. Motivating questions
      - How much energy is wasted by idle resources?
      - How much energy can be saved by re-scheduling the execution of VMs?
      - What is the relation between energy consumption and load?
      - How should VMs be rescheduled to save energy?
  • 4. Roadmap
      - Evaluate overall cluster energy consumption
      - Zoom in on the node energy consumption
      - Evaluate the node performance-energy ratio
      - Energy comparison of VM scheduling strategies
  • 5. The hardware setup
      - E9000 blades:
          - CH222: 2x Xeon E5-2680 (8-core SB), 256 GB RAM, 15x 900 GB SAS disks, 800 GB SSD (cache), 2x 10GigE
          - CH121: 2x Xeon E5-2680 (8-core SB), 128 GB RAM, 2x 900 GB SAS disks, 2x 10GigE
          - Overall: 40 CPUs, 240 cores, 3.5 TB RAM
      - UDS: 3x A-Node + 2x UDSN, 150 disks of 4 TB each, 600 TB raw in total
      - Block storage: FusionStorage/DSware (distributed replicated storage on the CH222s)
      - Diagram labels: CE12804, CE12804; UDS sub-system (3x A-Node, 2x UDSN); FusionSphere system with two E9000 blade chassis (4x CH222 + 8x CH121; 4x CH222 + 4x CH121)
  • 6. Methodology
      - The E9000 has BMC capabilities that allow real-time power consumption to be measured.
      - Power can be read from the web interface at both chassis (HMM) and blade (iMana) level (also for PSUs).
      - It can also be accessed via the command line of the embedded ARM/MIPS Linux system:
          smmget -l shelf -d realtimepower
          ipmcget -t sensor -d list
      - Power and consumed energy are measured at both node and cluster level.
  • 7. Experiment 1: Methodology
      - 4 vCPUs and 8 GB memory per VM
      - 2 clusters with a FusionManager and OpenStack Havana (FS5)
      - Some nodes reserved (idling/switched off)
      - Warm data center (~35°C)
      - Induce load and measure the energy consumption:
          - using the Linux stress tool
          - using a synthetic benchmark
      - 5-10 samples collected ~1 minute apart and averaged
          - measurements are taken after the cluster's energy consumption has stabilized (~1 minute after the operation is started)
      - Diagram: OpenStack-based software stack (Hypervisor, Virtual hardware, OS, Application)
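The sampling procedure on this slide (a handful of power readings about a minute apart, averaged) can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function names `mean_power_w` and `energy_wh` and all readings are hypothetical, and in the experiment the values would come from the chassis or blade power sensors.

```python
from statistics import mean


def mean_power_w(samples_w):
    """Average a handful of power readings, all given in watts."""
    if not samples_w:
        raise ValueError("need at least one power sample")
    return mean(samples_w)


def energy_wh(avg_power_w, duration_s):
    """Energy in watt-hours, assuming the average power holds for duration_s seconds."""
    return avg_power_w * duration_s / 3600.0


# Hypothetical readings taken one minute apart (not measured data).
samples = [410.0, 405.0, 415.0, 400.0, 420.0]
avg = mean_power_w(samples)
print(f"mean power: {avg:.1f} W")
print(f"energy over 10 minutes: {energy_wh(avg, 600):.2f} Wh")
```

Averaging only after the cluster has stabilized, as the slide describes, keeps start-up transients out of the mean.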
  • 8. Experiment 1: Cluster energy consumption
      - Goal: determine cluster energy consumption based on load
      - Scale the cluster occupancy:
          - in steps of 10 VMs, each step ~9% of the compute capacity
      - Use the stress tool to induce constant load in the VMs:
          - CPU consumption: 3 threads spinning over sqrt
          - Memory consumption: 3 threads spinning over alloc/dealloc
      - Compare with the idle cluster as baseline, when:
          - VMs hibernate
          - VMs run but are idle
      - Diagram: two hypervisors, each hosting VMs (Virtual hardware, OS)
  • 9. Measurements (1): 60% difference between working and idle cluster
  • 10. Experiment 2: Node energy consumption
      - Goal: determine the node energy consumption based on load
      - Fully occupy a node: 8 VMs to occupy the 32 CPU threads
      - Fully use the VM compute power: 6 threads per VM (4 vCPUs)
      - Use the stress tool to induce different loads in the VMs:
          - CPU load: spinning over sqrt
          - Memory load: spinning over alloc/dealloc
          - IO load: spinning over sync
          - HDD load: spinning over write/unlink
      - Compare with the idle node and the powered-off node
  • 11. Measurements (2)
      - Saving ~100 W per switched-off idle node
      - Hard disk load causes the storage cluster to consume power
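To put the ~100 W figure into perspective, the arithmetic below converts a constant power saving into energy per year. It is only a sketch: the function name `yearly_saving_kwh` and the node counts are hypothetical, and it assumes the nodes stay switched off for the entire year.

```python
HOURS_PER_YEAR = 24 * 365  # non-leap year


def yearly_saving_kwh(nodes_off, saving_per_node_w=100.0):
    """Energy saved per year (kWh) if nodes_off idle nodes stay powered off.

    saving_per_node_w defaults to the ~100 W per node quoted on the slide.
    """
    return nodes_off * saving_per_node_w * HOURS_PER_YEAR / 1000.0


# Hypothetical node counts, for illustration only.
for n in (1, 4, 10):
    print(f"{n} node(s) off all year: {yearly_saving_kwh(n):.0f} kWh")
```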
  • 12. Preliminary conclusions
      - CPU + memory intensive patterns seem to be the most energy consuming per node
      - External storage increases total energy consumption
      - Significant energy difference per node between powered-off and idle states
          - Significant energy savings for mostly idle clusters (50+%)
          - Reschedule VMs to empty some nodes?
              - But how does energy relate to performance?
              - Does lower average power consumption mean lower energy for a fixed workload?
      - Open question: reschedule in order to empty nodes or to distribute the load?
  • 13. 3 possible scheduling strategies
      - Scenario 1: VMs are running across multiple nodes
      - Scenario 2: VMs are grouped on the minimal number of nodes
      - Scenario 3: VMs are grouped on the minimal number of nodes and the others are powered off
      - Focus on Scenarios 1 and 2 to understand the best options when nodes are kept on
      - Scenario 3 is not used in practice by telcos
      - Diagrams: one per scenario, showing VMs (Application, OS, Virtual hardware) placed on hypervisors
  • 14. Experiment 3: Workload energy consumption
      - Goal: determine the energy-performance relation
      - Fully occupy 1 node: 8 VMs
      - Balance the load between 2 nodes: 4 VMs per node
      - Use a synthetic benchmark with a fixed computation workload:
          - compute the first N digits of pi in each VM:
              echo "scale=15000; 4*a(1)" | time bc -l
      - Compare the energy consumption of the 2 placement strategies and the performance (time span) to execute the workload
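For a fixed workload, the quantity to compare is energy (average power times run time), not power alone. The sketch below shows the bookkeeping for the two placements. All numbers are invented for illustration and are not the measured results of this experiment; the function name `workload_energy_wh` is hypothetical, and the idle second node is counted in the packed case because both nodes are kept on.

```python
def workload_energy_wh(node_powers_w, runtime_s):
    """Energy (Wh) for one run: sum of per-node average power times the run time."""
    return sum(node_powers_w) * runtime_s / 3600.0


# Invented example values, NOT measurements from the talk.
# Packed: 8 VMs on one busy node, second node idle but still powered on.
packed = workload_energy_wh([300.0, 150.0], runtime_s=1200)
# Spread: 4 VMs on each of two nodes; assumed to finish sooner.
spread = workload_energy_wh([230.0, 230.0], runtime_s=900)

print(f"packed: {packed:.1f} Wh, spread: {spread:.1f} Wh")
```

With these made-up inputs the spread placement draws slightly more power but uses less energy because it finishes earlier, which is exactly the distinction the experiment sets out to measure.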
  • 15. Measurements (3)
  • 16. Discussion
      - Measurements are hard to get right
          - Good sensors and a well-controlled environment are necessary
          - Constant load vs. workload: how to account for idle machines? Can they be assumed to do something useful?
      - If switching off hosts is an option, cluster VMs and do it!
          - nova support, orchestrator?
      - Distributing VMs can reduce the energy consumption per workload!
          - Good for performance as well: avoids resource sharing and Turbo-DEboost
          - This can be understood from the non-linear power curve of CPUs (P ~ U²)
      - If there's nothing useful to be done afterwards, grouping VMs is good for energy consumption due to high idle power (but better on newer CPUs).
      - Related VMs may want to be grouped or ungrouped (affinity/anti-affinity)
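The effect of a non-linear power curve on placement can be illustrated with a toy model. This sketch assumes that U in P ~ U² is read as node utilization between 0 and 1 (the slide does not define it) and that node power is an idle term plus a quadratic term. The function name `node_power_w` and the constants `p_idle` and `c` are hypothetical, not fitted to the measurements.

```python
def node_power_w(u, p_idle=100.0, c=150.0):
    """Toy node power model: idle power plus a term quadratic in utilization u (0..1)."""
    return p_idle + c * u * u


# The same total load (one node's worth) placed in three ways on two nodes.
grouped_idle = node_power_w(1.0) + node_power_w(0.0)  # one full node, one idle node kept on
spread = 2 * node_power_w(0.5)                        # load split evenly over both nodes
grouped_off = node_power_w(1.0)                       # one full node, the other powered off

print(f"grouped, spare node idle: {grouped_idle:.1f} W")
print(f"spread over two nodes:    {spread:.1f} W")
print(f"grouped, spare node off:  {grouped_off:.1f} W")
```

Under these assumptions, spreading beats grouping while both nodes stay on, and grouping wins once the spare node can be switched off, which matches the two recommendations on the slide.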
  • 17. Towards energy aware scheduling
      - A simple model (3 parameters to describe a quadratic curve) would help a lot
          - Ideally uses sensors if available
          - Ideally understands hardware details (e.g. AVX downclock on Haswell-EP/EX)
          - Ideally understands workloads (communication between instances -> affinity)
      - Enables various policies to be implemented
          - Minimal energy consumption vs. balanced vs. maximum performance
          - Thermal management: avoid hot spots
      - Advanced ideas (thanks, Adam! http://blog.adamspiers.org/2015/05/17/cloud-rearrangement/)
          - Do (live) migrations to achieve a better cloud state?
          - Advanced optimizations, e.g. for page sharing (KSM)
          - Scalability: hierarchical scheduler?
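One way to obtain a three-parameter quadratic model is to fit it through three power readings taken at different load levels. The sketch below assumes the form P(u) = a + b·u + c·u² with u as node utilization; the slide does not spell out the model, so this form, the function names `fit_quadratic` and `predict_power_w`, and the sample points are all hypothetical.

```python
def fit_quadratic(points):
    """Return (a, b, c) of P(u) = a + b*u + c*u**2 through three (u, power) points.

    Uses Newton divided differences; the three u values must be distinct.
    """
    (u0, p0), (u1, p1), (u2, p2) = points
    d01 = (p1 - p0) / (u1 - u0)
    d12 = (p2 - p1) / (u2 - u1)
    c = (d12 - d01) / (u2 - u0)
    b = d01 - c * (u0 + u1)
    a = p0 - b * u0 - c * u0 * u0
    return a, b, c


def predict_power_w(params, u):
    """Evaluate the fitted model at utilization u."""
    a, b, c = params
    return a + b * u + c * u * u


# Hypothetical readings: idle, half load, full load (not measured data).
params = fit_quadratic([(0.0, 100.0), (0.5, 162.5), (1.0, 300.0)])
print("a, b, c =", params)
print("predicted power at 75% load:", predict_power_w(params, 0.75), "W")
```

A scheduler could evaluate such a model per node to compare candidate placements; with more than three readings, a least-squares fit would be the natural extension.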
  • 18. Conclusions and Future
      - Observations:
          - Significant room for improvement in cluster energy management
          - Resource and compute pattern awareness are key milestones to decrease energy consumption
      - We're looking for help:
          - Discussions with the scheduler community
          - Huawei is looking for cloud engineers in Europe (Munich) and elsewhere
          - Looking for other companies to work on this with us