Path to Energy Efficient Scheduler
Linaro Connect Asia 2014, Macau
Morten Rasmussen
Motivation
• Energy-cost-driven task placement (load balancing)
  • Focus on the actual goal of the energy-aware scheduling activities: saving energy while achieving (near-)optimum performance.
  • The energy benefit of a scheduling decision is clear when it is made, assuming the energy cost estimates are fairly accurate.
• Introduce a simple energy model to estimate costs and guide scheduling decisions.
  • Requested by maintainers at the KS workshop.
  • Gives the right amount of packing and spreading.
  • May simplify balancing decision logic.
• Strong focus on saving energy in load-balancing algorithms.
  • big.LITTLE support comes naturally and almost for free.
• This is just one part of the energy efficiency work.
  • Several related sessions this week.
Energy Load Balancing
• The idea (a bit simplified):
  • Let the resulting energy consumption guide all balancing decisions:

    if (energy_diff(task, src_cpu, dst_cpu) > 0) {
        move_task(task, src_cpu, dst_cpu);
    } else {
        /* Try some other task */
    }

• Ideally, we would get the optimum balance if we tried all combinations of tasks and cpus.
• In reality it is not that simple. We can't try all combinations, but we can get fairly close for most scenarios.
• If the energy model is accurate enough, we get packing and spreading implicitly, and only when it saves energy.
• Should work for any system: SMP and big.LITTLE (with a few extensions).
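The balancing idea above can be sketched as a small greedy loop. This is only an illustration under stated assumptions: `ea_balance`, `make_energy_diff` and `toy_energy` are hypothetical names, and the toy cost function (a fixed cost per busy cpu plus a cost that grows with the square of its load) is made up to show packing and spreading falling out of the same rule. It is not the energy model from these slides.

```python
def toy_energy(placement, loads):
    """Made-up system energy: 0.5 per busy cpu plus load squared per cpu."""
    per_cpu = {}
    for task, cpu in placement.items():
        per_cpu[cpu] = per_cpu.get(cpu, 0.0) + loads[task]
    return sum(0.5 + load ** 2 for load in per_cpu.values())


def make_energy_diff(loads):
    """Build an energy_diff() callback on top of the toy energy function."""
    def energy_diff(task, src, dst, placement):
        # src is implied by placement[task]; kept to mirror the slide's signature.
        moved = dict(placement)
        moved[task] = dst
        # Positive result: the move from src to dst saves energy.
        return toy_energy(placement, loads) - toy_energy(moved, loads)
    return energy_diff


def ea_balance(placement, cpus, energy_diff, max_passes=10):
    """Move tasks while a move saves energy; stop when nothing moves."""
    placement = dict(placement)
    for _ in range(max_passes):  # bounded, in case the estimates oscillate
        moved = False
        for task in sorted(placement):
            src = placement[task]
            for dst in cpus:
                if dst != src and energy_diff(task, src, dst, placement) > 0:
                    placement[task] = dst
                    moved = True
                    break
            # else: try some other task
        if not moved:
            break
    return placement


light = {"a": 0.2, "b": 0.1}
heavy = {"a": 0.9, "b": 0.8}
print(ea_balance({"a": 0, "b": 1}, [0, 1], make_energy_diff(light)))
print(ea_balance({"a": 0, "b": 1}, [0, 1], make_energy_diff(heavy)))
```

With the light tasks the loop packs both onto one cpu; with the heavy tasks it leaves them spread, because packing would cost more under the toy function.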
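Power and Energy
• Goal: Save energy, not power.

[Figure: power versus time, with the energy marked]

$$e_{cpu} = P \cdot t, \qquad t = \frac{inst}{cc}$$

where cc is the compute capacity (~ freq * uarch).

$$e_{cpu} = P(cc) \cdot \frac{inst}{cc}$$

P(cc)/cc = Energy/inst: This is what we try to minimize.

$$e_{cpu} = P(cc) \left( \frac{inst_{task}}{cc} + \frac{inst_{idle}}{cc} \right)$$

$$e_{cpu} = e_{task} + e_{idle}$$

If we have cpuidle support we get:

$$e_{cpu} = P_{busy}(cc) \, \frac{inst_{task}}{cc} + P_{idle} \, \frac{inst_{idle}}{cc}$$

where inst_task/cc ~ utilization.

We have to add an additional leakage energy term to reflect that it is better not to wake cpus unnecessarily.

[Figure: tracked load versus time; labels: time in runnable state (~ utilization*), work]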
Simple Energy Model
• cpu_energy = power(cc) * util/cc
             + idle_power * (1 - util/cc)
             + leakage_energy
• cluster_energy = c_active_power * c_util
                 + c_idle_power * (1 - c_util)
• Terms:
  • util = Scale-invariant cpu utilization (tracked load).
  • cc = Current compute capacity (depends on freq and uarch).
  • power(cc) = Busy power (fully loaded) at the current capacity, from a table.
  • idle_power = Idle power consumption (~WFI).
  • leakage_energy = Constant representing the cost of waking the cpu.
  • c_util = Cluster utilization. Depends on the max(util/cc) ratio of its cpus.
  • c_active_power = Cluster active power.
  • c_idle_power = Cluster idle power.
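The two formulas translate almost directly into code. The sketch below is a minimal illustration, not the author's implementation. One assumption is labeled in the code: the leakage term is only charged when the cpu has something to run, which is how the idle cpu in the later example slide ends up at 0.1.

```python
def cpu_energy(util, cc, busy_power, idle_power, leakage_energy):
    """Energy estimate for one cpu following the slide's cpu_energy formula.

    util: scale-invariant utilization, cc: current compute capacity,
    busy_power: power(cc) from the P-state table.
    """
    busy_fraction = util / cc
    energy = busy_power * busy_fraction + idle_power * (1.0 - busy_fraction)
    # Assumption: an idle cpu is never woken, so it pays no wake-up cost.
    if util > 0:
        energy += leakage_energy
    return energy


def cluster_energy(c_util, c_active_power, c_idle_power):
    """Energy estimate for the cluster-level (shared) power."""
    return c_active_power * c_util + c_idle_power * (1.0 - c_util)


# A little cpu at capacity 0.2 (power 0.4) running a 0.1 task, with the
# made-up idle and leakage values from the example tables.
print(round(cpu_energy(0.1, 0.2, 0.4, 0.1, 0.1), 3))
# The same cpu with nothing to run.
print(round(cpu_energy(0.0, 0.2, 0.4, 0.1, 0.1), 3))
# A little cluster whose busiest cpu is 75% utilized.
print(round(cluster_energy(0.75, 2.4, 0.0), 3))
```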
Compute Capacity and Power
• Processor-specific table expressing power and compute capacity at each P-state.
• The sched domain hierarchy is in a good position to hold this type of information.
• Example (entirely made up):

Little                Big
Capacity  Power       Capacity  Power
0.2       0.4         0.4       1.6
0.4       0.9         0.8       4.4
0.6       1.5         1.2       9.0
0.8       2.2         1.6       15.0
1.0       3.2         2.0       23.0
idle      0.1         idle      0.3
leakage   0.1         leakage   0.5

Equal compute capacity: the capacity values 0.4 and 0.8 appear in both tables.

Cluster   Little  Big
active    2.4     6.0
idle      0.0     0.0
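As a rough illustration, the made-up tables can be held as plain data and queried. The names `find_pstate` and `energy_per_work` are hypothetical, and the lookup rule (pick the lowest P-state whose capacity covers the utilization) is an assumption that happens to agree with the example slide. Dividing power by capacity gives the energy per unit of work at each P-state, which is the quantity the model tries to minimize.

```python
# (capacity, busy power) per P-state, copied from the made-up example tables.
LITTLE = [(0.2, 0.4), (0.4, 0.9), (0.6, 1.5), (0.8, 2.2), (1.0, 3.2)]
BIG = [(0.4, 1.6), (0.8, 4.4), (1.2, 9.0), (1.6, 15.0), (2.0, 23.0)]


def find_pstate(table, util):
    """Return the lowest (capacity, power) entry that can serve util.

    Assumption: if util exceeds every capacity, fall back to the top P-state.
    """
    for capacity, power in table:
        if util <= capacity:
            return capacity, power
    return table[-1]


def energy_per_work(table):
    """Energy per unit of work (power / capacity) for each P-state."""
    return [(capacity, power / capacity) for capacity, power in table]


print(find_pstate(LITTLE, 0.3))
for name, table in (("little", LITTLE), ("big", BIG)):
    for capacity, cost in energy_per_work(table):
        print(f"{name:6s} capacity {capacity:.1f}: {cost:.3f} per unit of work")
```

In these numbers every P-state gets less efficient as capacity rises, and at the equal capacity 0.4 the little cpu costs 2.25 per unit of work against 4.0 on the big cpu.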
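energy_diff()
• Balancing two cpus:

    def energy_diff(tload, scpu, dcpu):
        # Estimated energy saved by moving a task with load tload from
        # scpu to dcpu (positive: the move saves energy).
        # find_cpu_cc(), cpu_util(), cpu_cc_power(), nr_running() and the
        # cpu_leakage_energy/cpu_type tables come from the energy model.

        # Estimate the next compute capacity (P-state) of the source cpu
        s_new_cc = find_cpu_cc(scpu, cpu_util(scpu))
        # Energy model cost for the task on the source cpu
        s_task_energy = tload / s_new_cc * cpu_cc_power(scpu, s_new_cc)
        # Moving the only task lets the source cpu stay idle
        if nr_running(scpu) == 1:
            s_task_energy += cpu_leakage_energy[cpu_type[scpu]]
        # Estimate the destination cpu cc after adding the task
        d_new_cc = find_cpu_cc(dcpu, cpu_util(dcpu) + tload)
        # Energy model cost for the task on the destination cpu
        d_task_energy = tload / d_new_cc * cpu_cc_power(dcpu, d_new_cc)
        # An idle destination cpu has to be woken up
        if nr_running(dcpu) == 0:
            d_task_energy += cpu_leakage_energy[cpu_type[dcpu]]
        return s_task_energy - d_task_energy

• Balancing sched domains is slightly more complicated as it involves cluster power as well.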
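A self-contained version of the two-cpu case might look like the following. It is a sketch under stated assumptions: all cpus are little cpus using the made-up table from the previous slide, run queues are passed in as a hypothetical dict of task loads, and the P-state lookup picks the lowest capacity that covers the utilization.

```python
# Made-up little-cpu data: capacity -> busy power, plus the leakage constant.
LITTLE_POWER = {0.2: 0.4, 0.4: 0.9, 0.6: 1.5, 0.8: 2.2, 1.0: 3.2}
LITTLE_LEAKAGE = 0.1


def find_cpu_cc(util):
    """Lowest capacity that covers util (assumed rule); top one if none does."""
    capacities = sorted(LITTLE_POWER)
    return next((cc for cc in capacities if cc >= util), capacities[-1])


def energy_diff(tload, scpu, dcpu, runqueues):
    """Estimated energy saved by moving a task of load tload from scpu to dcpu."""
    s_util = sum(runqueues[scpu])
    d_util = sum(runqueues[dcpu])

    # Cost of the task where it is now.
    s_new_cc = find_cpu_cc(s_util)
    s_task_energy = tload / s_new_cc * LITTLE_POWER[s_new_cc]
    if len(runqueues[scpu]) == 1:
        # The task is alone: moving it away avoids waking the source cpu.
        s_task_energy += LITTLE_LEAKAGE

    # Cost of the task on the destination, at the capacity it would need.
    d_new_cc = find_cpu_cc(d_util + tload)
    d_task_energy = tload / d_new_cc * LITTLE_POWER[d_new_cc]
    if len(runqueues[dcpu]) == 0:
        # The destination is idle and would have to be woken.
        d_task_energy += LITTLE_LEAKAGE

    return s_task_energy - d_task_energy


# Hypothetical run queues: cpu -> list of task loads.
runqueues = {0: [0.2], 1: [0.1], 2: []}
print(round(energy_diff(0.1, 1, 0, runqueues), 3))  # pack onto cpu 0
print(round(energy_diff(0.1, 1, 2, runqueues), 3))  # move to idle cpu 2
```

Moving the 0.1 task onto the already busy cpu 0 yields a positive estimate, while moving it to the idle cpu 2 yields no saving, so only the first move would be taken.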
Example
Before EA load balance:

cpu      rq          util  cap  cc_power  leak  power
0        {0.2}       0.2   0.2  0.4       0.1   0.5
1        {0.1}       0.1   0.2  0.4       0.1   0.35
2        {}          0.0   0.2  0.4       0.1   0.1
cluster  -           1.0   -    2.4       -     2.4
Total                                           3.35

energy_diff() = 0.075* (moving the 0.1 task from cpu 1 to cpu 0)

After EA load balance:

cpu      rq          util  cap  cc_power  leak  power
0        {0.2, 0.1}  0.3   0.4  0.9       0.1   0.8
1        {}          0.0   0.4  0.9       0.1   0.1
2        {}          0.0   0.4  0.9       0.1   0.1
cluster  -           0.75  -    2.4       -     1.8
Total                                           2.8

0.55 saved

* energy_diff() ignores cluster power and other tasks to keep computations cheap and simple. Better accuracy can be added if necessary.
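The totals in the two tables can be re-derived from the simple energy model. The sketch below is an illustration with two assumptions read off the tables rather than stated in the slides: all cpus in the cluster share one capacity, chosen as the lowest P-state that covers the busiest cpu, and an idle cpu pays idle power but no leakage. `system_energy` and `shared_capacity` are hypothetical names.

```python
# Made-up little-cluster data from the example tables.
LITTLE_POWER = {0.2: 0.4, 0.4: 0.9, 0.6: 1.5, 0.8: 2.2, 1.0: 3.2}
IDLE_POWER = 0.1
LEAKAGE = 0.1
CLUSTER_ACTIVE = 2.4
CLUSTER_IDLE = 0.0


def shared_capacity(utils):
    """Assumed rule: the lowest capacity that covers the busiest cpu."""
    peak = max(utils)
    capacities = sorted(LITTLE_POWER)
    return next((cc for cc in capacities if cc >= peak), capacities[-1])


def system_energy(runqueues):
    """Sum of per-cpu energy and cluster energy for one little cluster."""
    utils = [sum(rq) for rq in runqueues]
    cc = shared_capacity(utils)
    total = 0.0
    for util in utils:
        busy = util / cc
        total += LITTLE_POWER[cc] * busy + IDLE_POWER * (1.0 - busy)
        if util > 0:
            total += LEAKAGE  # only cpus that actually run pay the wake cost
    c_util = max(utils) / cc  # cluster utilization follows its busiest cpu
    total += CLUSTER_ACTIVE * c_util + CLUSTER_IDLE * (1.0 - c_util)
    return total


before = system_energy([[0.2], [0.1], []])
after = system_energy([[0.2, 0.1], [], []])
print(round(before, 2), round(after, 2), round(before - after, 2))
```

The full-system saving (0.55) is much larger than the energy_diff() estimate (0.075) because most of it comes from the cluster term, which energy_diff() deliberately ignores.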
Is the energy model too simple?
• It is essential that the energy model is fast and easy to use for load balancing.
  • The scheduler is a critical path and already complex enough.
• Python model tests
  • Disclaimer: These numbers have not been validated in any way.
  • Test configuration: 3+3 big.LITTLE, 1000 random balance scenarios.
  • Rand/Opt: How much worse the random balance energy (starting point) is than the best possible balance energy (brute force).
  • EA/Opt: How much worse the energy-model-based balance energy is than the best possible balance energy.
  • EA == Opt: Share of scenarios where EA found the best possible balance.

Tasks  Rand/Opt  EA/Opt  EA == Opt
2      7.86%     0.09%   72.60%
3      7.79%     0.15%   64.80%
4      9.39%     0.45%   62.00%
5      10.02%    1.15%   51.10%
6      11.44%    2.23%   38.30%
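The Python model used for these tests is not shown in the slides. As a rough idea of how a Rand/Opt figure could be produced, the sketch below brute-forces every placement on a 3+3 system built from the made-up tables and compares a random placement against the optimum. Everything here is an assumption: the function names, the shared per-cluster capacity, the task load range, and the rule that placements exceeding a cluster's top capacity are infeasible. It will not reproduce the numbers in the table.

```python
import itertools
import math
import random

CLUSTERS = {
    "little": {"pstates": [(0.2, 0.4), (0.4, 0.9), (0.6, 1.5), (0.8, 2.2), (1.0, 3.2)],
               "idle": 0.1, "leakage": 0.1, "c_active": 2.4, "c_idle": 0.0},
    "big": {"pstates": [(0.4, 1.6), (0.8, 4.4), (1.2, 9.0), (1.6, 15.0), (2.0, 23.0)],
            "idle": 0.3, "leakage": 0.5, "c_active": 6.0, "c_idle": 0.0},
}
CPU_TYPE = ["little"] * 3 + ["big"] * 3  # 3+3 big.LITTLE


def cluster_energy(model, utils):
    """Energy of one cluster; all its cpus share the lowest fitting P-state."""
    peak = max(utils)
    pstate = next(((cc, p) for cc, p in model["pstates"] if cc >= peak), None)
    if pstate is None:
        return math.inf  # over capacity: treated as infeasible
    cc, power = pstate
    total = 0.0
    for util in utils:
        busy = util / cc
        total += power * busy + model["idle"] * (1.0 - busy)
        if util > 0:
            total += model["leakage"]
    c_util = peak / cc
    return total + model["c_active"] * c_util + model["c_idle"] * (1.0 - c_util)


def system_energy(placement, loads):
    """placement[i] is the cpu index that task i (with loads[i]) runs on."""
    utils = [0.0] * len(CPU_TYPE)
    for load, cpu in zip(loads, placement):
        utils[cpu] += load
    return sum(
        cluster_energy(model, [u for u, kind in zip(utils, CPU_TYPE) if kind == name])
        for name, model in CLUSTERS.items()
    )


def best_placement(loads):
    """Brute force: try every assignment of tasks to cpus."""
    candidates = itertools.product(range(len(CPU_TYPE)), repeat=len(loads))
    return min(candidates, key=lambda p: system_energy(p, loads))


def rand_over_opt(n_tasks, scenarios, seed=0):
    """Average relative energy overhead of a random placement over the optimum."""
    rng = random.Random(seed)
    overheads = []
    for _ in range(scenarios):
        # Loads kept small so up to three tasks always fit on one cpu.
        loads = [rng.uniform(0.05, 0.3) for _ in range(n_tasks)]
        rand = [rng.randrange(len(CPU_TYPE)) for _ in range(n_tasks)]
        e_opt = system_energy(best_placement(loads), loads)
        overheads.append(system_energy(rand, loads) / e_opt - 1.0)
    return sum(overheads) / len(overheads)


print(f"Rand/Opt for 2 tasks: {rand_over_opt(2, 50):.2%}")
```

An EA/Opt column would be computed the same way, with the random placement replaced by the result of running the energy_diff()-driven balancer from that starting point.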
What is next?
• Early prototype to validate the idea. Initial focus: getting energy_diff() working on a simple SMP system.
  • Post on LKML very soon.
• Open issues:
  • Exposing power/capacity tables to the kernel. Essential to make the right decisions.
  • Plumbing: Where do the tables come from? DT?
• Next steps:
  • Scale invariance: Requirement for the energy model to work.
  • Fix cpu_power/compute capacity use in the scheduler.
  • Tooling and benchmarks (covered in another session)
  • Idle integration (covered in another session)
Questions?

LCA14: LCA14-109: Path to Energy Efficient Scheduler
