SlideShare ist ein Scribd-Unternehmen logo
1 von 56
Downloaden Sie, um offline zu lesen
Introduction ARM Kernel PM Plumbing Conclusion
Idling ARMs in a Busy World: Linux Power
Management for ARM multi-cluster systems
L.Pieralisi
Linaro Connect Q2-2012
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Outline
1 Introduction
Power Management Fundamentals
2 ARM Kernel PM Plumbing
Introduction
ARM Common PM Code
ARM core code and CPU idle improvements
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Power Management Fundamentals
Outline
1 Introduction
Power Management Fundamentals
2 ARM Kernel PM Plumbing
Introduction
ARM Common PM Code
ARM core code and CPU idle improvements
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Power Management Fundamentals
SoC Technology and Power Consumption
Dynamic power, frequency scaling
Static (leakage) power, G (Generic) process, LP (Low Power)
process
Temperature variations
RAM retention
CPU/Cluster vs. IO devices
Need for more agressive and holistic power management
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Power Management Fundamentals
Power Managed SoC Example
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Power Management Fundamentals
Kernel Power Management Mechanics
System suspend User space forces system to sleep
Auto sleep Kernel forces system to sleep if no wake-ups
CPU idle Idle threads trigger sleep states
CPU freq CPU Frequency scaling
Runtime PM Devices Power Management
CPU hotplug Remove a CPU from the running system
Focus on CPU/Cluster Power Management (PM)
Unified code in the kernel to support saving and restoring of CPU
and Cluster state
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Introduction
Outline
1 Introduction
Power Management Fundamentals
2 ARM Kernel PM Plumbing
Introduction
ARM Common PM Code
ARM core code and CPU idle improvements
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Introduction
The Need for a Common Back-end
S2RAM, CPU idle and other PM subsystems require state
save/restore
CPU architectural state (inclusive of VFP and PMU)
Peripheral state (GIC, CCI)
Cache management (clear/invalidate, pipelining)
Our Goal
Create a back-end that unifies the code and caters for all
requirements
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
Outline
1 Introduction
Power Management Fundamentals
2 ARM Kernel PM Plumbing
Introduction
ARM Common PM Code
ARM core code and CPU idle improvements
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM Common PM Code Components
CPU PM notifiers
Local timers save/restore
cpu suspend/resume
L2 suspend/resume
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU PM notifiers (1/3)
Introduced by C.Cross to overcome code duplication in idle
and suspend code path
CPU events and CLUSTER events
GIC, VFP, PMU
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU PM notifiers (2/3)
static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls)
{
int ret;
ret = __raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL,
nr_to_call, nr_calls);
return notifier_to_errno(ret);
}
int cpu_pm_enter(void)
{
[...]
ret = cpu_pm_notify(CPU_PM_ENTER, -1, &nr_calls);
if (ret)
cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
[...]
return ret;
}
//CPU shutdown
cpu_pm_{enter,exit}();
//Cluster shutdown
cpu_cluster_pm_{enter,exit}();
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU PM notifiers (3/3)
static int gic_notifier(struct notifier_block *self, unsigned long cmd, void *v)
{
int i;
[...]
switch (cmd) {
case CPU_PM_ENTER:
gic_cpu_save(i);
break;
case CPU_PM_ENTER_FAILED:
case CPU_PM_EXIT:
gic_cpu_restore(i);
break;
case CPU_CLUSTER_PM_ENTER:
gic_dist_save(i);
break;
case CPU_CLUSTER_PM_ENTER_FAILED:
case CPU_CLUSTER_PM_EXIT:
gic_dist_restore(i);
break;
}
return NOTIFY_OK;
}
static struct notifier_block gic_notifier_block = {
.notifier_call = gic_notifier,
};
v7 shutdown
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
Local timers save/restore
void enter_idle(...)
{
[...]
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
[...]
cpu_do_idle();
[...]
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
[...]
}
void enter_idle(...)
{
struct tick_device *tdev = tick_get_device(cpu);
[...]
cpu_do_idle();
[...]
/* Restore the per-cpu timer event */
clockevents_program_event(tdev->evtdev, tdev->evtdev->next_event, 1);
}
Enter broacadst mode if a global timer is available
Rely on always-on firmware timer and restore timer through clock
events programming API
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
This is the standard procedure that must be adopted by all platforms, for
cpu switching, cpu hotplug (cache cleaning and wfi), suspend and idle
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
ARM v7 SMP CPU Shutdown Procedure
1 save per CPU peripherals (IC, VFP, PMU)
2 save CPU registers
3 clean L1 D$
4 clean state from L2
5 disable L1 D$ allocation
6 clean L1 D$
7 exit coherency
8 call wfi (wait for interrupt)
This is the standard procedure that must be adopted by all platforms, for
cpu switching, cpu hotplug (cache cleaning and wfi), suspend and idle
idle
notifiers
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU suspend (1/3)
Introduced by R.King to consolidate existing (and duplicated)
code across diffent ARM platforms
save/restore core registers, clean L1 and some bits of L2
L2 RAM retention handling poses further challenges
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU suspend (2/3)
1:1 mapping page tables cloned from init_mm
C API, generic for all ARM architectures
int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
{
struct mm_struct *mm = current->active_mm;
int ret;
if (!suspend_pgd)
return -EINVAL;
[...]
ret = __cpu_suspend(arg, fn);
if (ret == 0) {
cpu_switch_mm(mm->pgd, mm);
local_flush_tlb_all();
}
return ret;
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU suspend (3/3)
registers saved on the stack
void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
{
*save_ptr = virt_to_phys(ptr);
/* This must correspond to the LDM in cpu_resume() assembly */
*ptr++ = virt_to_phys(suspend_pgd);
*ptr++ = sp;
*ptr++ = virt_to_phys(cpu_do_resume);
cpu_do_suspend(ptr);
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU suspend (3/3)
registers saved on the stack
L1 complete cleaning
void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
{
*save_ptr = virt_to_phys(ptr);
/* This must correspond to the LDM in cpu_resume() assembly */
*ptr++ = virt_to_phys(suspend_pgd);
*ptr++ = sp;
*ptr++ = virt_to_phys(cpu_do_resume);
cpu_do_suspend(ptr);
flush_cache_all();
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
CPU suspend (3/3)
registers saved on the stack
L1 complete cleaning
L2 partial cleaning
void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
{
*save_ptr = virt_to_phys(ptr);
/* This must correspond to the LDM in cpu_resume() assembly */
*ptr++ = virt_to_phys(suspend_pgd);
*ptr++ = sp;
*ptr++ = virt_to_phys(cpu_do_resume);
cpu_do_suspend(ptr);
flush_cache_all();
outer_clean_range(*save_ptr, *save_ptr + ptrsz);
outer_clean_range(virt_to_phys(save_ptr),
virt_to_phys(save_ptr) + sizeof(*save_ptr));
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
We Are Not Done, Yet: Cache-to-Cache migration
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
We Are Not Done, Yet: Cache-to-Cache migration
SCU keeps a copy of D$ cache TAG RAMs
To avoid data traffic ARM MPCore systems move dirty lines across cores
Lower L1 bus traffic
Dirty data might be fetched from another core during
power-down sequence
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
We Are Not Done, Yet: Cache-to-Cache migration
When the suspend finisher is called L1 is still allocating
accessing current implies accessing the sp
Snooping Direct Data Intervention (DDI), CPU might pull
dirty line in
ENTRY(disable_clean_inv_dcache_v7_all)
stmfd sp!, {r4-r5, r7, r9-r11, lr}
mrc p15, 0, r3, c1, c0, 0
bic r3, #4 @ clear C bit
mcr p15, 0, r3, c1, c0, 0
isb
bl v7_flush_dcache_all
mrc p15, 0, r0, c1, c0, 1
bic r0, r0, #0x40 @ exit SMP
mcr p15, 0, r0, c1, c0, 1
ldmfd sp!, {r4-r5, r7, r9-r11, pc}
ENDPROC(disable_clean_inv_dcache_v7_all)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
We Are Not Done, Yet: Cache-to-Cache migration
When the suspend finisher is called L1 is still allocating
accessing current implies accessing the sp
Snooping Direct Data Intervention (DDI), CPU might pull
dirty line in
ENTRY(disable_clean_inv_dcache_v7_all)
stmfd sp!, {r4-r5, r7, r9-r11, lr}
mrc p15, 0, r3, c1, c0, 0
bic r3, #4 @ clear C bit
mcr p15, 0, r3, c1, c0, 0
isb
bl v7_flush_dcache_all
mrc p15, 0, r0, c1, c0, 1
bic r0, r0, #0x40 @ exit SMP
mcr p15, 0, r0, c1, c0, 1
ldmfd sp!, {r4-r5, r7, r9-r11, pc}
ENDPROC(disable_clean_inv_dcache_v7_all)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
We Are Not Done, Yet: Cache-to-Cache migration
When the suspend finisher is called L1 is still allocating
accessing current implies accessing the sp
Snooping Direct Data Intervention (DDI), CPU might pull
dirty line in
ENTRY(disable_clean_inv_dcache_v7_all)
stmfd sp!, {r4-r5, r7, r9-r11, lr}
mrc p15, 0, r3, c1, c0, 0
bic r3, #4 @ clear C bit
mcr p15, 0, r3, c1, c0, 0
isb
bl v7_flush_dcache_all
mrc p15, 0, r0, c1, c0, 1
bic r0, r0, #0x40 @ exit SMP
mcr p15, 0, r0, c1, c0, 1
ldmfd sp!, {r4-r5, r7, r9-r11, pc}
ENDPROC(disable_clean_inv_dcache_v7_all)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
Outer Cache Management: The Odd One Out (1/2)
L310 memory mapped device (aka outer cache)
Clearing C bit does NOT prevent allocation
L2 RAM retention, data sitting in L2, not accessible if MMU
is off
If not invalidated, L2 might contain stale data if resume code
runs with L2 off before enabling it
We could clean some specific bits: which ones ?
If retained, L2 must be resumed before turning MMU on
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
Outer Cache Management: The Odd One Out (1/2)
L310 memory mapped device (aka outer cache)
Clearing C bit does NOT prevent allocation
L2 RAM retention, data sitting in L2, not accessible if MMU
is off
If not invalidated, L2 might contain stale data if resume code
runs with L2 off before enabling it
We could clean some specific bits: which ones ?
If retained, L2 must be resumed before turning MMU on
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM Common PM Code
Outer Cache Management: The Odd One Out (2/2)
if L2 content is lost, it must be cleaned on shutdown but can
be resumed in C
if L2 is retained, it must be resumed in assembly before calling
cpu_resume
static void __init pl310_save(void)
{
u32 l2x0_revision = readl_relaxed(l2x0_base + L2X0_CACHE_ID) &
L2X0_CACHE_ID_RTL_MASK;
l2x0_saved_regs.tag_latency = readl_relaxed(l2x0_base +
L2X0_TAG_LATENCY_CTRL);
l2x0_saved_regs.data_latency = readl_relaxed(l2x0_base +
L2X0_DATA_LATENCY_CTRL);
[...]
}
//asm-offsets.c
DEFINE(L2X0_R_PHY_BASE, offsetof(struct l2x0_regs, phy_base));
DEFINE(L2X0_R_AUX_CTRL, offsetof(struct l2x0_regs, aux_ctrl));
[...]
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
Outline
1 Introduction
Power Management Fundamentals
2 ARM Kernel PM Plumbing
Introduction
ARM Common PM Code
ARM core code and CPU idle improvements
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
ARM big.LITTLE Systems
Heterogeneous system
Coherent CCI interconnect
Per-Cluster unified L2
Shared GIC 400
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
A15/A7 and big.LITTLE PM Requirements
Integrated L2 caches management
Inter-Cluster snoops (pipeline)
MPIDR affinity levels (boot, suspend/resume)
PM QoS, CPU switching and switching latencies
CPU idle tables asymmetricity
CPU idle table switching
CPU idle cluster states
Kernel Cluster awareness
Make changes to the kernel to allow both switching and MP SW models
to run on multi-cluster systems in a power efficient manner
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
v7 Cache Levels (1/2)
void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
{
*save_ptr = virt_to_phys(ptr);
/* This must correspond to the LDM in cpu_resume() assembly */
*ptr++ = virt_to_phys(suspend_pgd);
*ptr++ = sp;
*ptr++ = virt_to_phys(cpu_do_resume);
cpu_do_suspend(ptr);
flush_cache_all(); <- !! This call cleans all cache levels up to LoC to main memory !!
outer_clean_range(*save_ptr, *save_ptr + ptrsz);
outer_clean_range(virt_to_phys(save_ptr),
virt_to_phys(save_ptr) + sizeof(*save_ptr));
}
A15/A7 flush_cache_all() also cleans L2 !
per-CPU switching/idle must not clean unified L2, not required
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
v7 Cache Levels (2/2)
void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
{
*save_ptr = virt_to_phys(ptr);
/* This must correspond to the LDM in cpu_resume() assembly */
*ptr++ = virt_to_phys(suspend_pgd);
*ptr++ = sp;
*ptr++ = virt_to_phys(cpu_do_resume);
cpu_do_suspend(ptr);
flush_dcache_level(flush_cache_level_cpu());
[...]
__cpuc_flush_dcache_area(ptr, ptrsz);
__cpuc_flush_dcache_area(save_ptr, sizeof(*save_ptr));
outer_clean_range(*save_ptr, *save_ptr + ptrsz);
outer_clean_range(virt_to_phys(save_ptr),
virt_to_phys(save_ptr) + sizeof(*save_ptr));
}
Introduce cache level patches into the kernel
Flush all caches to the LoU-IS
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
MPIDR - suspend/resume (1/2)
MPIDR[23:16]: affinity level 2
MPIDR[15:8]: affinity level 1
MPIDR[7:0]: affinity level 0
MPIDRs of existing cores cannot be generated at boot unless
we probe them (pen release)
MPIDR must NOT be considered a linear index
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
MPIDR - suspend/resume (2/2)
ENTRY(cpu_resume)
#ifdef CONFIG_SMP
adr r0, sleep_save_sp
ALT_SMP(mrc p15, 0, r1, c0, c0, 5)
ALT_UP(mov r1, #0)
and r1, r1, #15
ldr r0, [r0, r1, lsl #2] @ stack phys addr
#else
ldr r0, sleep_save_sp @ stack phys addr
#endif
setmode PSR_I_BIT | PSR_F_BIT | SVC_MODE, r1 @ set SVC, irqs off
@ load phys pgd, stack, resume fn
ARM( ldmia r0!, {r1, sp, pc} )
[...]
ENDPROC(cpu_resume)
sleep_save_sp:
.rept CONFIG_NR_CPUS
.long 0 @ preserve stack phys ptr here
.endr
Create a map of MPIDR to logical indexes at boot to fetch the
proper context address
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: coupled C-states (1/3)
C.Cross code consolidating existing TI and Tegra code
shipped with current devices
Cores go idle at unrelated times that depend on the scheduler
and next event
Cluster states (whether SoC support per-CPU power rail or
not) can be attained iff all cores are idle
Hotplugging CPUs to force idle is not a solution
CPU idle barrier
All CPUs request power down at almost the same time
but...if power down fails, they all have to abort together
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: coupled C-states (2/3)
CPUs put into a safe state and woken up (IPI) to enter
the idle barrier
int cpuidle_enter_state_coupled(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int next_state)
{
[...]
retry:
/*
* Wait for all coupled cpus to be idle, using the deepest state
* allowed for a single cpu.
*/
while (!cpuidle_coupled_cpus_waiting(coupled)) {
if (cpuidle_coupled_clear_pokes(dev->cpu)) {
cpuidle_coupled_set_not_waiting(dev->cpu, coupled);
goto out;
}
if (coupled->prevent) {
cpuidle_coupled_set_not_waiting(dev->cpu, coupled);
goto out;
}
entered_state = cpuidle_enter_state(dev, drv,
dev->safe_state_index);
}
[...]
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: coupled C-states (2/3)
...this implies enabling irqs within idle....
static int cpuidle_coupled_clear_pokes(int cpu)
{
local_irq_enable();
while (cpumask_test_cpu(cpu, &cpuidle_coupled_poked_mask))
cpu_relax();
local_irq_disable();
return need_resched() ? -EINTR : 0;
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: coupled C-states (3/3)
”Defer no time, delays have dangerous ends”
W.Shakespeare
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: coupled C-states (3/3)
int arch_cpuidle_enter(struct cpuidle_device *dev, ...)
{
if (arch_turn_off_irq_controller()) {
/* returns an error if an irq is pending and would be lost
if idle continued and turned off power */
abort_flag = true;
}
cpuidle_coupled_parallel_barrier(dev, &abort_barrier);
if (abort_flag) {
/* One of the cpus didn’t turn off it’s irq controller */
arch_turn_on_irq_controller();
return -EINTR;
}
/* continue with idle */
...
}
Disable or move IRQs to one CPU
if IRQs are pending all CPUs get out of idle together
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: asymmetric tables (1/2)
Current kernel code exports a single latency table to all CPUs
b.L clusters sport asymmetric latency tables and must be
treated as such
In the switching case, idle tables must be switched at run-time
(upon notification)
P. De Schrijver created per-CPU idle states (through a pointer
in cpuidle_device)
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: asymmetric tables (2/2)
struct cpuidle_state_parameters {
unsigned int flags;
unsigned int exit_latency; /* in US */
int power_usage; /* in mW */
unsigned int target_residency; /* in US */
unsigned int disable;
};
struct cpuidle_device {
int state_count;
struct cpuidle_state_usage states_usage[CPUIDLE_STATE_MAX];
struct cpuidle_state_kobj *kobjs[CPUIDLE_STATE_MAX];
+ struct cpuidle_state_parameters *state_parameters;
struct list_head device_list;
struct kobject kobj;
struct completion kobj_unregister;
};
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: governors and next event (1/2)
Current governors make decisions on a per-CPU basis
Different CPUs in a cluster enter idle at arbitrary times
By the time coupled states are hit, governor decision can be
stale
Need to peek at next event to avoid initiating cluster
shutdown
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
CPU idle: governors and next event (2/2)
/**
* menu_select - selects the next idle state to enter
* @drv: cpuidle driver containing state data
* @dev: the CPU
*/
static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
{
[...]
/* determine the expected residency time, round up */
t = ktime_to_timespec(tick_nohz_get_sleep_length());
data->expected_us =
t.tv_sec * USEC_PER_SEC + t.tv_nsec / NSEC_PER_USEC;
[...]
/* Make sure to round up for half microseconds */
data->predicted_us = div_round64(data->expected_us * data->correction_factor[data->bucket],
RESOLUTION * DECAY);
[...]
/*
* Find the idle state with the lowest power while satisfying
* our constraints.
*/
for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
struct cpuidle_state *s = &drv->states[i];
[...]
if (s->target_residency > data->predicted_us)
continue;
if (s->exit_latency > latency_req)
continue;
[...]
if (s->power_usage < power_usage) {
power_usage = s->power_usage;
data->last_state_idx = i;
data->exit_us = s->exit_latency;
}
}
return data->last_state_idx;
}
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
Security Management
Most of the operations carried out in secure world
Policy decisions made in Linux
ARM SMC protocol proposal under review
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
Putting Everything Together (1/3)
Common PM Back-End Entry
struct pm_pms {
unsigned int cluster_state;
unsigned int cpu_state;
unsigned int subsystem;
};
void enter_lowpower(unsigned int cluster_state, unsigned int cpu_state, unsigned flags)
{
int i, cpu = smp_processor_id();
struct pm_pms pms;
struct cpumask tmp;
[...]
cpu_set(cpu_index, cpuidle_mask);
cpumask_and(&tmp, &cpuidle_mask, topology_core_cpumask(cpu));
pms.cluster_state = cluster_state;
pms.cpu_state = cpu_state;
pms.subsystem = flags;
if (!cpu_weight(&tmp) == cpu_weight(topology_core_cpumask(cpu)))
pms.cluster_state = 0;
cpu_pm_enter();
if (pms.cluster_state >= SHUTDOWN)
cpu_cluster_pm_enter();
cpu_suspend(&pms, suspend_finisher);
cpu_pm_exit();
if (pms.power_state >= SHUTDOWN)
cpu_cluster_pm_exit();
cpu_clear(cpu_index, cpu_idle_mask);
return 0;
}
v7 shutdown next
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
Putting Everything Together (2/3)
int suspend_finisher(unsigned long arg)
{
struct pm_pms *pp = (struct pm_pms *) arg;
[...]
switch (pp->subsystem) {
case S2RAM:
[...]
break;
case IDLE:
[...]
break;
default:
break;
}
smc_down(...);
return 1;
}
prev
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
ARM core code and CPU idle improvements
Putting Everything Together (3/3)
smc_down:
ldr
r0, =#SMC_NUMBER
smc #0
/*
* Pseudo code describing what secure world
* should do
*/
{
disable_clean_inv_dcache_all();
if (cluster->cluster_down && cluster->power_state == SHUTDOWN) {
flush_cache_all();
outer_flush_all();
}
normal_uncached_memory_lock();
disable_cci_snoops();
normal_uncached_memory_unlock();
power_down_command();
cpu_do_idle();
}
prev
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
Conclusion
PM kernel subsystems need power-down/up common back-end
Multi-cluster ARM systems require changes to core code and
CPU idle core drivers
Effort will gain momentum as soon as big.LITTLE platforms
start getting merged in the mainline
Improvements for PM subsystem as a whole
Outlook
Create common back-end for S2RAM, CPU idle and other PM
subsystems using CPU suspend/resume
Integrate ARM SMC proposal
Benchmarking
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
Introduction ARM Kernel PM Plumbing Conclusion
THANKS !!!
L.Pieralisi ARM Ltd.
Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems

Weitere ähnliche Inhalte

Was ist angesagt?

OSMC 2014: Server Hardware Monitoring done right | Werner Fischer
OSMC 2014: Server Hardware Monitoring done right | Werner FischerOSMC 2014: Server Hardware Monitoring done right | Werner Fischer
OSMC 2014: Server Hardware Monitoring done right | Werner FischerNETWAYS
 
Nvidia’s tegra line of processors for mobile devices2 2
Nvidia’s tegra line of processors for mobile devices2 2Nvidia’s tegra line of processors for mobile devices2 2
Nvidia’s tegra line of processors for mobile devices2 2Sukul Yarraguntla
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareLinaro
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareLinaro
 
ARM Architecture and Meltdown/Spectre
ARM Architecture and Meltdown/SpectreARM Architecture and Meltdown/Spectre
ARM Architecture and Meltdown/SpectreGlobalLogic Ukraine
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationLinaro
 
SoM with Zynq UltraScale device
SoM with Zynq UltraScale deviceSoM with Zynq UltraScale device
SoM with Zynq UltraScale devicenie, jack
 
Introduction to Stellaris Family Microcontrollers
Introduction to Stellaris Family MicrocontrollersIntroduction to Stellaris Family Microcontrollers
Introduction to Stellaris Family MicrocontrollersPremier Farnell
 
Q4.11: Next Gen Mobile Storage – UFS
Q4.11: Next Gen Mobile Storage – UFSQ4.11: Next Gen Mobile Storage – UFS
Q4.11: Next Gen Mobile Storage – UFSLinaro
 
Q2.12: Power Management Across OSs
Q2.12: Power Management Across OSsQ2.12: Power Management Across OSs
Q2.12: Power Management Across OSsLinaro
 
U-boot and Android Verified Boot 2.0
U-boot and Android Verified Boot 2.0U-boot and Android Verified Boot 2.0
U-boot and Android Verified Boot 2.0GlobalLogic Ukraine
 
LCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLinaro
 

Was ist angesagt? (20)

OSMC 2014: Server Hardware Monitoring done right | Werner Fischer
OSMC 2014: Server Hardware Monitoring done right | Werner FischerOSMC 2014: Server Hardware Monitoring done right | Werner Fischer
OSMC 2014: Server Hardware Monitoring done right | Werner Fischer
 
Nvidia’s tegra line of processors for mobile devices2 2
Nvidia’s tegra line of processors for mobile devices2 2Nvidia’s tegra line of processors for mobile devices2 2
Nvidia’s tegra line of processors for mobile devices2 2
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
 
ARM Architecture and Meltdown/Spectre
ARM Architecture and Meltdown/SpectreARM Architecture and Meltdown/Spectre
ARM Architecture and Meltdown/Spectre
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM Virtualization
 
SoM with Zynq UltraScale device
SoM with Zynq UltraScale deviceSoM with Zynq UltraScale device
SoM with Zynq UltraScale device
 
Blackfin system services
Blackfin system servicesBlackfin system services
Blackfin system services
 
Introduction to Stellaris Family Microcontrollers
Introduction to Stellaris Family MicrocontrollersIntroduction to Stellaris Family Microcontrollers
Introduction to Stellaris Family Microcontrollers
 
Linux Audio Drivers. ALSA
Linux Audio Drivers. ALSALinux Audio Drivers. ALSA
Linux Audio Drivers. ALSA
 
1 Day Arm 2007
1 Day Arm 20071 Day Arm 2007
1 Day Arm 2007
 
Ccna day2
Ccna day2Ccna day2
Ccna day2
 
Ccna 2
Ccna 2Ccna 2
Ccna 2
 
Ccna day2
Ccna day2Ccna day2
Ccna day2
 
Ccna day2
Ccna day2Ccna day2
Ccna day2
 
Q4.11: Next Gen Mobile Storage – UFS
Q4.11: Next Gen Mobile Storage – UFSQ4.11: Next Gen Mobile Storage – UFS
Q4.11: Next Gen Mobile Storage – UFS
 
Q2.12: Power Management Across OSs
Q2.12: Power Management Across OSsQ2.12: Power Management Across OSs
Q2.12: Power Management Across OSs
 
U-boot and Android Verified Boot 2.0
U-boot and Android Verified Boot 2.0U-boot and Android Verified Boot 2.0
U-boot and Android Verified Boot 2.0
 
AVR introduction
AVR introduction AVR introduction
AVR introduction
 
LCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLCA13: Power State Coordination Interface
LCA13: Power State Coordination Interface
 

Andere mochten auch

LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101Linaro
 
BKK16-104 sched-freq
BKK16-104 sched-freqBKK16-104 sched-freq
BKK16-104 sched-freqLinaro
 
Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Linaro
 
LCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLinaro
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance ToolsBrendan Gregg
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLinaro
 
LCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLinaro
 

Andere mochten auch (7)

LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101
 
BKK16-104 sched-freq
BKK16-104 sched-freqBKK16-104 sched-freq
BKK16-104 sched-freq
 
Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_
 
LCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted Firmware
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
 
LCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted Firmware
 

Ähnlich wie Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multicluster Systems

Ähnlich wie Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multicluster Systems (20)

ARM - Advance RISC Machine
ARM - Advance RISC MachineARM - Advance RISC Machine
ARM - Advance RISC Machine
 
arm
armarm
arm
 
Arm architecture
Arm architectureArm architecture
Arm architecture
 
Arm architecture overview
Arm architecture overviewArm architecture overview
Arm architecture overview
 
Unit vi (2)
Unit vi (2)Unit vi (2)
Unit vi (2)
 
Arm architecture
Arm architectureArm architecture
Arm architecture
 
Unit 4 _ ARM Processors .pptx
Unit 4 _ ARM Processors .pptxUnit 4 _ ARM Processors .pptx
Unit 4 _ ARM Processors .pptx
 
arm_3.ppt
arm_3.pptarm_3.ppt
arm_3.ppt
 
Arm corrected ppt
Arm corrected pptArm corrected ppt
Arm corrected ppt
 
Processor types
Processor typesProcessor types
Processor types
 
arm 7 microprocessor architecture ans pin diagram.ppt
arm 7 microprocessor architecture ans pin diagram.pptarm 7 microprocessor architecture ans pin diagram.ppt
arm 7 microprocessor architecture ans pin diagram.ppt
 
Q4.11: ARM Technology Update Plenary
Q4.11: ARM Technology Update PlenaryQ4.11: ARM Technology Update Plenary
Q4.11: ARM Technology Update Plenary
 
ARM stacks, subroutines, Cortex M3, LPC 214X
ARM  stacks, subroutines, Cortex M3, LPC 214XARM  stacks, subroutines, Cortex M3, LPC 214X
ARM stacks, subroutines, Cortex M3, LPC 214X
 
Introduction to stm32-part2
Introduction to stm32-part2Introduction to stm32-part2
Introduction to stm32-part2
 
“Linux Kernel CPU Hotplug in the Multicore System”
“Linux Kernel CPU Hotplug in the Multicore System”“Linux Kernel CPU Hotplug in the Multicore System”
“Linux Kernel CPU Hotplug in the Multicore System”
 
AVR_Course_Day4 introduction to microcontroller
AVR_Course_Day4 introduction to microcontrollerAVR_Course_Day4 introduction to microcontroller
AVR_Course_Day4 introduction to microcontroller
 
Assignment
AssignmentAssignment
Assignment
 
PIC Introduction and explained in detailed
PIC Introduction and explained in detailedPIC Introduction and explained in detailed
PIC Introduction and explained in detailed
 
arm
 arm arm
arm
 
Arm
ArmArm
Arm
 

Mehr von Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteLinaro
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopLinaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMULinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 

Mehr von Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 

Kürzlich hochgeladen

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 

Kürzlich hochgeladen (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 

Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multicluster Systems

  • 1. Introduction ARM Kernel PM Plumbing Conclusion Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems L.Pieralisi Linaro Connect Q2-2012 L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 2. Introduction ARM Kernel PM Plumbing Conclusion Outline 1 Introduction Power Management Fundamentals 2 ARM Kernel PM Plumbing Introduction ARM Common PM Code ARM core code and CPU idle improvements L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 3. Introduction ARM Kernel PM Plumbing Conclusion Power Management Fundamentals Outline 1 Introduction Power Management Fundamentals 2 ARM Kernel PM Plumbing Introduction ARM Common PM Code ARM core code and CPU idle improvements L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 4. Introduction ARM Kernel PM Plumbing Conclusion Power Management Fundamentals SoC Technology and Power Consumption Dynamic power, frequency scaling Static (leakage) power, G (Generic) process, LP (Low Power) process Temperature variations RAM retention CPU/Cluster vs. IO devices Need for more agressive and holistic power management L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 5. Introduction ARM Kernel PM Plumbing Conclusion Power Management Fundamentals Power Managed SoC Example L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 6. Introduction ARM Kernel PM Plumbing Conclusion Power Management Fundamentals Kernel Power Management Mechanics System suspend User space forces system to sleep Auto sleep Kernel forces system to sleep if no wake-ups CPU idle Idle threads trigger sleep states CPU freq CPU Frequency scaling Runtime PM Devices Power Management CPU hotplug Remove a CPU from the running system Focus on CPU/Cluster Power Management (PM) Unified code in the kernel to support saving and restoring of CPU and Cluster state L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 7. Introduction ARM Kernel PM Plumbing Conclusion Introduction Outline 1 Introduction Power Management Fundamentals 2 ARM Kernel PM Plumbing Introduction ARM Common PM Code ARM core code and CPU idle improvements L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 8. Introduction ARM Kernel PM Plumbing Conclusion Introduction The Need for a Common Back-end S2RAM, CPU idle and other PM subsystems require state save/restore CPU architectural state (inclusive of VFP and PMU) Peripheral state (GIC, CCI) Cache management (clear/invalidate, pipelining) Our Goal Create a back-end that unifies the code and caters for all requirements L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 9. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code Outline 1 Introduction Power Management Fundamentals 2 ARM Kernel PM Plumbing Introduction ARM Common PM Code ARM core code and CPU idle improvements L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 10. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM Common PM Code Components CPU PM notifiers Local timers save/restore cpu suspend/resume L2 suspend/resume L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 11. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU PM notifiers (1/3) Introduced by C.Cross to overcome code duplication in idle and suspend code path CPU events and CLUSTER events GIC, VFP, PMU L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 12. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU PM notifiers (2/3) static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls) { int ret; ret = __raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL, nr_to_call, nr_calls); return notifier_to_errno(ret); } int cpu_pm_enter(void) { [...] ret = cpu_pm_notify(CPU_PM_ENTER, -1, &nr_calls); if (ret) cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL); [...] return ret; } //CPU shutdown cpu_pm_{enter,exit}(); //Cluster shutdown cpu_cluster_pm_{enter,exit}(); L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 13. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU PM notifiers (3/3) static int gic_notifier(struct notifier_block *self, unsigned long cmd, void *v) { int i; [...] switch (cmd) { case CPU_PM_ENTER: gic_cpu_save(i); break; case CPU_PM_ENTER_FAILED: case CPU_PM_EXIT: gic_cpu_restore(i); break; case CPU_CLUSTER_PM_ENTER: gic_dist_save(i); break; case CPU_CLUSTER_PM_ENTER_FAILED: case CPU_CLUSTER_PM_EXIT: gic_dist_restore(i); break; } return NOTIFY_OK; } static struct notifier_block gic_notifier_block = { .notifier_call = gic_notifier, }; v7 shutdown L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 14. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code Local timers save/restore void enter_idle(...) { [...] clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu); [...] cpu_do_idle(); [...] clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu); [...] } void enter_idle(...) { struct tick_device *tdev = tick_get_device(cpu); [...] cpu_do_idle(); [...] /* Restore the per-cpu timer event */ clockevents_program_event(tdev->evtdev, tdev->evtdev->next_event, 1); } Enter broacadst mode if a global timer is available Rely on always-on firmware timer and restore timer through clock events programming API L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 15. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 16. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 17. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 18. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 19. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 20. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) This is the standard procedure that must be adopted by all platforms, for cpu switching, cpu hotplug (cache cleaning and wfi), suspend and idle L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 21. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code ARM v7 SMP CPU Shutdown Procedure 1 save per CPU peripherals (IC, VFP, PMU) 2 save CPU registers 3 clean L1 D$ 4 clean state from L2 5 disable L1 D$ allocation 6 clean L1 D$ 7 exit coherency 8 call wfi (wait for interrupt) This is the standard procedure that must be adopted by all platforms, for cpu switching, cpu hotplug (cache cleaning and wfi), suspend and idle idle notifiers L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 22. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU suspend (1/3) Introduced by R.King to consolidate existing (and duplicated) code across diffent ARM platforms save/restore core registers, clean L1 and some bits of L2 L2 RAM retention handling poses further challenges L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 23. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU suspend (2/3) 1:1 mapping page tables cloned from init_mm C API, generic for all ARM architectures int cpu_suspend(unsigned long arg, int (*fn)(unsigned long)) { struct mm_struct *mm = current->active_mm; int ret; if (!suspend_pgd) return -EINVAL; [...] ret = __cpu_suspend(arg, fn); if (ret == 0) { cpu_switch_mm(mm->pgd, mm); local_flush_tlb_all(); } return ret; } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 24. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU suspend (3/3) registers saved on the stack void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr) { *save_ptr = virt_to_phys(ptr); /* This must correspond to the LDM in cpu_resume() assembly */ *ptr++ = virt_to_phys(suspend_pgd); *ptr++ = sp; *ptr++ = virt_to_phys(cpu_do_resume); cpu_do_suspend(ptr); } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 25. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU suspend (3/3) registers saved on the stack L1 complete cleaning void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr) { *save_ptr = virt_to_phys(ptr); /* This must correspond to the LDM in cpu_resume() assembly */ *ptr++ = virt_to_phys(suspend_pgd); *ptr++ = sp; *ptr++ = virt_to_phys(cpu_do_resume); cpu_do_suspend(ptr); flush_cache_all(); } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 26. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code CPU suspend (3/3) registers saved on the stack L1 complete cleaning L2 partial cleaning void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr) { *save_ptr = virt_to_phys(ptr); /* This must correspond to the LDM in cpu_resume() assembly */ *ptr++ = virt_to_phys(suspend_pgd); *ptr++ = sp; *ptr++ = virt_to_phys(cpu_do_resume); cpu_do_suspend(ptr); flush_cache_all(); outer_clean_range(*save_ptr, *save_ptr + ptrsz); outer_clean_range(virt_to_phys(save_ptr), virt_to_phys(save_ptr) + sizeof(*save_ptr)); } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 27. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code We Are Not Done, Yet: Cache-to-Cache migration L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 28. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code We Are Not Done, Yet: Cache-to-Cache migration SCU keeps a copy of D$ cache TAG RAMs To avoid data traffic ARM MPCore systems move dirty lines across cores Lower L1 bus traffic Dirty data might be fetched from another core during power-down sequence L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 29. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code We Are Not Done, Yet: Cache-to-Cache migration When the suspend finisher is called L1 is still allocating accessing current implies accessing the sp Snooping Direct Data Intervention (DDI), CPU might pull dirty line in ENTRY(disable_clean_inv_dcache_v7_all) stmfd sp!, {r4-r5, r7, r9-r11, lr} mrc p15, 0, r3, c1, c0, 0 bic r3, #4 @ clear C bit mcr p15, 0, r3, c1, c0, 0 isb bl v7_flush_dcache_all mrc p15, 0, r0, c1, c0, 1 bic r0, r0, #0x40 @ exit SMP mcr p15, 0, r0, c1, c0, 1 ldmfd sp!, {r4-r5, r7, r9-r11, pc} ENDPROC(disable_clean_inv_dcache_v7_all) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 30. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code We Are Not Done, Yet: Cache-to-Cache migration When the suspend finisher is called L1 is still allocating accessing current implies accessing the sp Snooping Direct Data Intervention (DDI), CPU might pull dirty line in ENTRY(disable_clean_inv_dcache_v7_all) stmfd sp!, {r4-r5, r7, r9-r11, lr} mrc p15, 0, r3, c1, c0, 0 bic r3, #4 @ clear C bit mcr p15, 0, r3, c1, c0, 0 isb bl v7_flush_dcache_all mrc p15, 0, r0, c1, c0, 1 bic r0, r0, #0x40 @ exit SMP mcr p15, 0, r0, c1, c0, 1 ldmfd sp!, {r4-r5, r7, r9-r11, pc} ENDPROC(disable_clean_inv_dcache_v7_all) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 31. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code We Are Not Done, Yet: Cache-to-Cache migration When the suspend finisher is called L1 is still allocating accessing current implies accessing the sp Snooping Direct Data Intervention (DDI), CPU might pull dirty line in ENTRY(disable_clean_inv_dcache_v7_all) stmfd sp!, {r4-r5, r7, r9-r11, lr} mrc p15, 0, r3, c1, c0, 0 bic r3, #4 @ clear C bit mcr p15, 0, r3, c1, c0, 0 isb bl v7_flush_dcache_all mrc p15, 0, r0, c1, c0, 1 bic r0, r0, #0x40 @ exit SMP mcr p15, 0, r0, c1, c0, 1 ldmfd sp!, {r4-r5, r7, r9-r11, pc} ENDPROC(disable_clean_inv_dcache_v7_all) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 32. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code Outer Cache Management: The Odd One Out (1/2) L310 memory mapped device (aka outer cache) Clearing C bit does NOT prevent allocation L2 RAM retention, data sitting in L2, not accessible if MMU is off If not invalidated, L2 might contain stale data if resume code runs with L2 off before enabling it We could clean some specific bits: which ones ? If retained, L2 must be resumed before turning MMU on L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 33. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code Outer Cache Management: The Odd One Out (1/2) L310 memory mapped device (aka outer cache) Clearing C bit does NOT prevent allocation L2 RAM retention, data sitting in L2, not accessible if MMU is off If not invalidated, L2 might contain stale data if resume code runs with L2 off before enabling it We could clean some specific bits: which ones ? If retained, L2 must be resumed before turning MMU on L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 34. Introduction ARM Kernel PM Plumbing Conclusion ARM Common PM Code Outer Cache Management: The Odd One Out (2/2) if L2 content is lost, it must be cleaned on shutdown but can be resumed in C if L2 is retained, it must be resumed in assembly before calling cpu_resume static void __init pl310_save(void) { u32 l2x0_revision = readl_relaxed(l2x0_base + L2X0_CACHE_ID) & L2X0_CACHE_ID_RTL_MASK; l2x0_saved_regs.tag_latency = readl_relaxed(l2x0_base + L2X0_TAG_LATENCY_CTRL); l2x0_saved_regs.data_latency = readl_relaxed(l2x0_base + L2X0_DATA_LATENCY_CTRL); [...] } //asm-offsets.c DEFINE(L2X0_R_PHY_BASE, offsetof(struct l2x0_regs, phy_base)); DEFINE(L2X0_R_AUX_CTRL, offsetof(struct l2x0_regs, aux_ctrl)); [...] L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 35. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements Outline 1 Introduction Power Management Fundamentals 2 ARM Kernel PM Plumbing Introduction ARM Common PM Code ARM core code and CPU idle improvements L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 36. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements ARM big.LITTLE Systems Heterogeneous system Coherent CCI interconnect Per-Cluster unified L2 Shared GIC 400 L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 37. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements A15/A7 and big.LITTLE PM Requirements Integrated L2 caches management Inter-Cluster snoops (pipeline) MPIDR affinity levels (boot, suspend/resume) PM QoS, CPU switching and switching latencies CPU idle tables asymmetricity CPU idle table switching CPU idle cluster states Kernel Cluster awareness Make changes to the kernel to allow both switching and MP SW models to run on multi-cluster systems in a power efficient manner L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 38. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements v7 Cache Levels (1/2) void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr) { *save_ptr = virt_to_phys(ptr); /* This must correspond to the LDM in cpu_resume() assembly */ *ptr++ = virt_to_phys(suspend_pgd); *ptr++ = sp; *ptr++ = virt_to_phys(cpu_do_resume); cpu_do_suspend(ptr); flush_cache_all(); <- !! This call cleans all cache levels up to LoC to main memory !! outer_clean_range(*save_ptr, *save_ptr + ptrsz); outer_clean_range(virt_to_phys(save_ptr), virt_to_phys(save_ptr) + sizeof(*save_ptr)); } A15/A7 flush_cache_all() also cleans L2 ! per-CPU switching/idle must not clean unified L2, not required L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 39. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements v7 Cache Levels (2/2) void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr) { *save_ptr = virt_to_phys(ptr); /* This must correspond to the LDM in cpu_resume() assembly */ *ptr++ = virt_to_phys(suspend_pgd); *ptr++ = sp; *ptr++ = virt_to_phys(cpu_do_resume); cpu_do_suspend(ptr); flush_dcache_level(flush_cache_level_cpu()); [...] __cpuc_flush_dcache_area(ptr, ptrsz); __cpuc_flush_dcache_area(save_ptr, sizeof(*save_ptr)); outer_clean_range(*save_ptr, *save_ptr + ptrsz); outer_clean_range(virt_to_phys(save_ptr), virt_to_phys(save_ptr) + sizeof(*save_ptr)); } Introduce cache level patches into the kernel Flush all caches to the LoU-IS L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 40. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements MPIDR - suspend/resume (1/2) MPIDR[23:16]: affinity level 2 MPIDR[15:8]: affinity level 1 MPIDR[7:0]: affinity level 0 MPIDRs of existing cores cannot be generated at boot unless we probe them (pen release) MPIDR must NOT be considered a linear index L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 41. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements MPIDR - suspend/resume (2/2) ENTRY(cpu_resume) #ifdef CONFIG_SMP adr r0, sleep_save_sp ALT_SMP(mrc p15, 0, r1, c0, c0, 5) ALT_UP(mov r1, #0) and r1, r1, #15 ldr r0, [r0, r1, lsl #2] @ stack phys addr #else ldr r0, sleep_save_sp @ stack phys addr #endif setmode PSR_I_BIT | PSR_F_BIT | SVC_MODE, r1 @ set SVC, irqs off @ load phys pgd, stack, resume fn ARM( ldmia r0!, {r1, sp, pc} ) [...] ENDPROC(cpu_resume) sleep_save_sp: .rept CONFIG_NR_CPUS .long 0 @ preserve stack phys ptr here .endr Create a map of MPIDR to logical indexes at boot to fetch the proper context address L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 42. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: coupled C-states (1/3) C.Cross code consolidating existing TI and Tegra code shipped with current devices Cores go idle at unrelated times that depend on the scheduler and next event Cluster states (whether SoC support per-CPU power rail or not) can be attained iff all cores are idle Hotplugging CPUs to force idle is not a solution CPU idle barrier All CPUs request power down at almost the same time but...if power down fails, they all have to abort together L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 43. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: coupled C-states (2/3) CPUs put into a safe state and woken up (IPI) to enter the idle barrier int cpuidle_enter_state_coupled(struct cpuidle_device *dev, struct cpuidle_driver *drv, int next_state) { [...] retry: /* * Wait for all coupled cpus to be idle, using the deepest state * allowed for a single cpu. */ while (!cpuidle_coupled_cpus_waiting(coupled)) { if (cpuidle_coupled_clear_pokes(dev->cpu)) { cpuidle_coupled_set_not_waiting(dev->cpu, coupled); goto out; } if (coupled->prevent) { cpuidle_coupled_set_not_waiting(dev->cpu, coupled); goto out; } entered_state = cpuidle_enter_state(dev, drv, dev->safe_state_index); } [...] } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 44. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: coupled C-states (2/3) ...this implies enabling irqs within idle.... static int cpuidle_coupled_clear_pokes(int cpu) { local_irq_enable(); while (cpumask_test_cpu(cpu, &cpuidle_coupled_poked_mask)) cpu_relax(); local_irq_disable(); return need_resched() ? -EINTR : 0; } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 45. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: coupled C-states (3/3) ”Defer no time, delays have dangerous ends” W.Shakespeare L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 46. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: coupled C-states (3/3) int arch_cpuidle_enter(struct cpuidle_device *dev, ...) { if (arch_turn_off_irq_controller()) { /* returns an error if an irq is pending and would be lost if idle continued and turned off power */ abort_flag = true; } cpuidle_coupled_parallel_barrier(dev, &abort_barrier); if (abort_flag) { /* One of the cpus didn’t turn off it’s irq controller */ arch_turn_on_irq_controller(); return -EINTR; } /* continue with idle */ ... } Disable or move IRQs to one CPU if IRQs are pending all CPUs get out of idle together L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 47. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: asymmetric tables (1/2) Current kernel code exports a single latency table to all CPUs b.L clusters sport asymmetric latency tables and must be treated as such In the switching case, idle tables must be switched at run-time (upon notification) P. De Schrijver created per-CPU idle states (through a pointer in cpuidle_device) L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 48. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: asymmetric tables (2/2) struct cpuidle_state_parameters { unsigned int flags; unsigned int exit_latency; /* in US */ int power_usage; /* in mW */ unsigned int target_residency; /* in US */ unsigned int disable; }; struct cpuidle_device { int state_count; struct cpuidle_state_usage states_usage[CPUIDLE_STATE_MAX]; struct cpuidle_state_kobj *kobjs[CPUIDLE_STATE_MAX]; + struct cpuidle_state_parameters *state_parameters; struct list_head device_list; struct kobject kobj; struct completion kobj_unregister; }; L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 49. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: governors and next event (1/2) Current governors make decisions on a per-CPU basis Different CPUs in a cluster enter idle at arbitrary times By the time coupled states are hit, governor decision can be stale Need to peek at next event to avoid initiating cluster shutdown L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 50. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements CPU idle: governors and next event (2/2) /** * menu_select - selects the next idle state to enter * @drv: cpuidle driver containing state data * @dev: the CPU */ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) { [...] /* determine the expected residency time, round up */ t = ktime_to_timespec(tick_nohz_get_sleep_length()); data->expected_us = t.tv_sec * USEC_PER_SEC + t.tv_nsec / NSEC_PER_USEC; [...] /* Make sure to round up for half microseconds */ data->predicted_us = div_round64(data->expected_us * data->correction_factor[data->bucket], RESOLUTION * DECAY); [...] /* * Find the idle state with the lowest power while satisfying * our constraints. */ for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; [...] if (s->target_residency > data->predicted_us) continue; if (s->exit_latency > latency_req) continue; [...] if (s->power_usage < power_usage) { power_usage = s->power_usage; data->last_state_idx = i; data->exit_us = s->exit_latency; } } return data->last_state_idx; } L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 51. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements Security Management Most of the operations carried out in secure world Policy decisions made in Linux ARM SMC protocol proposal under review L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 52. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements Putting Everything Together (1/3) Common PM Back-End Entry struct pm_pms { unsigned int cluster_state; unsigned int cpu_state; unsigned int subsystem; }; void enter_lowpower(unsigned int cluster_state, unsigned int cpu_state, unsigned flags) { int i, cpu = smp_processor_id(); struct pm_pms pms; struct cpumask tmp; [...] cpu_set(cpu_index, cpuidle_mask); cpumask_and(&tmp, &cpuidle_mask, topology_core_cpumask(cpu)); pms.cluster_state = cluster_state; pms.cpu_state = cpu_state; pms.subsystem = flags; if (!cpu_weight(&tmp) == cpu_weight(topology_core_cpumask(cpu))) pms.cluster_state = 0; cpu_pm_enter(); if (pms.cluster_state >= SHUTDOWN) cpu_cluster_pm_enter(); cpu_suspend(&pms, suspend_finisher); cpu_pm_exit(); if (pms.power_state >= SHUTDOWN) cpu_cluster_pm_exit(); cpu_clear(cpu_index, cpu_idle_mask); return 0; } v7 shutdown next L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 53. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements Putting Everything Together (2/3) int suspend_finisher(unsigned long arg) { struct pm_pms *pp = (struct pm_pms *) arg; [...] switch (pp->subsystem) { case S2RAM: [...] break; case IDLE: [...] break; default: break; } smc_down(...); return 1; } prev L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 54. Introduction ARM Kernel PM Plumbing Conclusion ARM core code and CPU idle improvements Putting Everything Together (3/3) smc_down: ldr r0, =#SMC_NUMBER smc #0 /* * Pseudo code describing what secure world * should do */ { disable_clean_inv_dcache_all(); if (cluster->cluster_down && cluster->power_state == SHUTDOWN) { flush_cache_all(); outer_flush_all(); } normal_uncached_memory_lock(); disable_cci_snoops(); normal_uncached_memory_unlock(); power_down_command(); cpu_do_idle(); } prev L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 55. Introduction ARM Kernel PM Plumbing Conclusion Conclusion PM kernel subsystems need power-down/up common back-end Multi-cluster ARM systems require changes to core code and CPU idle core drivers Effort will gain momentum as soon as big.LITTLE platforms start getting merged in the mainline Improvements for PM subsystem as a whole Outlook Create common back-end for S2RAM, CPU idle and other PM subsystems using CPU suspend/resume Integrate ARM SMC proposal Benchmarking L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems
  • 56. Introduction ARM Kernel PM Plumbing Conclusion THANKS !!! L.Pieralisi ARM Ltd. Idling ARMs in a Busy World: Linux Power Management for ARM multi-cluster systems