SlideShare a Scribd company logo
1 of 24
Download to read offline
Jiannan Ouyang, Brian Kocoloski, John Lange
The Prognostic Lab @ University of Pittsburgh
Kevin Pedretti
Sandia National Laboratories
HPDC 2015
Achieving Performance Isolation
with Lightweight Co-Kernels
HPC Architecture
2
—  Move computation to data
—  Improved data locality
—  Reduced power consumption
—  Reduced network traffic
Compute Node
Operating System and Runtimes
(OS/R)
Simulation
Analytic /
Visualization
Supercomputer
Shared Storage Cluster
Processing Cluster
Problem: massive data movement
over interconnects
Traditional In Situ Data Processing
Challenge: Predictable High Performance
3
—  Tightly coupled HPC workloads are sensitive to OS noise
and overhead [Petrini SC’03, Ferreira SC’08, Hoefler SC’10]
—  Specialized kernels for predictable performance
—  Tailored from Linux: CNL for Cray supercomputers
—  Lightweight kernels (LWK) developed from scratch: IBM CNK, Kitten
—  Data processing workloads favor Linux environments
—  Cross workload interference
—  Shared hardware (CPU time, cache, memory bandwidth)
—  Shared system software
How to provide both Linux and specialized kernels on the same node,
while ensuring performance isolation??
Approach: Lightweight Co-Kernels
4
—  Hardware resources on one node are dynamically composed into
multiple partitions or enclaves
—  Independent software stacks are deployed on each enclave
—  Optimized for certain applications and hardware
—  Performance isolation at both the software and hardware level
Hardware
Linux LWK
Analytic /
Visualization
Hardware
Linux
Analytic /
Visualization
Simulation
Simulation
Agenda
5
—  Introduction
—  The Pisces Lightweight Co-Kernel Architecture
—  Implementation
—  Evaluation
—  RelatedWork
—  Conclusion
Building Blocks: Kitten and Palacios
—  the Kitten Lightweight Kernel (LWK)
—  Goal: provide predictable performance for massively parallel HPC applications
—  Simple resource management policies
—  Limited kernel I/O support + direct user-level network access
—  the Palacios LightweightVirtual Machine Monitor (VMM)
—  Goal: predictable performance
—  Lightweight resource management policies
—  Established history of providing virtualized environments for HPC [Lange et al.
VEE ’11, Kocoloski and Lange ROSS‘12]
Kitten: https://software.sandia.gov/trac/kitten
Palacios: http://www.prognosticlab.org/palacios http://www.v3vee.org/
The Pisces Lightweight Co-Kernel Architecture
7
Linux
Hardware
Isolated Virtual
Machine
Applications
+
Virtual
MachinesPalacios VMM
Kitten Co-kernel
(1)
Kitten Co-kernel
(2)
Isolated
Application
Pisces Pisces
http://www.prognosticlab.org/pisces/
Pisces Design Goals
—  Performance isolation at both software and hardware level
—  Dynamic creation of resizable enclaves
—  Isolated virtual environments
Design Decisions
8
—  Elimination of cross OS dependencies
—  Each enclave must implement its own complete set of supported
system calls
—  No system call forwarding is allowed
—  Internalized management of I/O
—  Each enclave must provide its own I/O device drivers and manage
its hardware resources directly
—  Userspace cross enclave communication
—  Cross enclave communication is not a kernel provided feature
—  Explicitly setup cross enclave shared memory at runtime (XEMEM)
—  Using virtualization to provide missing OS features
Cross Kernel Communication
9
Hardware'Par))on' Hardware'Par))on'
User%
Context%
Kernel%
Context% Linux'
Cross%Kernel*
Messages*
Control'
Process'
Control'
Process'
Shared*Mem*
*Ctrl*Channel*
Linux'
Compa)ble'
Workloads'
Isolated'
Processes''
+'
Virtual'
Machines'
Shared*Mem*
Communica6on*Channels*
Ki@en'
CoAKernel'
XEMEM: Efficient Shared Memory for Composed
Applications on Multi-OS/R Exascale Systems
[Kocoloski and Lange, HPDC‘15]
Agenda
10
—  Introduction
—  The Pisces Lightweight Co-Kernel Architecture
—  Implementation
—  Evaluation
—  RelatedWork
—  Conclusion
Challenges & Approaches
11
—  How to boot a co-kernel?
—  Hot-remove resources from Linux, and load co-kernel
—  Reuse Linux boot code with modified target kernel address
—  Restrict the Kitten co-kernel to access assigned resources only
—  How to share hardware resources among kernels?
—  Hot-remove from Linux + direct assignment and adjustment (e.g.
CPU cores, memory blocks, PCI devices)
—  Managed by Linux and Pisces (e.g. IOMMU)
—  How to communicate with a co-kernel?
—  Kernel level: IPI + shared memory, primarily for Pisces commands
—  Application level: XEMEM [Kocoloski HPDC’15]
—  How to route device interrupts?
I/O Interrupt Routing
12
Legacy
Device
IO-APIC
Management
Kernel Co-Kernel
IRQ
Forwarder
IRQ
Handler
MSI/MSI-X
Device
Management
Kernel Co-Kernel
IRQ
Forwarder
IRQ
Handler
MSI/MSI-X
Device
MSI MSI
INTx
IPI
Legacy Interrupt Forwarding Direct Device Assignment (w/ MSI)
•  Legacy interrupt vectors are potentially shared among multiple devices
•  Pisces provides IRQ forwarding service
•  IRQ forwarding is only used during initialization for PCI devices
•  Modern PCI devices support dedicated interrupt vectors (MSI/MSI-X)
•  Directly route to the corresponding enclave
Implementation
13
—  Pisces
—  Linux kernel module supports unmodified Linux kernels
(2.6.3x – 3.x.y)
—  Co-kernel initialization and management
—  Kitten (~9000 LOC changes)
—  Manage assigned hardware resources
—  Dynamic resource assignment
—  Kernel level communication channel
—  Palacios (~5000 LOC changes)
—  Dynamic resource assignment
—  Command forwarding channel
Pisces: http://www.prognosticlab.org/pisces/
Kitten: https://software.sandia.gov/trac/kitten
Palacios: http://www.prognosticlab.org/palacios http://www.v3vee.org/
Agenda
14
—  Introduction
—  The Pisces Lightweight Co-Kernel Architecture
—  Implementation
—  Evaluation
—  RelatedWork
—  Conclusion
Evaluation
15
—  8 node Dell R450 cluster
—  Two six-core Intel “Ivy-Bridge” Xeon processors
—  24GB RAM split across two NUMA domains
—  QDR Infiniband
—  CentOS 7, Linux kernel 3.16
—  For performance isolation experiments, the hardware is
partitioned by NUMA domains.
—  i.e. Linux on one NUMA domain, co-kernel on the other
Fast Pisces Management Operations
16
Operations Latency (ms)
Booting a co-kernel 265.98
Adding a single CPU core 33.74
Adding a 128MB memory block 82.66
Adding an Ethernet NIC 118.98
Eliminating Cross Kernel Dependencies
17
solitary workloads (us) w/ other workloads (us)
Linux 3.05 3.48
co-kernel fwd 6.12 14.00
co-kernel 0.39 0.36
ExecutionTime of getpid()
—  Co-kernel has the best average performance
—  Co-kernel has the most consistent performance
—  System call forwarding has longer latency and suffers from
cross stack performance interference
Noise Analysis
18
0
5
10
15
20
0 1 2 3 4 5
Latency(us) Time (seconds)
(a) without competing workloads
0 1 2 3 4 5
Time (seconds)
(b) with competing workloads
0
5
10
15
20
0 1 2 3 4 5
Latency(us)
Time (seconds)
(a) without competing workloads
0 1 2 3 4 5
Time (seconds)
(b) with competing workloads
Linux
Kitten co-kernel
Co-Kernel: less noise + better isolation
* Each point represents the latency of an OS interruption
Single Node Performance
19
0
1
CentOS Kitten/KVM co-Kernel
82
83
84
85
CompletionTime(Seconds)
without bg
with bg
0
250
CentOS Kitten/KVM co-Kernel
20250
20500
20750
21000
21250
Throughput(GUPS)
without bg
with bg
CoMD Performance Stream Performance
Co-Kernel: consist performance + performance isolation
8 Node Performance
20
2
4
6
8
10
12
14
16
18
20
1 2 3 4 5 6 7 8
Throughput(GFLOP/s)
Number of Nodes
co-VMM
native
KVM
co-VMM bg
native bg
KVM bg
w/o bg: co-VMM achieves native Linux performance
w/ bg: co-VMM outperforms native Linux
Co-VMM for HPC in the Cloud
21
0
20
40
60
80
100
44 45 46 47 48 49 50 51
CDF(%)
Runtime (seconds)
Co-VMM Native KVM
CDF of HPCCG Performance (running with Hadoop, 8 nodes)
co-VMM: consistent performance + performance isolation
Related Work
22
— Exascale operating systems and runtimes (OS/Rs)
—  Hobbes (SNL, LBNL, LANL, ORNL, U. Pitt, various universities)
—  Argo (ANL, LLNL, PNNL, various universities)
—  FusedOS (Intel / IBM)
—  mOS (Intel)
—  McKernel (RIKENAICS, University ofTokyo)
Our uniqueness: performance isolation, dynamic
resource composition, lightweight virtualization
Conclusion
23
—  Design and implementation of the Pisces co-kernel architecture
—  Pisces framework
—  Kitten co-kernel
—  PalaciosVMM for Kitten co-kernel
—  Demonstrated that the co-kernel architecture provides
—  Optimized execution environments for in situ processing
—  Performance isolation
https://software.sandia.gov/trac/kitten
http://www.prognosticlab.org/pisces/
http://www.prognosticlab.org/palacios
Thank You
Jiannan Ouyang
—  Ph.D. Candidate @ University of Pittsburgh
—  ouyang@cs.pitt.edu
—  http://people.cs.pitt.edu/~ouyang/
—  The Prognostic Lab @ U. Pittsburgh
—  http://www.prognosticlab.org

More Related Content

What's hot

The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardsonharryvanhaaren
 
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Hajime Tazaki
 
Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2Hajime Tazaki
 
Linux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic KernelLinux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic KernelHajime Tazaki
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioHajime Tazaki
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsKernel TLV
 
Accelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAccelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAlexander Shalimov
 
Trip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine LearningTrip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine LearningRenaldas Zioma
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing PerformanceBrendan Gregg
 
Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01Hajime Tazaki
 
Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Hajime Tazaki
 
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...Jim St. Leger
 
Ovs perf
Ovs perfOvs perf
Ovs perfMadhu c
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.Naoto MATSUMOTO
 

What's hot (20)

The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
 
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
 
Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2
 
Linux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic KernelLinux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic Kernel
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
 
PROSE
PROSEPROSE
PROSE
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance Tools
 
Accelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAccelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDK
 
Trip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine LearningTrip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine Learning
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
 
Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01
 
Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013
 
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
 
Ovs perf
Ovs perfOvs perf
Ovs perf
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
 
Xen in Linux 3.x (or PVOPS)
Xen in Linux 3.x (or PVOPS)Xen in Linux 3.x (or PVOPS)
Xen in Linux 3.x (or PVOPS)
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.
 

Viewers also liked

Preemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloudPreemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloudJiannan Ouyang, PhD
 
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSStephan Cadene
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversJez Halford
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linuxbrouer
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Eric Van Hensbergen
 
SDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumSDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumThe Linux Foundation
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingLinaro
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEric Van Hensbergen
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2Outlyer
 
reference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysisreference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_AnalysisBuland Singh
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionHemanth Venkatesh
 
Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelDavidlohr Bueso
 
Linux cgroups and namespaces
Linux cgroups and namespacesLinux cgroups and namespaces
Linux cgroups and namespacesLocaweb
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterAaron Joue
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationLinaro
 
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...David Evans
 
Debugging linux kernel tools and techniques
Debugging linux kernel tools and  techniquesDebugging linux kernel tools and  techniques
Debugging linux kernel tools and techniquesSatpal Parmar
 
SFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mindSFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mindLinaro
 

Viewers also liked (20)

Preemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloudPreemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloud
 
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
 
Docker by demo
Docker by demoDocker by demo
Docker by demo
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
 
Cache profiling on ARM Linux
Cache profiling on ARM LinuxCache profiling on ARM Linux
Cache profiling on ARM Linux
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
 
SDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumSDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + Quantum
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP Scheduling
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS Interference
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2
 
reference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysisreference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysis
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emption
 
Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux Kernel
 
Linux cgroups and namespaces
Linux cgroups and namespacesLinux cgroups and namespaces
Linux cgroups and namespaces
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver Cluster
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM Virtualization
 
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
 
Debugging linux kernel tools and techniques
Debugging linux kernel tools and  techniquesDebugging linux kernel tools and  techniques
Debugging linux kernel tools and techniques
 
SFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mindSFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mind
 

Similar to Achieving Performance Isolation with Lightweight Co-Kernels

Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...PT Datacomm Diangraha
 
Parallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.pptParallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.pptMohmdUmer
 
Design and implementation of a reliable and cost-effective cloud computing in...
Design and implementation of a reliable and cost-effective cloud computing in...Design and implementation of a reliable and cost-effective cloud computing in...
Design and implementation of a reliable and cost-effective cloud computing in...Francesco Taurino
 
An Introduce of OPNFV (Open Platform for NFV)
An Introduce of OPNFV (Open Platform for NFV)An Introduce of OPNFV (Open Platform for NFV)
An Introduce of OPNFV (Open Platform for NFV)Mario Cho
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetesTed Jung
 
OCP Engineering Workshop at UNH
OCP Engineering Workshop at UNH OCP Engineering Workshop at UNH
OCP Engineering Workshop at UNH 호용 류
 
ClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxBiHongPhc
 
Xen Euro Par07
Xen Euro Par07Xen Euro Par07
Xen Euro Par07congvc
 
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISORLOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISORVanika Kapoor
 
Clusters (Distributed computing)
Clusters (Distributed computing)Clusters (Distributed computing)
Clusters (Distributed computing)Sri Prasanna
 
Linux container & docker
Linux container & dockerLinux container & docker
Linux container & dockerejlp12
 
CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)
CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)
CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)The Linux Foundation
 
2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANL2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANLdgoodell
 
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...OpenEBS
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task ComputingEric Van Hensbergen
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...The Linux Foundation
 
Walk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCWalk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCCeph Community
 

Similar to Achieving Performance Isolation with Lightweight Co-Kernels (20)

Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
 
Libra Library OS
Libra Library OSLibra Library OS
Libra Library OS
 
Parallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.pptParallel_and_Cluster_Computing.ppt
Parallel_and_Cluster_Computing.ppt
 
Cluster computer
Cluster  computerCluster  computer
Cluster computer
 
Design and implementation of a reliable and cost-effective cloud computing in...
Design and implementation of a reliable and cost-effective cloud computing in...Design and implementation of a reliable and cost-effective cloud computing in...
Design and implementation of a reliable and cost-effective cloud computing in...
 
An Introduce of OPNFV (Open Platform for NFV)
An Introduce of OPNFV (Open Platform for NFV)An Introduce of OPNFV (Open Platform for NFV)
An Introduce of OPNFV (Open Platform for NFV)
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
OCP Engineering Workshop at UNH
OCP Engineering Workshop at UNH OCP Engineering Workshop at UNH
OCP Engineering Workshop at UNH
 
ClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptx
 
Xen Euro Par07
Xen Euro Par07Xen Euro Par07
Xen Euro Par07
 
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISORLOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
 
Clusters (Distributed computing)
Clusters (Distributed computing)Clusters (Distributed computing)
Clusters (Distributed computing)
 
Linux container & docker
Linux container & dockerLinux container & docker
Linux container & docker
 
CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)
CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)
CIF16: Building the Superfluid Cloud with Unikernels (Simon Kuenzer, NEC Europe)
 
2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANL2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANL
 
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
Container Attached Storage (CAS) with OpenEBS - Berlin Kubernetes Meetup - Ma...
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task Computing
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
 
Walk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCWalk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoC
 

Recently uploaded

The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...Nitya salvi
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...kalichargn70th171
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 

Recently uploaded (20)

The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 

Achieving Performance Isolation with Lightweight Co-Kernels

  • 1. Jiannan Ouyang, Brian Kocoloski, John Lange The Prognostic Lab @ University of Pittsburgh Kevin Pedretti Sandia National Laboratories HPDC 2015 Achieving Performance Isolation with Lightweight Co-Kernels
  • 2. HPC Architecture 2 —  Move computation to data —  Improved data locality —  Reduced power consumption —  Reduced network traffic Compute Node Operating System and Runtimes (OS/R) Simulation Analytic / Visualization Supercomputer Shared Storage Cluster Processing Cluster Problem: massive data movement over interconnects Traditional In Situ Data Processing
  • 3. Challenge: Predictable High Performance 3 —  Tightly coupled HPC workloads are sensitive to OS noise and overhead [Petrini SC’03, Ferreira SC’08, Hoefler SC’10] —  Specialized kernels for predictable performance —  Tailored from Linux: CNL for Cray supercomputers —  Lightweight kernels (LWK) developed from scratch: IBM CNK, Kitten —  Data processing workloads favor Linux environments —  Cross workload interference —  Shared hardware (CPU time, cache, memory bandwidth) —  Shared system software How to provide both Linux and specialized kernels on the same node, while ensuring performance isolation??
  • 4. Approach: Lightweight Co-Kernels 4 —  Hardware resources on one node are dynamically composed into multiple partitions or enclaves —  Independent software stacks are deployed on each enclave —  Optimized for certain applications and hardware —  Performance isolation at both the software and hardware level Hardware Linux LWK Analytic / Visualization Hardware Linux Analytic / Visualization Simulation Simulation
  • 5. Agenda 5 —  Introduction —  The Pisces Lightweight Co-Kernel Architecture —  Implementation —  Evaluation —  RelatedWork —  Conclusion
  • 6. Building Blocks: Kitten and Palacios —  the Kitten Lightweight Kernel (LWK) —  Goal: provide predictable performance for massively parallel HPC applications —  Simple resource management policies —  Limited kernel I/O support + direct user-level network access —  the Palacios LightweightVirtual Machine Monitor (VMM) —  Goal: predictable performance —  Lightweight resource management policies —  Established history of providing virtualized environments for HPC [Lange et al. VEE ’11, Kocoloski and Lange ROSS‘12] Kitten: https://software.sandia.gov/trac/kitten Palacios: http://www.prognosticlab.org/palacios http://www.v3vee.org/
  • 7. The Pisces Lightweight Co-Kernel Architecture 7 Linux Hardware Isolated Virtual Machine Applications + Virtual MachinesPalacios VMM Kitten Co-kernel (1) Kitten Co-kernel (2) Isolated Application Pisces Pisces http://www.prognosticlab.org/pisces/ Pisces Design Goals —  Performance isolation at both software and hardware level —  Dynamic creation of resizable enclaves —  Isolated virtual environments
  • 8. Design Decisions 8 —  Elimination of cross OS dependencies —  Each enclave must implement its own complete set of supported system calls —  No system call forwarding is allowed —  Internalized management of I/O —  Each enclave must provide its own I/O device drivers and manage its hardware resources directly —  Userspace cross enclave communication —  Cross enclave communication is not a kernel provided feature —  Explicitly setup cross enclave shared memory at runtime (XEMEM) —  Using virtualization to provide missing OS features
  • 9. Cross Kernel Communication 9 Hardware'Par))on' Hardware'Par))on' User% Context% Kernel% Context% Linux' Cross%Kernel* Messages* Control' Process' Control' Process' Shared*Mem* *Ctrl*Channel* Linux' Compa)ble' Workloads' Isolated' Processes'' +' Virtual' Machines' Shared*Mem* Communica6on*Channels* Ki@en' CoAKernel' XEMEM: Efficient Shared Memory for Composed Applications on Multi-OS/R Exascale Systems [Kocoloski and Lange, HPDC‘15]
  • 10. Agenda 10 —  Introduction —  The Pisces Lightweight Co-Kernel Architecture —  Implementation —  Evaluation —  RelatedWork —  Conclusion
  • 11. Challenges & Approaches 11 —  How to boot a co-kernel? —  Hot-remove resources from Linux, and load co-kernel —  Reuse Linux boot code with modified target kernel address —  Restrict the Kitten co-kernel to access assigned resources only —  How to share hardware resources among kernels? —  Hot-remove from Linux + direct assignment and adjustment (e.g. CPU cores, memory blocks, PCI devices) —  Managed by Linux and Pisces (e.g. IOMMU) —  How to communicate with a co-kernel? —  Kernel level: IPI + shared memory, primarily for Pisces commands —  Application level: XEMEM [Kocoloski HPDC’15] —  How to route device interrupts?
  • 12. I/O Interrupt Routing 12 Legacy Device IO-APIC Management Kernel Co-Kernel IRQ Forwarder IRQ Handler MSI/MSI-X Device Management Kernel Co-Kernel IRQ Forwarder IRQ Handler MSI/MSI-X Device MSI MSI INTx IPI Legacy Interrupt Forwarding Direct Device Assignment (w/ MSI) •  Legacy interrupt vectors are potentially shared among multiple devices •  Pisces provides IRQ forwarding service •  IRQ forwarding is only used during initialization for PCI devices •  Modern PCI devices support dedicated interrupt vectors (MSI/MSI-X) •  Directly route to the corresponding enclave
  • 13. Implementation 13 —  Pisces —  Linux kernel module supports unmodified Linux kernels (2.6.3x – 3.x.y) —  Co-kernel initialization and management —  Kitten (~9000 LOC changes) —  Manage assigned hardware resources —  Dynamic resource assignment —  Kernel level communication channel —  Palacios (~5000 LOC changes) —  Dynamic resource assignment —  Command forwarding channel Pisces: http://www.prognosticlab.org/pisces/ Kitten: https://software.sandia.gov/trac/kitten Palacios: http://www.prognosticlab.org/palacios http://www.v3vee.org/
  • 14. Agenda 14 —  Introduction —  The Pisces Lightweight Co-Kernel Architecture —  Implementation —  Evaluation —  RelatedWork —  Conclusion
  • 15. Evaluation 15 —  8 node Dell R450 cluster —  Two six-core Intel “Ivy-Bridge” Xeon processors —  24GB RAM split across two NUMA domains —  QDR Infiniband —  CentOS 7, Linux kernel 3.16 —  For performance isolation experiments, the hardware is partitioned by NUMA domains. —  i.e. Linux on one NUMA domain, co-kernel on the other
  • 16. Fast Pisces Management Operations 16 Operations Latency (ms) Booting a co-kernel 265.98 Adding a single CPU core 33.74 Adding a 128MB memory block 82.66 Adding an Ethernet NIC 118.98
  • 17. Eliminating Cross Kernel Dependencies 17 solitary workloads (us) w/ other workloads (us) Linux 3.05 3.48 co-kernel fwd 6.12 14.00 co-kernel 0.39 0.36 ExecutionTime of getpid() —  Co-kernel has the best average performance —  Co-kernel has the most consistent performance —  System call forwarding has longer latency and suffers from cross stack performance interference
  • 18. Noise Analysis 18 0 5 10 15 20 0 1 2 3 4 5 Latency(us) Time (seconds) (a) without competing workloads 0 1 2 3 4 5 Time (seconds) (b) with competing workloads 0 5 10 15 20 0 1 2 3 4 5 Latency(us) Time (seconds) (a) without competing workloads 0 1 2 3 4 5 Time (seconds) (b) with competing workloads Linux Kitten co-kernel Co-Kernel: less noise + better isolation * Each point represents the latency of an OS interruption
  • 19. Single Node Performance 19 0 1 CentOS Kitten/KVM co-Kernel 82 83 84 85 CompletionTime(Seconds) without bg with bg 0 250 CentOS Kitten/KVM co-Kernel 20250 20500 20750 21000 21250 Throughput(GUPS) without bg with bg CoMD Performance Stream Performance Co-Kernel: consist performance + performance isolation
  • 20. 8 Node Performance 20 2 4 6 8 10 12 14 16 18 20 1 2 3 4 5 6 7 8 Throughput(GFLOP/s) Number of Nodes co-VMM native KVM co-VMM bg native bg KVM bg w/o bg: co-VMM achieves native Linux performance w/ bg: co-VMM outperforms native Linux
  • 21. Co-VMM for HPC in the Cloud 21 0 20 40 60 80 100 44 45 46 47 48 49 50 51 CDF(%) Runtime (seconds) Co-VMM Native KVM CDF of HPCCG Performance (running with Hadoop, 8 nodes) co-VMM: consistent performance + performance isolation
  • 22. Related Work 22 — Exascale operating systems and runtimes (OS/Rs) —  Hobbes (SNL, LBNL, LANL, ORNL, U. Pitt, various universities) —  Argo (ANL, LLNL, PNNL, various universities) —  FusedOS (Intel / IBM) —  mOS (Intel) —  McKernel (RIKENAICS, University ofTokyo) Our uniqueness: performance isolation, dynamic resource composition, lightweight virtualization
  • 23. Conclusion 23 —  Design and implementation of the Pisces co-kernel architecture —  Pisces framework —  Kitten co-kernel —  PalaciosVMM for Kitten co-kernel —  Demonstrated that the co-kernel architecture provides —  Optimized execution environments for in situ processing —  Performance isolation https://software.sandia.gov/trac/kitten http://www.prognosticlab.org/pisces/ http://www.prognosticlab.org/palacios
  • 24. Thank You Jiannan Ouyang —  Ph.D. Candidate @ University of Pittsburgh —  ouyang@cs.pitt.edu —  http://people.cs.pitt.edu/~ouyang/ —  The Prognostic Lab @ U. Pittsburgh —  http://www.prognosticlab.org