VM Memory Allocation Schemes and PV NUMA Guests

Dulloor Rao

Xen Summit AMD 2010
Agenda
●   Motivation
●   VM memory allocation strategies –
    CONFINED, SPLIT, STRIPED
●   AUTOMATIC (default) allocation scheme
●   PV NUMA Guests
●   Summary


Motivation – NUMA Overheads
●   CPU0 and CPU1 are Hyper-Threads.
●   CPU0 and CPU2 are on the same node.
●   CPU0 and CPU8 are on different nodes.
●   Overheads are due to both Cache Hierarchy (L1/L2/LLC) and
    Memory Organization (NUMA)
●   Modified cache-coherency state – the cacheline is present only in
    the current cache and is dirty. It must be written back to main
    memory before any other read of the stale copy in main memory
    is permitted.
●   Substantial overhead in accessing remote node's memory.




Motivation – NUMA-related OS
    Optimizations (Linux as example)
●   OS employs many optimizations to reduce
    inter-node memory accesses – memory
    management, scheduler, OS data-structures,
    etc.
●   OS defines multiple NUMA allocation policies
    (MPOL_{DEFAULT/BIND/PREFERRED/INTERLEAVE})
    to suit different applications. DEFAULT is local
    allocation.
●   Significant performance improvement from
    system-level NUMA optimizations.
Motivation – NUMA-related
    Application Optimizations (Linux)

●   DEFAULT memory policy (of allocating from local
    node) and a NUMA-aware scheduler reduce the
    inter-node accesses.
●   Libraries and tools (libnuma and numactl on Linux)
    are provided to select an appropriate memory
    placement policy for specific application requirements.
●   CONCLUSION – NUMA-related optimizations at
    OS-level and Application-level are too important
    and too many to ignore or discard.
Motivation – Virtualization on
        NUMA platforms (Issues)
●   Ad-hoc and Minimum-Effort VM memory allocation
    schemes.
●   For instance, Xen tries to allocate all the memory for
    a VM from a single memory node and pin the VM to
    that node, for a one-to-one mapping between a VM
    and a node.
●   It is not always possible to allocate from a single node –
    VM size, node memory fragmentation, etc.
●   Dynamic memory interfaces (such as memory
    ballooning) could still disrupt the mapping by
    allocating from some other node.
VM Memory Allocation Strategies
●   CONFINED : Allocate the entire VM memory from a single
    node. Goal : Maximize performance.

●   SPLIT : Allocate the VM memory from a set of nodes by
    splitting equally across the nodes. Goal : Maximize
    performance (with Enlightenment).

●   STRIPED : Interleave the VM memory across a set of
    nodes. Goal : Predictable (average) performance.
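
The three strategies can be pictured as different page-to-node layouts. The following is a minimal sketch, not the Xen implementation: the function names, the page granularity and the `stripe` parameter are assumptions made for illustration. Each function returns, for every guest page, the physical node it would come from.

```python
from typing import List


def confined(num_pages: int, node: int) -> List[int]:
    """CONFINED: every guest page comes from the same physical node."""
    return [node] * num_pages


def split(num_pages: int, nodes: List[int]) -> List[int]:
    """SPLIT: contiguous, (nearly) equal ranges of guest pages, one per node."""
    base, extra = divmod(num_pages, len(nodes))
    layout: List[int] = []
    for i, node in enumerate(nodes):
        # The first `extra` nodes take one additional page when the
        # division is not exact (an assumption; the slides say "equally").
        layout.extend([node] * (base + (1 if i < extra else 0)))
    return layout


def striped(num_pages: int, nodes: List[int], stripe: int = 1) -> List[int]:
    """STRIPED: interleave guest pages across the nodes in round-robin stripes."""
    return [nodes[(page // stripe) % len(nodes)] for page in range(num_pages)]
```

With SPLIT, each contiguous range can be exposed to the guest as one virtual node; with STRIPED, neighbouring pages land on different nodes, so the guest sees roughly average latency everywhere.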



VM Memory Allocation Strategies - CONFINED (figure)
VM Memory Allocation Strategies - SPLIT (figure)
VM Memory Allocation Strategies - STRIPED (figure)
Automatic VM Memory Allocation
               Scheme
●   TRY : Allocate CONFINED using Best-Fit-Decreasing
    (BFD).
●   TRY : Allocate SPLIT using Best-Fit-Decreasing (BFD),
    if the guest is NUMA-enabled. Enlighten the guest.
●   Allocate STRIPED using First-Fit-Increasing (FFI).
●   BFD returns the minimal-subset of nodes.
●   FFI returns the maximal-subset of nodes. Used with
    STRIPED to reduce the fragmentation of free node
    memory.
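
The fallback order above can be sketched as follows. This is one possible reading under stated assumptions, not the actual Xen code. The slides only say that BFD returns the minimal subset of nodes and FFI the maximal subset, so the selection rules below (equal shares per node, ties broken by node id) and the names `bfd_select`, `ffi_select` and `allocate` are hypothetical. `free` maps a physical node id to its free memory, in the same unit as `size`.

```python
from typing import Dict, List, Optional, Tuple


def bfd_select(free: Dict[int, int], size: int, max_nodes: int) -> Optional[List[int]]:
    """Assumed BFD: the smallest set of nodes that can each hold an equal share."""
    for k in range(1, max_nodes + 1):
        share = -(-size // k)  # ceiling division
        fitting = [n for n, f in free.items() if f >= share]
        if len(fitting) >= k:
            # Best fit: prefer the nodes with the least free memory that still fit.
            fitting.sort(key=lambda n: (free[n], n))
            return sorted(fitting[:k])
    return None


def ffi_select(free: Dict[int, int], size: int) -> Optional[List[int]]:
    """Assumed FFI: the largest set of nodes that can each hold an equal share."""
    nodes = sorted(free, key=lambda n: (free[n], n))  # increasing free memory
    for start in range(len(nodes)):
        candidate = nodes[start:]
        share = -(-size // len(candidate))
        # candidate[0] has the least free memory; if it fits, all of them do.
        if free[candidate[0]] >= share:
            return sorted(candidate)
    return None


def allocate(free: Dict[int, int], size: int, numa_guest: bool) -> Tuple[str, List[int]]:
    """Try CONFINED, then SPLIT (NUMA-enabled guests only), then STRIPED."""
    nodes = bfd_select(free, size, max_nodes=1)
    if nodes:
        return "CONFINED", nodes
    if numa_guest:
        nodes = bfd_select(free, size, max_nodes=len(free))
        if nodes:
            return "SPLIT", nodes  # the guest would then be enlightened
    nodes = ffi_select(free, size)
    if nodes:
        return "STRIPED", nodes
    raise MemoryError("not enough free node memory for this VM")
```

For example, with `free = {0: 4, 1: 6, 2: 3, 3: 6}`, a request of 5 is CONFINED to one node, while a request of 10 is SPLIT over nodes 1 and 3 for a NUMA-enabled guest and STRIPED over all four nodes otherwise.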


VM Memory Allocation Strategy -
              SPLIT
●   Used to construct a strict one-to-one mapping
    between virtual nodes and physical nodes.
●   HVM : Export the VM memory layout using
    ACPI tables. VM constructs virtual nodes.
●   PV : Export the VM memory layout using Virtual
    NUMA Enlightenment. VM constructs and
    maintains virtual nodes.



PV NUMA Guest - Enlightenment (figure)
PV NUMA Guest -
       Construction of Virtual Nodes

●   Guest reads the Virtual NUMA Enlightenment using
    a hypercall.

●   Guest constructs the (virtual) nodes and (virtual)
    cpu-to-node mappings.

●   Guest (virtual) node distances reflect the actual
    distances between the underlying physical nodes.
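
A minimal sketch of the construction step, assuming the enlightenment has already been read and boils down to two mappings (virtual node to physical node, and virtual CPU to virtual node) plus the physical distance table. The actual layout of the enlightenment is not given in the slides, so these structures and the name `build_virtual_topology` are hypothetical.

```python
from typing import Dict, List, Tuple


def build_virtual_topology(
    vnode_to_pnode: Dict[int, int],
    vcpu_to_vnode: Dict[int, int],
    phys_distance: List[List[int]],
) -> Tuple[Dict[int, Dict[int, int]], Dict[int, List[int]]]:
    """Derive guest node distances and the CPUs that belong to each virtual node."""
    vnodes = sorted(vnode_to_pnode)
    # Virtual distances mirror the distances of the backing physical nodes.
    distance = {
        v: {w: phys_distance[vnode_to_pnode[v]][vnode_to_pnode[w]] for w in vnodes}
        for v in vnodes
    }
    cpus_of_node: Dict[int, List[int]] = {v: [] for v in vnodes}
    for vcpu, vnode in sorted(vcpu_to_vnode.items()):
        cpus_of_node[vnode].append(vcpu)
    return distance, cpus_of_node
```

A guest with two virtual nodes backed by physical nodes 0 and 2 would therefore see the physical 0-to-2 distance between its own nodes, not a made-up uniform value.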

PV NUMA Guest –
      Maintenance of Virtual Nodes
●   Dynamic memory interfaces could
    increase/decrease/exchange the VM memory
    reservations, e.g. ballooning (table in slide 7).

●   Modify the interfaces to use Virtual NUMA
    Enlightenment. Maintain the strict mapping
    between Virtual and Physical nodes.
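
One way to picture a NUMA-aware balloon that keeps the mapping strict is sketched below. It is only an illustration: the class `StrictBalloon`, its methods and the page counters are hypothetical and do not correspond to the Xen balloon driver interface.

```python
from typing import Dict


class StrictBalloon:
    """Hypothetical per-virtual-node balloon that never crosses physical nodes."""

    def __init__(self, vnode_to_pnode: Dict[int, int], free_pages: Dict[int, int]):
        self.vnode_to_pnode = dict(vnode_to_pnode)
        self.free_pages = dict(free_pages)  # physical node -> free pages
        self.guest_pages = {v: 0 for v in vnode_to_pnode}  # virtual node -> pages held

    def increase(self, vnode: int, count: int) -> bool:
        """Grow the reservation of `vnode` using only its own physical node."""
        pnode = self.vnode_to_pnode[vnode]
        if self.free_pages[pnode] < count:
            return False  # strict mapping: refuse instead of using another node
        self.free_pages[pnode] -= count
        self.guest_pages[vnode] += count
        return True

    def decrease(self, vnode: int, count: int) -> int:
        """Return up to `count` pages of `vnode` to its physical node."""
        released = min(count, self.guest_pages[vnode])
        self.guest_pages[vnode] -= released
        self.free_pages[self.vnode_to_pnode[vnode]] += released
        return released
```

The refusal in `increase` is exactly what makes the approach strict: a request fails even when other physical nodes still have free pages.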



PV NUMA Guest –
      Maintenance of Virtual Nodes
●   Strict approach could lead to starvation in
    CONFINED/SPLIT VMs.
●   Under memory pressure, relax the strict one-to-
    one mapping between virtual and physical nodes.
●   Provide a mechanism to the guests to look-up
    physical node-id corresponding to a guest
    physical address.
●   Periodically sweep through the VM memory and
    converge to original state (indefinitely).
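
The relaxed scheme can be sketched as three pieces: a fallback allocation, a per-page node lookup, and a sweep that moves stray pages back home. The class and method names below are hypothetical, the fallback rule (the node with the most free pages) is an assumption, and page migration is reduced to bookkeeping; the slides do not describe the real mechanism at this level.

```python
from typing import Dict


class RelaxedGuestMemory:
    """Hypothetical model of a relaxed virtual-to-physical node mapping."""

    def __init__(self, vnode_to_pnode: Dict[int, int], free_pages: Dict[int, int]):
        self.vnode_to_pnode = dict(vnode_to_pnode)
        self.free_pages = dict(free_pages)  # physical node -> free pages
        self.page_pnode: Dict[int, int] = {}  # guest frame -> physical node
        self.page_vnode: Dict[int, int] = {}  # guest frame -> virtual node

    def populate(self, gfn: int, vnode: int) -> int:
        """Back `gfn` from the home node, or from another node under pressure."""
        home = self.vnode_to_pnode[vnode]
        others = sorted(
            (n for n in self.free_pages if n != home),
            key=lambda n: (-self.free_pages[n], n),
        )
        for pnode in [home] + others:
            if self.free_pages[pnode] > 0:
                self.free_pages[pnode] -= 1
                self.page_pnode[gfn] = pnode
                self.page_vnode[gfn] = vnode
                return pnode
        raise MemoryError("no free pages on any node")

    def lookup_pnode(self, gfn: int) -> int:
        """Physical node id currently backing a guest physical frame."""
        return self.page_pnode[gfn]

    def sweep(self) -> int:
        """Move stray pages back to their home node where it has room again."""
        moved = 0
        for gfn in sorted(self.page_pnode):
            home = self.vnode_to_pnode[self.page_vnode[gfn]]
            current = self.page_pnode[gfn]
            if current != home and self.free_pages[home] > 0:
                self.free_pages[home] -= 1
                self.free_pages[current] += 1
                self.page_pnode[gfn] = home
                moved += 1
        return moved
```

Calling `sweep` periodically lets the layout drift back to the strict one-to-one mapping as memory on the home nodes is freed by other VMs.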

Results – linpack benchmark (figure)
Summary
●   VM Memory Allocation Strategies for NUMA –
    CONFINED/SPLIT/STRIPED.
●   Automatic VM Memory Allocation Scheme.
●   NUMA guests with SPLIT strategy:
    ●   HVM – Inform using SLIT/SRAT ACPI tables
    ●   PV – Inform using Enlightenment
●   PV NUMA Guests
    ●   Construction of Virtual Nodes
    ●   Maintenance of Virtual Nodes (e.g. ballooning)
Questions?

Thank You!
