SlideShare ist ein Scribd-Unternehmen logo
1 von 15
Downloaden Sie, um offline zu lesen
1
GPU Acceleration for Container on Intel
Processor Graphics
Zhenyu Wang
zhenyuw@linux.intel.com
2
Agenda
GPU Containers
GPU Namespace
GPU Control Group
Integration with Container Runtime
VM Containers
Status
3
GPU Containers
Video Delivery
(Encode/Decode/Transcode)
Cloud Graphics
(Remote desktop/gaming)
HPC/Machine Learning
(AI, Analytic, …)
E3-G Discrete
GPU
Fast provisioning
High Density
Little Overhead
4
What Makes a Container?
Platform
Host OS
Namespaces
Control
groups
App
Libs
App
Libs
App
Libs
Platform
Hypervisor
App
Libs
App
Libs
Guest OS
App
Libs
Guest OS
Bare Metal Containers
Docker DockerDocker
VM Containers
(e.g. Clear Container)
5
Path to GPU Containers
Bare Metal Containers VM Containers
GPU namespaces
GPU control groups
GPU virtualization technology
Integrate with container runtime
6
GPU Namespace:
DRM Render Node
DRM/i915
Card# RenderD#
Xserver ComputeMedia
Xcode
Unprivileged
No mode-setting
Offscreen
3D
Privileged
Mode setting
Global gem flink
7
GPU Namespace:
DRM Render Node
DRM/i915
RenderD#
Media
Xcode
Libs
Offscreen
3D
Libs
Compute
Libs
GPU
Contexts
Sharing happen with dmabuf only (no
global gem object on card0)
No need of creating another namespace
Expose DRM render nodes to provide
isolated views through device cgroup
 e.g “docker run –device=/dev/dri/renderD128
–it debian”
Per-process hardware context in GPU
 Hardware managed render context switch
8
GPU CGroup
CPU Mem BlkIO Net GPU
container
Container
resource
manager
Cgroup
controller
container container container
…
Container-granule GPU resource control
9
GPU CGroup
New subsystem: gpu_cgrp_subsys
Control knobs
 gpu.shares (% share of GPU cycles, work in progress)
 gpu.priority (workload priority in the system)
 gpu.memory (maximum GPU memory size)
DRM/i915
 Favor ‘shares/priority’ in workload scheduler
 Enforce memory limitation (optionally for UMA graphics)
10
Container Runtime Integration
{
“linux”: {
“devices”: [
{
“path”: “/dev/dri/renderD128”,
“type”: “c”,
“major”: 226,
“minor”: 128,
“fileMode”: 432,
“uid”: 0,
“gid”: 0
}
],
“resources”: {
…
“gpu”: {
“memory”: <max_mem_in_bytes>,
“prio”: <workload_priority>
}
}
}
RunC: an universal container runtime
 Understand new gpu cgroup
 Add new Linux resources config.json options for gpu
 Runtime-tools helper to generate config stubs
New Docker GPU control parameters
e.g docker –gpu-mem=xxx –gpu-priority=xxx
–gpu-share=xxx …
11
Image Portability
Main challenge - library compatibility
 User level libraries MUST match underlying kernel/HW
Introduce Docker helper for image inspection
 Compare image libraries with host environment
 Reject to launch container upon any mismatch
12
VM GPU Containers
DRM/i915
vGPU
Media
Xcode
Libs
GPU
vGPU vGPU
Intel GVT-g
Guest OS
Media
Xcode
Libs
Guest OS
Media
Xcode
Libs
Guest OS
CC VM CC VM CC VM Clear Container (CC)
Intel graphics virtualization technology
(Intel GVT-g)
Assign vGPU to VM container
In upstream Linux since v4.10
https://01.org/igvt-g
13
VM GPU Containers
Nested containers in VM also have GPU access
Need Improvement:
Scalability
 Only support 8vGPUs today due to resource partitioning
 Need some enlightened way to further scale in the short term
Boot time
 Full guest graphics driver load takes >1.5s
 Fast optimization to <0.5s in progress
14
Status
 GPU cgroup PoC
https://github.com/zhenyw/linux
https://github.com/zhenyw/runc
https://github.com/zhenyw/moby
https://github.com/zhenyw/cli
 GPU virtualization for Clear container
https://clearlinux.org/features/intel%C2%AE-clear-containers
https://github.com/01org/gvt-linux/wiki/Clear-Container-with-GVTg-Setup-Guide
Q & A

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Dell Technologies Dell EMC ISG Storage, CI, HCI and Data Protection Portfolio...
Dell Technologies Dell EMC ISG Storage, CI, HCI and Data Protection Portfolio...Dell Technologies Dell EMC ISG Storage, CI, HCI and Data Protection Portfolio...
Dell Technologies Dell EMC ISG Storage, CI, HCI and Data Protection Portfolio...
 
Cloud Management
Cloud ManagementCloud Management
Cloud Management
 
Project ACRN expose and pass through platform hidden PCIe devices to SOS
Project ACRN expose and pass through platform hidden PCIe devices to SOSProject ACRN expose and pass through platform hidden PCIe devices to SOS
Project ACRN expose and pass through platform hidden PCIe devices to SOS
 
Accessing Hardware on Android
Accessing Hardware on AndroidAccessing Hardware on Android
Accessing Hardware on Android
 
RunX: deploy real-time OSes as containers at the edge
RunX: deploy real-time OSes as containers at the edgeRunX: deploy real-time OSes as containers at the edge
RunX: deploy real-time OSes as containers at the edge
 
Better Together: Delivering Graph Value with AWS & Neo4j - Antony Prasad The...
Better Together:  Delivering Graph Value with AWS & Neo4j - Antony Prasad The...Better Together:  Delivering Graph Value with AWS & Neo4j - Antony Prasad The...
Better Together: Delivering Graph Value with AWS & Neo4j - Antony Prasad The...
 
Kvm virtualization platform
Kvm virtualization platformKvm virtualization platform
Kvm virtualization platform
 
VMware Tutorial For Beginners | VMware Workstation | VMware Virtualization | ...
VMware Tutorial For Beginners | VMware Workstation | VMware Virtualization | ...VMware Tutorial For Beginners | VMware Workstation | VMware Virtualization | ...
VMware Tutorial For Beginners | VMware Workstation | VMware Virtualization | ...
 
Xen Hypervisor
Xen HypervisorXen Hypervisor
Xen Hypervisor
 
VMware Presentation
VMware PresentationVMware Presentation
VMware Presentation
 
Storage Virtualization
Storage VirtualizationStorage Virtualization
Storage Virtualization
 
Virtualization.ppt
Virtualization.pptVirtualization.ppt
Virtualization.ppt
 
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
 
Virtualization
VirtualizationVirtualization
Virtualization
 
Confidential Computing overview
Confidential Computing overviewConfidential Computing overview
Confidential Computing overview
 
Aws seminar report
Aws seminar report Aws seminar report
Aws seminar report
 
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat
Monitoring MySQL Replication lag with Prometheus & pt-heartbeatMonitoring MySQL Replication lag with Prometheus & pt-heartbeat
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat
 
Embedded Hypervisor for ARM
Embedded Hypervisor for ARMEmbedded Hypervisor for ARM
Embedded Hypervisor for ARM
 
Platform as a Service (PaaS) - A cloud service for Developers
Platform as a Service (PaaS) - A cloud service for Developers Platform as a Service (PaaS) - A cloud service for Developers
Platform as a Service (PaaS) - A cloud service for Developers
 
Open ebs 101
Open ebs 101Open ebs 101
Open ebs 101
 

Andere mochten auch

Andere mochten auch (20)

Make Accelerator Pluggable for Container Engine
Make Accelerator Pluggable for Container EngineMake Accelerator Pluggable for Container Engine
Make Accelerator Pluggable for Container Engine
 
Policy-based Resource Placement
Policy-based Resource PlacementPolicy-based Resource Placement
Policy-based Resource Placement
 
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoTZephyr: Creating a Best-of-Breed, Secure RTOS for IoT
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
 
OpenStack on AArch64
OpenStack on AArch64OpenStack on AArch64
OpenStack on AArch64
 
Hyperledger Technical Community in China.
Hyperledger Technical Community in China. Hyperledger Technical Community in China.
Hyperledger Technical Community in China.
 
OpenDaylight OpenStack Integration
OpenDaylight OpenStack IntegrationOpenDaylight OpenStack Integration
OpenDaylight OpenStack Integration
 
Simplify Networking for Containers
Simplify Networking for ContainersSimplify Networking for Containers
Simplify Networking for Containers
 
Obstacles & Solutions for Livepatch Support on ARM64 Architecture
Obstacles & Solutions for Livepatch Support on ARM64 ArchitectureObstacles & Solutions for Livepatch Support on ARM64 Architecture
Obstacles & Solutions for Livepatch Support on ARM64 Architecture
 
Linux Kernel Development
Linux Kernel DevelopmentLinux Kernel Development
Linux Kernel Development
 
Linuxcon secureefficientcontainerimagemanagementharbor
Linuxcon secureefficientcontainerimagemanagementharborLinuxcon secureefficientcontainerimagemanagementharbor
Linuxcon secureefficientcontainerimagemanagementharbor
 
LiteOS
LiteOS LiteOS
LiteOS
 
Libvirt API Certification
Libvirt API CertificationLibvirt API Certification
Libvirt API Certification
 
Status of Embedded Linux
Status of Embedded LinuxStatus of Embedded Linux
Status of Embedded Linux
 
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
 
Rethinking the OS
Rethinking the OSRethinking the OS
Rethinking the OS
 
Building a Better Thermostat
Building a Better ThermostatBuilding a Better Thermostat
Building a Better Thermostat
 
OCI Support in Mesos
OCI Support in MesosOCI Support in Mesos
OCI Support in Mesos
 
Releasing a Distribution in the Age of DevOps.
Releasing a Distribution in the Age of DevOps. Releasing a Distribution in the Age of DevOps.
Releasing a Distribution in the Age of DevOps.
 
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
 
kdump: usage and_internals
kdump: usage and_internalskdump: usage and_internals
kdump: usage and_internals
 

Ähnlich wie GPU Acceleration for Containers on Intel Processor Graphics

Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driverKernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Anne Nicolas
 
ELC-E Linux Awareness
ELC-E Linux AwarenessELC-E Linux Awareness
ELC-E Linux Awareness
Peter Griffin
 
CS 626 - March : Capsicum: Practical Capabilities for UNIX
CS 626 - March : Capsicum: Practical Capabilities for UNIXCS 626 - March : Capsicum: Practical Capabilities for UNIX
CS 626 - March : Capsicum: Practical Capabilities for UNIX
ruchith
 

Ähnlich wie GPU Acceleration for Containers on Intel Processor Graphics (20)

Best practices for optimizing Red Hat platforms for large scale datacenter de...
Best practices for optimizing Red Hat platforms for large scale datacenter de...Best practices for optimizing Red Hat platforms for large scale datacenter de...
Best practices for optimizing Red Hat platforms for large scale datacenter de...
 
Quantifying Your World with AI & Docker on the Edge | OSCONF 2020 Jaipur
Quantifying Your World with AI & Docker  on the Edge | OSCONF 2020 JaipurQuantifying Your World with AI & Docker  on the Edge | OSCONF 2020 Jaipur
Quantifying Your World with AI & Docker on the Edge | OSCONF 2020 Jaipur
 
App container rkt
App container rktApp container rkt
App container rkt
 
Настройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'aНастройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'a
 
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driverKernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
 
Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in Docker
 
I Just Want to Run My Code: Waypoint, Nomad, and Other Things
I Just Want to Run My Code: Waypoint, Nomad, and Other ThingsI Just Want to Run My Code: Waypoint, Nomad, and Other Things
I Just Want to Run My Code: Waypoint, Nomad, and Other Things
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
 
ELC-E Linux Awareness
ELC-E Linux AwarenessELC-E Linux Awareness
ELC-E Linux Awareness
 
S0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cudaS0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cuda
 
Delivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devicesDelivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devices
 
Kubernetes - training micro-dragons without getting burnt
Kubernetes -  training micro-dragons without getting burntKubernetes -  training micro-dragons without getting burnt
Kubernetes - training micro-dragons without getting burnt
 
Accelerate your development with Docker
Accelerate your development with DockerAccelerate your development with Docker
Accelerate your development with Docker
 
Accelerate your software development with Docker
Accelerate your software development with DockerAccelerate your software development with Docker
Accelerate your software development with Docker
 
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUsDCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
 
GPU Programming with Java
GPU Programming with JavaGPU Programming with Java
GPU Programming with Java
 
Kubernetes Multitenancy Karl Isenberg - KubeCon NA 2019
Kubernetes Multitenancy   Karl Isenberg - KubeCon NA 2019Kubernetes Multitenancy   Karl Isenberg - KubeCon NA 2019
Kubernetes Multitenancy Karl Isenberg - KubeCon NA 2019
 
CS 626 - March : Capsicum: Practical Capabilities for UNIX
CS 626 - March : Capsicum: Practical Capabilities for UNIXCS 626 - March : Capsicum: Practical Capabilities for UNIX
CS 626 - March : Capsicum: Practical Capabilities for UNIX
 
The internals and the latest trends of container runtimes
The internals and the latest trends of container runtimesThe internals and the latest trends of container runtimes
The internals and the latest trends of container runtimes
 
Introduction to Docker and all things containers, Docker Meetup at RelateIQ
Introduction to Docker and all things containers, Docker Meetup at RelateIQIntroduction to Docker and all things containers, Docker Meetup at RelateIQ
Introduction to Docker and all things containers, Docker Meetup at RelateIQ
 

Mehr von LinuxCon ContainerCon CloudOpen China

Mehr von LinuxCon ContainerCon CloudOpen China (16)

SecurityPI - Hardening your IoT endpoints in Home.
SecurityPI - Hardening your IoT endpoints in Home. SecurityPI - Hardening your IoT endpoints in Home.
SecurityPI - Hardening your IoT endpoints in Home.
 
Scale Kubernetes to support 50000 services
Scale Kubernetes to support 50000 servicesScale Kubernetes to support 50000 services
Scale Kubernetes to support 50000 services
 
Flowchain: A case study on building a Blockchain for the IoT
Flowchain: A case study on building a Blockchain for the IoTFlowchain: A case study on building a Blockchain for the IoT
Flowchain: A case study on building a Blockchain for the IoT
 
Secure Containers with EPT Isolation
Secure Containers with EPT IsolationSecure Containers with EPT Isolation
Secure Containers with EPT Isolation
 
Open Source Software Business Models Redux
Open Source Software Business Models ReduxOpen Source Software Business Models Redux
Open Source Software Business Models Redux
 
Running Legacy Applications with Containers
Running Legacy Applications with ContainersRunning Legacy Applications with Containers
Running Legacy Applications with Containers
 
Introduction to OCI Image Technologies Serving Container
Introduction to OCI Image Technologies Serving ContainerIntroduction to OCI Image Technologies Serving Container
Introduction to OCI Image Technologies Serving Container
 
Rebuild - Simplifying Embedded and IoT Development Using Linux Containers
Rebuild - Simplifying Embedded and IoT Development Using Linux ContainersRebuild - Simplifying Embedded and IoT Development Using Linux Containers
Rebuild - Simplifying Embedded and IoT Development Using Linux Containers
 
From Resilient to Antifragile Chaos Engineering Primer
From Resilient to Antifragile Chaos Engineering PrimerFrom Resilient to Antifragile Chaos Engineering Primer
From Resilient to Antifragile Chaos Engineering Primer
 
See what happened with real time kvm when building real time cloud pezhang@re...
See what happened with real time kvm when building real time cloud pezhang@re...See what happened with real time kvm when building real time cloud pezhang@re...
See what happened with real time kvm when building real time cloud pezhang@re...
 
UEFI HTTP/HTTPS Boot
UEFI HTTP/HTTPS BootUEFI HTTP/HTTPS Boot
UEFI HTTP/HTTPS Boot
 
How Open Source Communities do Standardization
How Open Source Communities do StandardizationHow Open Source Communities do Standardization
How Open Source Communities do Standardization
 
Fully automated kubernetes deployment and management
Fully automated kubernetes deployment and managementFully automated kubernetes deployment and management
Fully automated kubernetes deployment and management
 
Is there still room for innovation in container orchestration and scheduling
Is there still room for innovation in container orchestration and scheduling Is there still room for innovation in container orchestration and scheduling
Is there still room for innovation in container orchestration and scheduling
 
Container Security
Container SecurityContainer Security
Container Security
 
Quickly Debug VM Failures in OpenStack
Quickly Debug VM Failures in OpenStackQuickly Debug VM Failures in OpenStack
Quickly Debug VM Failures in OpenStack
 

Kürzlich hochgeladen

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 

GPU Acceleration for Containers on Intel Processor Graphics

  • 1. 1 GPU Acceleration for Container on Intel Processor Graphics Zhenyu Wang zhenyuw@linux.intel.com
  • 2. 2 Agenda GPU Containers GPU Namespace GPU Control Group Integration with Container Runtime VM Containers Status
  • 3. 3 GPU Containers Video Delivery (Encode/Decode/Transcode) Cloud Graphics (Remote desktop/gaming) HPC/Machine Learning (AI, Analytic, …) E3-G Discrete GPU Fast provisioning High Density Little Overhead
  • 4. 4 What Makes a Container? Platform Host OS Namespaces Control groups App Libs App Libs App Libs Platform Hypervisor App Libs App Libs Guest OS App Libs Guest OS Bare Metal Containers Docker DockerDocker VM Containers (e.g. Clear Container)
  • 5. 5 Path to GPU Containers Bare Metal Containers VM Containers GPU namespaces GPU control groups GPU virtualization technology Integrate with container runtime
  • 6. 6 GPU Namespace: DRM Render Node DRM/i915 Card# RenderD# Xserver ComputeMedia Xcode Unprivileged No mode-setting Offscreen 3D Privileged Mode setting Global gem flink
  • 7. 7 GPU Namespace: DRM Render Node DRM/i915 RenderD# Media Xcode Libs Offscreen 3D Libs Compute Libs GPU Contexts Sharing happen with dmabuf only (no global gem object on card0) No need of creating another namespace Expose DRM render nodes to provide isolated views through device cgroup  e.g “docker run –device=/dev/dri/renderD128 –it debian” Per-process hardware context in GPU  Hardware managed render context switch
  • 8. 8 GPU CGroup CPU Mem BlkIO Net GPU container Container resource manager Cgroup controller container container container … Container-granule GPU resource control
  • 9. 9 GPU CGroup New subsystem: gpu_cgrp_subsys Control knobs  gpu.shares (% share of GPU cycles, work in progress)  gpu.priority (workload priority in the system)  gpu.memory (maximum GPU memory size) DRM/i915  Favor ‘shares/priority’ in workload scheduler  Enforce memory limitation (optionally for UMA graphics)
  • 10. 10 Container Runtime Integration { “linux”: { “devices”: [ { “path”: “/dev/dri/renderD128”, “type”: “c”, “major”: 226, “minor”: 128, “fileMode”: 432, “uid”: 0, “gid”: 0 } ], “resources”: { … “gpu”: { “memory”: <max_mem_in_bytes>, “prio”: <workload_priority> } } } RunC: an universal container runtime  Understand new gpu cgroup  Add new Linux resources config.json options for gpu  Runtime-tools helper to generate config stubs New Docker GPU control parameters e.g docker –gpu-mem=xxx –gpu-priority=xxx –gpu-share=xxx …
  • 11. 11 Image Portability Main challenge - library compatibility  User level libraries MUST match underlying kernel/HW Introduce Docker helper for image inspection  Compare image libraries with host environment  Reject to launch container upon any mismatch
  • 12. 12 VM GPU Containers DRM/i915 vGPU Media Xcode Libs GPU vGPU vGPU Intel GVT-g Guest OS Media Xcode Libs Guest OS Media Xcode Libs Guest OS CC VM CC VM CC VM Clear Container (CC) Intel graphics virtualization technology (Intel GVT-g) Assign vGPU to VM container In upstream Linux since v4.10 https://01.org/igvt-g
  • 13. 13 VM GPU Containers Nested containers in VM also have GPU access Need Improvement: Scalability  Only support 8vGPUs today due to resource partitioning  Need some enlightened way to further scale in the short term Boot time  Full guest graphics driver load takes >1.5s  Fast optimization to <0.5s in progress
  • 14. 14 Status  GPU cgroup PoC https://github.com/zhenyw/linux https://github.com/zhenyw/runc https://github.com/zhenyw/moby https://github.com/zhenyw/cli  GPU virtualization for Clear container https://clearlinux.org/features/intel%C2%AE-clear-containers https://github.com/01org/gvt-linux/wiki/Clear-Container-with-GVTg-Setup-Guide
  • 15. Q & A