SlideShare ist ein Scribd-Unternehmen logo
1 von 10
© The Khronos® Group Inc. 2020 - Page 1
This work is licensed under a Creative Commons Attribution 4.0 International License
Machine Learning in
Khronos | Vulkan
Pierre Boudier
NVIDIA
January 2021
© The Khronos® Group Inc. 2020 - Page 2
This work is licensed under a Creative Commons Attribution 4.0 International License
Objectives of Vulkan Machine Learning (ML)
• Enable native Vulkan application to use ML with low latency and overhead
- Avoid interop, or to embed very large third-party frameworks (python)
• There are many common blocks in Neural Nets:
- Matrix multiplication
- Convolution
- Tanh/Sigmoid/ReLU
- …
• But the eco system is actually diverse and new mathematical approaches are introduced
every week:
- Locality sensitive hashing, binary weights, sparse matrices, …
• Vulkan should not limit itself to current architectures
- Luckily, Vulkan already has a compute shader abstraction !
© The Khronos® Group Inc. 2020 - Page 3
This work is licensed under a Creative Commons Attribution 4.0 International License
Training Data
Trained
Networks
NNEF/ONNX
Neural Network
Training
Inferencing Acceleration
GPU
Compiled
Code
Hardware Acceleration APIs
Accélération Hardware
(GPUs)
Applications link to
compiled inferencing code
or dynamically generate it
Networks trained on high-end
desktop and cloud systems
© The Khronos® Group Inc. 2020 - Page 4
This work is licensed under a Creative Commons Attribution 4.0 International License
Data type extensions
• By default, the 32 bit floating precision is used for both training and
inferencing, which are basically just running a computational graph:
- Training runs a forward pass, and often times a backward pass to propagate
back the gradient
- Inferencing is just about doing the forward pass
• But both can be done in lower precision types for faster compute time and
reduced data storage
- The following extensions are available in vulkan 1.2:
- VK_KHR_shader_float16_int8
- VK_KHR_8bit_storage / VK_KHR_16bit_storage
- 8 bit integers data types are used for quantized Neural Nets
- FP16 data types can be used for faster math with gradient rescaling in training
- Upcoming: SPV_KHR_integer_dot_product
- Take advantage of fast integer math without implicit compiler peephole optimization
for quantized neural networks
© The Khronos® Group Inc. 2020 - Page 5
This work is licensed under a Creative Commons Attribution 4.0 International License
Improved Compute Shader
• New extensions are devised to improve efficiency
- Upcoming:
- VK_KHR_workgroup_memory_explicit_layout
- Allow more efficient data loading into shared memory for further use with efficient
matrix multiplication operations.
- VK_EXT_ML_primitives
- Exposes basic primitives used in the main stream Neural Nets as optimized building
blocks
- Available:
- VK_NV_cooperative_matrix (NVIDIA)
- Exposes high throughput matrix/vector multiplication hardware units
- Typically used be convolution / matmul layer in fp16 formats
© The Khronos® Group Inc. 2020 - Page 6
This work is licensed under a Creative Commons Attribution 4.0 International License
Import Formats
Caffe, Keras,
MXNet, ONNX
TensorFlow Graph,
PyTorch, ONNX
Front-end / IR NNVM / Relay IR XLA HLO
Output SPIR-V SPIR-V / vulkan API
Primary Machine Learning Compilers
© The Khronos® Group Inc. 2020 - Page 7
This work is licensed under a Creative Commons Attribution 4.0 International License
Google MLIR and IREE Compilers
MLIR
Multi-level Intermediate Representation
Format and library of compiler utilities that
sits between the trained model representation
and low-level compilers/executors that
generate hardware-specific code
IREE
Intermediate Representation
Execution Environment
Lowers and optimizes ML models for real-time
accelerated inferencing on mobile/edge
heterogeneous hardware
Contains scheduling logic to communicate
data dependencies to low-level parallel
pipelined hardware/APIs like Vulkan,
and execution logic to encode dense
computation in the form of hardware/API-
specific binaries like SPIR-V
IREE is a research project today. Google is working with Khronos
working groups to explore how SPIR-V code can provide effective
inferencing acceleration on APIs such as Vulkan
Trained Models
Generate Hardware
Specific Binaries
Optimizes and Lowers
for Acceleration
© The Khronos® Group Inc. 2020 - Page 8
This work is licensed under a Creative Commons Attribution 4.0 International License
Third Party frameworks
Alibaba Tencent Ax inc.
MNN NCNN Ailia
https://github.com/alibaba/M
NN
https://github.com/Tencent/nc
nn
https://axinc.jp/en/
Lightweight frameworks for
inferencing and training for
mobile platforms
Cross platform optimized
inferencing framework
SDK for optimized running of
many popular neural networks
on multiple platform
© The Khronos® Group Inc. 2020 - Page 9
This work is licensed under a Creative Commons Attribution 4.0 International License
Non Neural Net Machine Learning
Jülich Ethical ML UNITY
VkFFT Kompute ML agents
https://github.com/DTolm/VkF
FT
https://github.com/EthicalML/
vulkan-kompute
https://github.com/Unity-
Technologies/ml-agents
Accelerated fast fourrier
transform library, which can be
used for image resampling,
image registration, signal
processing …
General purpose GPU compute
framework cross graphics card
Open source project that
enables simulation for training
intelligent agents via the Unity
engine
© The Khronos® Group Inc. 2020 - Page 10
This work is licensed under a Creative Commons Attribution 4.0 International License
Call To Action
• The Vulkan Machine Learning Subgroup is welcoming feedback:
- What functionality would you like to see in Vulkan to accelerate your ML needs ?
- Neural Nets
- Other algorithms:
- random forest
- logistic regression
- K-nearest neighbor
- T-SNE
- …
- Do you have existing product using Vulkan for ML:
- Do you want to get in touch with hardware vendors ?
- Do you have suggestion on how to improve your pain points?
• Contact us at: pboudier@nvidia.com
• Help us and join the Vulkan Advisory Panel, or become a Khronos member

Weitere ähnliche Inhalte

Was ist angesagt?

20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
IBM France Lab
 
Open Stack Cloud Services
Open Stack Cloud ServicesOpen Stack Cloud Services
Open Stack Cloud Services
Saurabh Gupta
 
M9 cloud & open source
M9 cloud & open sourceM9 cloud & open source
M9 cloud & open source
Josep Bardallo
 
Presentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePresentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobre
PRAGMA PROGETTI
 

Was ist angesagt? (20)

NFV features in kubernetes
NFV features in kubernetesNFV features in kubernetes
NFV features in kubernetes
 
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P..."Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
 
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
 
Construire une « data fabric » pour les environnements edge
Construire une « data fabric » pour les environnements edgeConstruire une « data fabric » pour les environnements edge
Construire une « data fabric » pour les environnements edge
 
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
 
Shree duth awasthi_cv
Shree duth awasthi_cvShree duth awasthi_cv
Shree duth awasthi_cv
 
Container Native Development Tools - Talk by Mickey Boxell
Container Native Development Tools - Talk by Mickey BoxellContainer Native Development Tools - Talk by Mickey Boxell
Container Native Development Tools - Talk by Mickey Boxell
 
Shree_Duth_Awasthi_Resume
Shree_Duth_Awasthi_ResumeShree_Duth_Awasthi_Resume
Shree_Duth_Awasthi_Resume
 
PKS is Not JAK8sP (Just Another Kubernetes Platform)
PKS is Not JAK8sP (Just Another Kubernetes Platform)PKS is Not JAK8sP (Just Another Kubernetes Platform)
PKS is Not JAK8sP (Just Another Kubernetes Platform)
 
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
 
Open Stack Cloud Services
Open Stack Cloud ServicesOpen Stack Cloud Services
Open Stack Cloud Services
 
P to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to ContainersP to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to Containers
 
M9 cloud & open source
M9 cloud & open sourceM9 cloud & open source
M9 cloud & open source
 
cncf overview and building edge computing using kubernetes
cncf overview and building edge computing using kubernetescncf overview and building edge computing using kubernetes
cncf overview and building edge computing using kubernetes
 
Watson on bluemix
Watson on bluemixWatson on bluemix
Watson on bluemix
 
Presentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePresentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobre
 
Considering Bare Metal
Considering Bare MetalConsidering Bare Metal
Considering Bare Metal
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center Platforms
 
Apache DeviceMap - ApacheCon Europe 2014
Apache DeviceMap - ApacheCon Europe 2014Apache DeviceMap - ApacheCon Europe 2014
Apache DeviceMap - ApacheCon Europe 2014
 

Ähnlich wie Vulkan ML Japan Virtual Open House Feb 2021

"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
Edge AI and Vision Alliance
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
Linaro
 

Ähnlich wie Vulkan ML Japan Virtual Open House Feb 2021 (20)

"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option..."APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
 
OpenStackDay - XIFI Federation
OpenStackDay - XIFI FederationOpenStackDay - XIFI Federation
OpenStackDay - XIFI Federation
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
 
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
 
Enabling NFV features in kubernetes
Enabling NFV features in kubernetesEnabling NFV features in kubernetes
Enabling NFV features in kubernetes
 
IBM: The Linux Ecosystem
IBM: The Linux EcosystemIBM: The Linux Ecosystem
IBM: The Linux Ecosystem
 
OpenStack - An Overview
OpenStack - An OverviewOpenStack - An Overview
OpenStack - An Overview
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
 
Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace Architectures
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
The Future of Networks is Open...Source
The Future of Networks is Open...SourceThe Future of Networks is Open...Source
The Future of Networks is Open...Source
 
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayStrata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
 
Rapyuta a cloud robotics platform
Rapyuta a cloud robotics platformRapyuta a cloud robotics platform
Rapyuta a cloud robotics platform
 
Learn .NET Core - Introduction
Learn .NET Core - IntroductionLearn .NET Core - Introduction
Learn .NET Core - Introduction
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
 
Built Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfBuilt Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdf
 
IoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
IoTWorld 2016 OSS Keynote Param Singh, Ian SkerrettIoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
IoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
 
Parallel universe-issue-29
Parallel universe-issue-29Parallel universe-issue-29
Parallel universe-issue-29
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
 

Mehr von The Khronos Group Inc.

Mehr von The Khronos Group Inc. (20)

OpenXR 1.0 Reference Guide
OpenXR 1.0 Reference GuideOpenXR 1.0 Reference Guide
OpenXR 1.0 Reference Guide
 
Vulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationVulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP Translation
 
Vulkan ML JP Translation
Vulkan ML JP TranslationVulkan ML JP Translation
Vulkan ML JP Translation
 
OpenCL Overview JP Translation
OpenCL Overview JP TranslationOpenCL Overview JP Translation
OpenCL Overview JP Translation
 
glTF overview JP Translation
glTF overview JP TranslationglTF overview JP Translation
glTF overview JP Translation
 
Khronos Overview JP Translation
Khronos Overview JP TranslationKhronos Overview JP Translation
Khronos Overview JP Translation
 
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
 
OpenCL 3.0 Reference Guide
OpenCL 3.0 Reference GuideOpenCL 3.0 Reference Guide
OpenCL 3.0 Reference Guide
 
OpenVX 1.3 Reference Guide
OpenVX 1.3 Reference GuideOpenVX 1.3 Reference Guide
OpenVX 1.3 Reference Guide
 
OpenXR 0.90 Overview Guide
OpenXR 0.90 Overview GuideOpenXR 0.90 Overview Guide
OpenXR 0.90 Overview Guide
 
Vulkan 1.1 Reference Guide
Vulkan 1.1 Reference GuideVulkan 1.1 Reference Guide
Vulkan 1.1 Reference Guide
 
SYCL 1.2.1 Reference Card
SYCL 1.2.1 Reference CardSYCL 1.2.1 Reference Card
SYCL 1.2.1 Reference Card
 
OpenCL 2.2 Reference Guide
OpenCL 2.2 Reference GuideOpenCL 2.2 Reference Guide
OpenCL 2.2 Reference Guide
 
OpenGL 4.6 Reference Guide
OpenGL 4.6 Reference GuideOpenGL 4.6 Reference Guide
OpenGL 4.6 Reference Guide
 
glTF 2.0 Reference Guide
glTF 2.0 Reference GuideglTF 2.0 Reference Guide
glTF 2.0 Reference Guide
 
OpenVX 1.2 Reference Guide
OpenVX 1.2 Reference GuideOpenVX 1.2 Reference Guide
OpenVX 1.2 Reference Guide
 
WebGL 2.0 Reference Guide
WebGL 2.0 Reference GuideWebGL 2.0 Reference Guide
WebGL 2.0 Reference Guide
 
OpenGL SC 2.0 Quick Reference
OpenGL SC 2.0 Quick ReferenceOpenGL SC 2.0 Quick Reference
OpenGL SC 2.0 Quick Reference
 
OpenVX 1.1 Reference Guide
OpenVX 1.1 Reference GuideOpenVX 1.1 Reference Guide
OpenVX 1.1 Reference Guide
 
Vulkan 1.0 Quick Reference
Vulkan 1.0 Quick ReferenceVulkan 1.0 Quick Reference
Vulkan 1.0 Quick Reference
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 

Vulkan ML Japan Virtual Open House Feb 2021

  • 1. © The Khronos® Group Inc. 2020 - Page 1 This work is licensed under a Creative Commons Attribution 4.0 International License Machine Learning in Khronos | Vulkan Pierre Boudier NVIDIA January 2021
  • 2. © The Khronos® Group Inc. 2020 - Page 2 This work is licensed under a Creative Commons Attribution 4.0 International License Objectives of Vulkan Machine Learning (ML) • Enable native Vulkan application to use ML with low latency and overhead - Avoid interop, or to embed very large third-party frameworks (python) • There are many common blocks in Neural Nets: - Matrix multiplication - Convolution - Tanh/Sigmoid/ReLU - … • But the eco system is actually diverse and new mathematical approaches are introduced every week: - Locality sensitive hashing, binary weights, sparse matrices, … • Vulkan should not limit itself to current architectures - Luckily, Vulkan already has a compute shader abstraction !
  • 3. © The Khronos® Group Inc. 2020 - Page 3 This work is licensed under a Creative Commons Attribution 4.0 International License Training Data Trained Networks NNEF/ONNX Neural Network Training Inferencing Acceleration GPU Compiled Code Hardware Acceleration APIs Accélération Hardware (GPUs) Applications link to compiled inferencing code or dynamically generate it Networks trained on high-end desktop and cloud systems
  • 4. © The Khronos® Group Inc. 2020 - Page 4 This work is licensed under a Creative Commons Attribution 4.0 International License Data type extensions • By default, the 32 bit floating precision is used for both training and inferencing, which are basically just running a computational graph: - Training runs a forward pass, and often times a backward pass to propagate back the gradient - Inferencing is just about doing the forward pass • But both can be done in lower precision types for faster compute time and reduced data storage - The following extensions are available in vulkan 1.2: - VK_KHR_shader_float16_int8 - VK_KHR_8bit_storage / VK_KHR_16bit_storage - 8 bit integers data types are used for quantized Neural Nets - FP16 data types can be used for faster math with gradient rescaling in training - Upcoming: SPV_KHR_integer_dot_product - Take advantage of fast integer math without implicit compiler peephole optimization for quantized neural networks
  • 5. © The Khronos® Group Inc. 2020 - Page 5 This work is licensed under a Creative Commons Attribution 4.0 International License Improved Compute Shader • New extensions are devised to improve efficiency - Upcoming: - VK_KHR_workgroup_memory_explicit_layout - Allow more efficient data loading into shared memory for further use with efficient matrix multiplication operations. - VK_EXT_ML_primitives - Exposes basic primitives used in the main stream Neural Nets as optimized building blocks - Available: - VK_NV_cooperative_matrix (NVIDIA) - Exposes high throughput matrix/vector multiplication hardware units - Typically used be convolution / matmul layer in fp16 formats
  • 6. © The Khronos® Group Inc. 2020 - Page 6 This work is licensed under a Creative Commons Attribution 4.0 International License Import Formats Caffe, Keras, MXNet, ONNX TensorFlow Graph, PyTorch, ONNX Front-end / IR NNVM / Relay IR XLA HLO Output SPIR-V SPIR-V / vulkan API Primary Machine Learning Compilers
  • 7. © The Khronos® Group Inc. 2020 - Page 7 This work is licensed under a Creative Commons Attribution 4.0 International License Google MLIR and IREE Compilers MLIR Multi-level Intermediate Representation Format and library of compiler utilities that sits between the trained model representation and low-level compilers/executors that generate hardware-specific code IREE Intermediate Representation Execution Environment Lowers and optimizes ML models for real-time accelerated inferencing on mobile/edge heterogeneous hardware Contains scheduling logic to communicate data dependencies to low-level parallel pipelined hardware/APIs like Vulkan, and execution logic to encode dense computation in the form of hardware/API- specific binaries like SPIR-V IREE is a research project today. Google is working with Khronos working groups to explore how SPIR-V code can provide effective inferencing acceleration on APIs such as Vulkan Trained Models Generate Hardware Specific Binaries Optimizes and Lowers for Acceleration
  • 8. © The Khronos® Group Inc. 2020 - Page 8 This work is licensed under a Creative Commons Attribution 4.0 International License Third Party frameworks Alibaba Tencent Ax inc. MNN NCNN Ailia https://github.com/alibaba/M NN https://github.com/Tencent/nc nn https://axinc.jp/en/ Lightweight frameworks for inferencing and training for mobile platforms Cross platform optimized inferencing framework SDK for optimized running of many popular neural networks on multiple platform
  • 9. © The Khronos® Group Inc. 2020 - Page 9 This work is licensed under a Creative Commons Attribution 4.0 International License Non Neural Net Machine Learning Jülich Ethical ML UNITY VkFFT Kompute ML agents https://github.com/DTolm/VkF FT https://github.com/EthicalML/ vulkan-kompute https://github.com/Unity- Technologies/ml-agents Accelerated fast fourrier transform library, which can be used for image resampling, image registration, signal processing … General purpose GPU compute framework cross graphics card Open source project that enables simulation for training intelligent agents via the Unity engine
  • 10. © The Khronos® Group Inc. 2020 - Page 10 This work is licensed under a Creative Commons Attribution 4.0 International License Call To Action • The Vulkan Machine Learning Subgroup is welcoming feedback: - What functionality would you like to see in Vulkan to accelerate your ML needs ? - Neural Nets - Other algorithms: - random forest - logistic regression - K-nearest neighbor - T-SNE - … - Do you have existing product using Vulkan for ML: - Do you want to get in touch with hardware vendors ? - Do you have suggestion on how to improve your pain points? • Contact us at: pboudier@nvidia.com • Help us and join the Vulkan Advisory Panel, or become a Khronos member

Hinweis der Redaktion

  1. 1. Hi everyone – I am Neil Trevett - I work on developer ecosystems at NVIDA and I am President of the Khronos Group. I am going to give the latest updates on Khronos’ open standards for vision and inferencing acceleration.
  2. 5. Here we see how Khronos standards can be used in a typical mobile or embedded application needing vision or inferencing acceleration. Neural network training typically occurs on high-powered desktops or servers at high precision, using large training data sets. The resulting trained network, encapsulated in an NNEF file, can be imported into the target system in two ways: either ingested into an inferencing engine such as OpenVX, or passed into a machine learning compiler to generate code that executes inferencing operations. Either the compiled code or the inferencing engine API is linked into the application, which may also use the SYCL compiler to generate additional parallel code if the application is written in C++. Code generated by all these methods can target the lower level OpenCL API to portably reach into diverse hardware processors.   Let’s look at each of these Khronos standards in a little more detail – starting with NNEF.
  3. 16. In addition to inferencing engines, there is active research into machine learning compilers from companies such as Amazon with TVM, Facebook with Glow and Google with XLA.