ONNX – the emerging standard for interoperable and optimized AI inference and training. A graduated project of Linux Foundation Artificial Intelligence: best-practice open source with true multi-vendor open governance in a foundation.
ONNX: Past, Present, and Future
… with use cases
Jim Spohrer (IBM), Prasanth Pulavarthi (Microsoft)
Monday June 29, 2020 11:30am - 12:20pm Central Time
Introduction to ONNX: Past, Present, Future
Jim Spohrer (IBM) – Director, Cognitive Opentech Group/CODAIT
Linux Foundation AI – Technical Advisory Council Chairperson
ONNX Steering Committee Member
ONNX in Practice: Why we use it and how you can too
Prasanth Pulavarthi (Microsoft) – Principal Program Manager, AI Platform
ONNX Steering Committee Member, ONNX Co-Founder
Introduction to ONNX: Past, Present, and Future
› Past: Why ONNX?
› Quick Review: ONNX Website Tour
› Present: Growth of Community and Tools
› Quick Review: April 2020 ONNX Community Meeting
› Future: You, AI Landscape and ONNX!
› Quick Review: Call to Action
April 2020 Community Meeting
› Welcome and Updates
› IBM Chief Data Office
› Huawei Mindspore
› Microsoft Runtime Optimizations
› Xilinx FINN
› UC Santa Cruz Genomics
› Microsoft Azure OCR
› SIGs and Working Groups
› Some Highlights
Common problems impacting ML productivity
• Inference latency is too high to put into production
• Training in Python but need to deploy into a C#/C++/Java app
• Model needs to run on edge/IoT devices
• Same model needs to run on different hardware and operating systems
• Need to support running models created in several different frameworks
• (more recently) Training very large models takes too long
Training framework → Deployment target
Improving ML deployment productivity
• Freedom to use the tool of your choice
• Strong performance and compatibility with platforms
Speech with ONNX Runtime
Speech models power a vast landscape of products and services at Microsoft. From Office and Cortana to Xbox and Bing, these ML models serve hundreds of millions of requests a month. ONNX Runtime powers inferencing for Speech at high scale in production environments including Cognitive Services, on-premises solutions, and microservices.
10x reduction in time to productize new models
10% accuracy and latency improvements
Azure Kinect with ONNX Runtime
Body Tracking SDK is installed on your PC
Reduced the first-frame processing time by 7.8x on the GTX 1070 with ONNX Runtime and the CUDA execution provider.
Body Tracking SDK
The Azure Kinect developer kit tracks bodies in 3D with advanced AI sensors that use sophisticated computer vision and speech models.
IoT scenario (in progress)
Body Tracking SDK is installed on Jetson TX2 (ARM CPU + NVIDIA GPU)
WindowsML with ONNX Runtime
• Windows API for machine learning
• Input models are ONNX models, for broad framework support
• ONNX Runtime as the underlying engine
Azure Customers with ONNX Runtime
• Uses AI for economic scenario modeling
• Trains models in Python with scikit-learn and PyTorch, but the production environment is pure C#
• ONNX Runtime with its C# API was a good fit (bonus: 2x speedup)
› Hugging Face provides popular transformer models, like BERT, GPT-2, etc.
› Can be trained with either PyTorch or TensorFlow
› The Hugging Face module transformers.convert_graph_to_onnx exports ONNX models
› ONNX Runtime does inferencing with a speedup whether you are using CPU or GPU
New: Transformer Training
› Integrates with PyTorch (and TensorFlow) to accelerate training and fine-tuning of large transformer models
› Incorporates the latest algorithms and techniques, such as DeepSpeed/ZeRO
› Used by Office, Visual Studio, and others at Microsoft
› Available as preview now
ONNX has open governance
› Annual Steering Committee election
› Technical decisions made by SIGs and Working Groups
› All meetings open to everyone
› Calendar: https://onnx.ai/calendar
› GitHub: https://github.com/onnx
› Gitter: https://gitter.im/onnx
› Mailing List: https://lists.lfai.foundation/g/onnx-announce
Q & A
Editor's notes
ONNX: Past, Present, and Future
ONNX is now a graduated project in Linux Foundation AI. Are you a developer looking to operationalize machine learning models from different sources without compromising performance? Are you a data scientist who wishes there was a way to use the machine learning framework you want without worrying about how to deploy it to a variety of end points on cloud and edge? We'll describe ONNX, which provides a common format supported by many popular AI frameworks and hardware. Learn about ONNX and its core concepts and find out how to create ONNX models using frameworks like TensorFlow, PyTorch, and SciKit-Learn. We'll explain how to deploy models to cloud or edge using the high-performance, cross-platform ONNX Runtime, which leverages accelerators like NVIDIA TensorRT. Come learn how ONNX is being used in other LF AI projects, as well as about ONNX Work Groups that you can participate in. Finally, this talk will include case studies of Microsoft teams improving latency and reducing costs, thanks to ONNX.
IBM – Cognitive Opentech Group
Jim Spohrer directs IBM's open source Artificial Intelligence developer ecosystem effort. He led IBM Global University Programs, co-founded Almaden Service Research, and was CTO of the Venture Capital Group. After his MIT BS in Physics, he developed speech recognition systems at Verbex (Exxon) before receiving his Yale PhD in Computer Science/AI. In the 1990s, he attained Apple Computer's Distinguished Engineer, Scientist, and Technologist role for next-generation learning platforms. With over ninety publications and nine patents, he received the Gummesson Service Research award, the Vargo and Lusch Service-Dominant Logic award, and the Daniel Berg Service Systems award, and was named a PICMET Fellow for advancing service science.
Microsoft – AI Platform
Prasanth Pulavarthi is Principal Program Manager, AI Platform, and Co-Founder of ONNX. With nearly two decades of leading software development projects, Prasanth has an MS in Computer Science from Stanford. ONNX is now a graduated project in Linux Foundation Artificial Intelligence.
Monday June 29, 2020 11:30am - 12:20pm Conference Room 10
AI/ML/DL hosted by LF AI, Machine and Deep Learning (Framework, Libraries, Platform, Tools)
Skill Level Entry level
What is ONNX? Open Neural Network Exchange Advantages
Jagreet Kaur Gill | Data Science | May 07, 2019
Here are some of the popular frameworks that support conversion to ONNX. For some, like PyTorch, ONNX export is built in natively; for others, like TensorFlow or Keras, separate installable packages handle the conversion. Support is available for popular models, including object detection models such as Mask R-CNN and Faster R-CNN, speech models, and NLP models including BERT and Transformers.