C++ AMP V2
BOBY GEORGE
PROGRAM MANAGER, MICROSOFT CORP
HARNESSING HETEROGENEOUS SYSTEMS USING C++ AMP

Introduction

Updates
•Performance
•Productivity
•Portability

Future
2 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
Motivation
How to achieve performance without compromising productivity?
C++ AMP TIMELINE

2011
• Introduced at AMD Fusion Summit 11
• Announced C++ AMP open specification

2012
• C++ AMP V1 released
• Open Spec V1 released

2013
• C++ AMP V2 released
• Support in additional compilers
C++ ACCELERATED MASSIVE PARALLELISM (C++ AMP)
INTRODUCTION

• What is C++ AMP?
‒ Programming model for expressing data parallel algorithms
‒ Exploit heterogeneous systems using mainstream tools
‒ Just C++ code, consisting of language extensions and libraries

• What does C++ AMP give you?
‒ Productivity: Write C++ code that runs on heterogeneous systems
‒ Portability: Write code once and run on various hardware/platforms
‒ Performance: Write C++ code that is massively accelerated

[Diagram labels: C++ Data Parallelism; Microsoft C++ AMP]

CODE INTRODUCTION
SEQUENTIAL C++ CODE
#include <iostream>

int main()
{
    // Each value is one less than the character code it should print as.
    int v[11] = {'G', 'd', 'k', 'k', 'n', 31, 'v', 'n', 'q', 'k', 'c'};

    // Sequential loop: increment every element in turn.
    for (int idx = 0; idx < 11; idx++)
    {
        v[idx] += 1;
    }

    // Print the result as characters.
    for (unsigned int i = 0; i < 11; i++)
        std::cout << static_cast<char>(v[i]);
}
CODE INTRODUCTION
C++ AMP CODE
#include <iostream>
#include <amp.h>
using namespace concurrency;

int main()
{
    int v[11] = {'G', 'd', 'k', 'k', 'n', 31, 'v', 'n', 'q', 'k', 'c'};

    // Wrap the host array so the accelerator can operate on it.
    array_view<int> av(11, v);

    // Run the lambda once per index in av.extent on the accelerator.
    parallel_for_each(av.extent, [=](index<1> idx) restrict(amp)
    {
        av[idx] += 1;
    });

    // Reading through av on the host sees the updated data.
    for (unsigned int i = 0; i < 11; i++)
        std::cout << static_cast<char>(av[i]);
}


Concept Count (5)
‒ array_view: wraps the data to operate on the accelerator. array_view variables are captured, and their associated data is copied to the accelerator on demand
‒ parallel_for_each: executes the lambda on the accelerator once per thread
‒ extent: the parallel loop bounds or computation “shape”
‒ index: the ID of the thread that is running the lambda, used to index into data
‒ restrict(amp): tells the compiler to check that the code conforms to the C++ subset, and tells the compiler to target the GPU
MAPPING TO HARDWARE

[Diagram labels: Data Parallelism; Vector Lanes; GPU; Multicore CPU]

DEMO
SO WHO IS USING IT?

Updates
PERFORMANCE

• Support for Shared Memory Architecture in Visual Studio 2013

[Chart: execution time in milliseconds vs. problem size in multiples of 1024]
PERFORMANCE / PRODUCTIVITY
• Enhanced Texture Functionality in Visual Studio 2013
‒ Already used to develop portable 3D Face Scanner

PRODUCTIVITY
• Tooling Updates
‒ Side-by-side CPU/GPU debugging for WARP accelerator
‒ C++ AMP GPU debugging on Windows 7 and Server 2008 R2
‒ Remote GPU hardware debugging on NVIDIA GPUs

• Runtime/Library Updates
‒ array_view API improvements
‒ C++ AMP runtime improvements like faster texture copying
‒ Added scan algorithms to C++ AMP Algorithms Library

PORTABILITY
• C++ AMP is a high-level language

• Announcing…
‒ C++ AMP support in Clang
  ‒ Via LLVM targeting HSAIL & Khronos SPIR 1.2
  ‒ AMD is the project sponsor
  ‒ Attend Ben Sander’s talk for more details
‒ Objectives
  ‒ Offers consistent C++ AMP programming model across hardware and platforms
  ‒ Open source work to seed additional support on other compilers and hardware
‒ Microsoft’s Engagement
  ‒ Collaboration with AMD for design and validation inputs
  ‒ Preview bits @ https://bitbucket.org/multicoreware/cppamp-driver/

• Visual Studio will continue to offer premier C++ AMP dev experience

[Diagram labels: C++ AMP; DirectCompute; Khronos SPIR 1.2; HSAIL; Hardware]
PORTABILITY
• Announcing PathScale ENZO 2014
‒ Targets NVIDIA hardware directly for higher performance
‒ Plans to target AMD hardware and Windows platform
‒ Currently in Private Beta testing phase

• Complete the picture…

[Diagram labels: C++ AMP; Your favorite compiler; DirectCompute; Khronos SPIR 1.2; HSAIL; Native Code Generation; Hardware]
Future
C++ AMP GROWTH CHART

[Chart: Performance, Productivity, and Portability across VS 2012, VS 2013, VS Next, and End Goal]
PERFORMANCE
• Support Shared Virtual Memory Architectures

• More performant CPU accelerator

• Convergence of CPU/GPU parallelization technology

PRODUCTIVITY
• Convergence of platforms
‒ Write code once and run across multiple platforms

• Enhanced tooling support
• Continue to invest in parallel algorithms

PORTABILITY
• ISO standardization of C++ AMP features like multidimensional arrays, extent, etc.
• Update Open Specification to latest version of C++ AMP in Visual Studio
‒ Open Specification v1.2 to be released by November 2013

• Engage with partners for C++ AMP implementation on non-Microsoft technologies

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software
changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD
reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of
such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names
are for informational purposes only and may be trademarks of their respective owners.

PT-4056, Harnessing Heterogeneous Systems Using C++ AMP – How the Story is Evolving, by Boby George

  • 1. C++ AMP V2 BOBY GEORGE PROGRAM MANAGER, MICROSOFT CORP
  • 2. HARNESSING HETEROGENEOUS SYSTEMS USING C++ AMP Introduction Updates •Performance •Productivity •Portability Future 2 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
  • 3. Motivation How to achieve performance without compromising productivity?
  • 4. C++ AMP TIMELINE 2011 • Introduced @ AMD Fusion Summit 11 • Announced C++ AMP open specification 4 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL 2012 • C++ AMP V1 released • Open Spec V1 released 2013 • C++ AMP V2 released • Support in additional compilers
  • 5. C++ ACCELERATED MASSIVE PARALLELISM (C++ AMP) INTRODUCTION  What is C++ AMP? ‒ Programming model for expressing data parallel algorithms ‒ Exploit heterogeneous systems using mainstream tools ‒ Just C++ code, consisting of a language extensions and libraries  What C++ AMP gives you? ‒ Productivity: Write C++ code that runs on heterogeneous systems ‒ Portability: Write code once and run on various hardwareplatforms ‒ Performance: Write C++ code that accelerate massively C++ Data Parallelism 5 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL Microsoft C++ AMP
  • 6. CODE INTRODUCTION SEQUENTIAL C++ CODE 1. #include <iostream> 2. 3. 4. int main() 5. { 6. int v[11] = {'G', 'd', 'k', 'k', 'n', 31, 'v', 'n', 'q', 'k', 'c'}; 7. 8. 9. 10. 11. for (int idx = 0; idx < 11; idx++) { v[idx] += 1; } 12. for(unsigned int i = 0; i < 11; i++) 13. std::cout << static_cast<char>( v[i]); 14. } 6 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
  • 7. CODE INTRODUCTION C++ AMP CODE 1. #include <iostream> 2. #include <amp.h> 3. using namespace concurrency; 4. int main() 5. { 6. int v[11] = {'G', 'd', 'k', 'k', 'n', 31, 'v', 'n', 'q', 'k', 'c'}; 7. 8. 9. 10. 11. array_view<int> av(11, v); parallel_for_each(av.extent, [=](index<1> idx) restrict(amp) { av[idx] += 1; }); 12. for(unsigned int i = 0; i < 11; i++) 13. std::cout << static_cast<char>(av[i]); 14. } 7 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL Concept Count (5) array_view: wraps the data to operate on the accelerator. array_view variables captured and associated data copied to accelerator (on demand) parallel_for_each: execute the lambda on the accelerator once per thread extent: the parallel loop bounds or computation “shape” index: the thread ID that is running the lambda, used to index into data restrict(amp): tells the compiler to check that code conforms to C++ subset, and tells compiler to target GPU
  • 8. MAPPING TO HARDWARE Vector Lanes GPU Multicore CPU Data Parallelism 8 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
  • 10. SO WHO IS USING IT? 10 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
  • 12. PERFORMANCE
      • Support for Shared Memory Architecture in Visual Studio 2013
      [Chart axes: problem size in multiples of 1024; execution time in milliseconds]
  • 13. PERFORMANCE PRODUCTIVITY
      • Enhanced Texture Functionality in Visual Studio 2013
          ‒ Already used to develop a portable 3D face scanner
  • 14. PRODUCTIVITY
      • Tooling Updates
          ‒ Side-by-side CPU/GPU debugging for the WARP accelerator
          ‒ C++ AMP GPU debugging on Windows 7 and Server 2008 R2
          ‒ Remote GPU hardware debugging on NVIDIA GPUs
      • Runtime/Library Updates
          ‒ array_view API improvements
          ‒ C++ AMP runtime improvements, such as faster texture copying
          ‒ Added scan algorithms to the C++ AMP Algorithms Library
  • 15. PORTABILITY
      • C++ AMP is a high-level language
      • Announcing…
          ‒ C++ AMP support in Clang
              ‒ Via LLVM, targeting HSAIL and Khronos SPIR 1.2
              ‒ AMD is the project sponsor
              ‒ Attend Ben Sander’s talk for more details
          ‒ Objectives
              ‒ Offers a consistent C++ AMP programming model across hardware and platforms
              ‒ Open-source work to seed additional support in other compilers and on other hardware
          ‒ Microsoft’s engagement
              ‒ Collaboration with AMD for design and validation inputs
              ‒ Preview bits at https://bitbucket.org/multicoreware/cppamp-driver/
      • Visual Studio will continue to offer the premier C++ AMP dev experience
      [Diagram labels: C++ AMP; DirectCompute; Khronos SPIR 1.2; HSAIL; Hardware]
  • 16. PORTABILITY
      • Announcing PathScale ENZO 2014
          ‒ Targets NVIDIA hardware directly for higher performance
          ‒ Plans to target AMD hardware and the Windows platform
          ‒ Currently in private beta testing
      • Complete the picture…
      [Diagram labels: Your favorite compiler; C++ AMP; DirectCompute; Khronos SPIR 1.2; HSAIL; Native Code Generation; Hardware]
  • 18. C++ AMP GROWTH CHART
      [Chart labels: Performance, Productivity, Portability; VS 2012, VS 2013, VS Next, End Goal]
  • 19. PERFORMANCE
      • Support for Shared Virtual Memory architectures
      • More performant CPU accelerator
      • Convergence of CPU/GPU parallelization technology
  • 20. PRODUCTIVITY
      • Convergence of platforms
          ‒ Write code once and run it across multiple platforms
      • Enhanced tooling support
      • Continue to invest in parallel algorithms
  • 21. PORTABILITY
      • ISO standardization of C++ AMP features such as multidimensional arrays, extent, etc.
      • Update the Open Specification to the latest version of C++ AMP in Visual Studio
          ‒ Open Specification v1.2 to be released by November 2013
      • Engage with partners for C++ AMP implementations on non-Microsoft technologies
  • 22. DISCLAIMER & ATTRIBUTION
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
      The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
      AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
      AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
      ATTRIBUTION
      © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.