Modern desktop computers have more compute capabilities than ever before. Most of these systems include both a central processing unit (CPU) and a graphics processing unit (GPU), each consisting of multiple computing cores providing tremendous processing power. To date, harnessing the total processing power of a desktop workstation, fully utilizing both the CPU and GPU, has proven difficult for software developers. CPUs and GPUs have few similarities in both design and programming models. OpenCL is the tool that bridges the gap for software developers and enables them to fully tap into the power of both processors with a single software programming interface.
This presentation will examine the details of CPUs and GPUs, explore their differences and similarities, and highlight the computing power they can provide. We will also take a look at OpenCL: what it is, what it does, and how this new computing interface will change the way software developers create software and help end users fully realize the compute power contained within today’s modern desktop computers.
OpenCL & the Future of Desktop High Performance Computing in CAD
1. OpenCL™ & the Future of Desktop High Performance Computing in CAD
2. Before We Start
This webinar will be available afterwards at
designworldonline.com & email
Q&A at the end of the presentation
Hashtag for this webinar: #DWwebinar
7. Who am I?
• Allen Bourgoyne
o Director, ISV Alliances AMD Professional Graphics
8. What is OpenCL™?
• Open Computing Language (OpenCL™) is the first open,
royalty-free standard for cross-platform parallel programming of
personal computers, servers, workstations, and hand-held
devices, supporting a variety of CPUs, GPUs, and DSPs.
10. Modern Computers Have Lots of Processors
• Central Processing Unit (CPU)
• Graphics Processing Unit (GPU)
• Others
o Network controllers
o Device controllers (disks, DVD-ROMs, etc.)
o Smart batteries
o Most of these are not generally available for application software use
11. What Processors Can I Use?
• CPU: traditionally runs operating system, user programs,
system functions
o General purpose design: run lots of things
o Some parallel processing capability
• CPUs have multiple processing cores (2, 4, 8, up to 12)
• Can use multiple CPUs to increase parallel processing capability
o Processing power & memory generally available to user programs
• Low-level compute functionality generally available to programs
• Programs can directly access memory, operating system can provide for
“virtual memory”
12. What Processors Can I Use?
• GPU: traditionally runs graphics programs
o Highly focused design: Run graphics programs
o Highly parallel processing designs
• Modern GPUs can have over 1000 processing units!
o Low-level compute functionality not generally available to programs (until very
recently)
o Processing power & memory generally available to graphics programs
• Physical memory only, no virtual memory available
13. What Processors Can I Use?
• Others:
o Not generally available for application software
o Can be used (not always for good!)
• Recent hacks have used “smart” battery controllers to ruin batteries
(overcharging or not charging them) and install malware [5]
o Historically, if a processor in the system has an interface that can be
exploited, it eventually will be
• Focus today: CPUs & GPUs
14. What is HPC?
• High Performance Computing
Historically refers to computing that involves extremely large amounts of computer
processing
Examples:
Simulating nuclear explosions
Global weather
Seismic data processing (looking for Oil & Gas)
Breaking codes/ciphers
Computing Pi to ridiculous numbers of decimal places!!
15. HPC Has Been Around for a Long Time
• Original mainframe computers designed to solve complex math
problems
• CAD: crash simulation, structural analysis, etc
• Historically done on compute servers
o Lots of CPUs with lots of cores
o Jobs submitted in batches
o Long turnaround times, sometimes days or weeks!
• Desktops/laptops getting more powerful
o HPC workloads showing up here!
16. Brief History of GPU Computing
[Timeline figure, 1970 to 2010+]
• Ikonas (1978)
• SGI GL (1984)
• Pixel Machine (1989)
• Pixel-Planes 5 (1992)
• OpenGL (1992)
• PixelFlow SIMD graphics hardware cracks UNIX password encryption (1999)
• ATI CTM (2006)
• Nvidia CUDA (2007)
• OpenCL (2009)
17. Using the GPU for Computing
• Using GPUs for computing has been around for a long
time!
o As soon as they started showing up in computers, people started trying to use them
to help speed up compute tasks
• Why?
o GPUs have some unique design characteristics that enable them to perform certain
mathematical functions extremely fast
18. Computer Graphics 101
• How to draw a triangle on your computer screen:
[Figure: triangle with vertices A(x,y,z), B(x,y,z), and C(x,y,z) drawn against X, Y, and Z axes]
19. Computer Graphics 101 – Math to Draw that Triangle!
• Project 3D points onto 2D display:
o For point (Ax, Ay, Az), projected point (Px, Py):
$$\begin{bmatrix} P_x \\ P_y \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 \\ 0 & 0 & S_z \end{bmatrix} \begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix} + \begin{bmatrix} C_x \\ C_z \end{bmatrix}$$
• Draw lines, fill triangle:
o Determine slope for each line: Slope = (Bx – Ax) / (By – Ay)
o Any coordinate on line: Cx = Ax + slope * (Cy – Ay)
o Do this for vectors AB and AC, and we will have the points to create lines to fill the triangle
[Figure: triangle with vertices A(x,y,z), B(x,y,z), and C(x,y,z)]
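The projection and slope formulas on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the presentation: the function names `project` and `x_on_edge` are made up here, and reading S as a scale factor and C as an offset in the projection is an assumption based on the usual form of this matrix.

```python
from typing import Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def project(a: Point3,
            s: Tuple[float, float] = (1.0, 1.0),
            c: Tuple[float, float] = (0.0, 0.0)) -> Point2:
    """Apply Px = Sx*Ax + Cx and Py = Sz*Az + Cz.

    The middle column of the slide's matrix is zero, so Ay is dropped.
    s = (Sx, Sz) and c = (Cx, Cz) are assumed to be scale and offset.
    """
    ax, _ay, az = a
    sx, sz = s
    cx, cz = c
    return (sx * ax + cx, sz * az + cz)


def x_on_edge(p: Point2, q: Point2, y: float) -> float:
    """Return the x coordinate at height y on the line through p and q.

    Uses the slide's definition: slope = (Qx - Px) / (Qy - Py).
    """
    (px, py), (qx, qy) = p, q
    if qy == py:
        raise ValueError("horizontal edge: slope (dx/dy) is undefined")
    slope = (qx - px) / (qy - py)
    return px + slope * (y - py)


# Project two vertices, then find a point on the projected edge AB.
a2 = project((0.0, 5.0, 0.0))
b2 = project((4.0, 5.0, 8.0))
print(a2, b2, x_on_edge(a2, b2, 4.0))
```

Repeating `x_on_edge` for edges AB and AC at each scan line gives the left and right ends of the span to fill, which is the per-triangle work a GPU performs in hardware.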
20. GPU Processing Power
• As you can see, it takes a bit of math processing to draw
that triangle
• Problem: Stuff we want to draw has lots of triangles!
o Interactive rates are 30 frames per second (fps)
• Example: 1 million triangle model @ 30 fps requires drawing 30 million
triangles per second!
o And that’s a small model!
• Solution: GPUs have to be able to process lots of triangles!
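The arithmetic behind the 30 million triangles per second figure is simple enough to check directly. The sketch below uses the slide's numbers; the helper names are hypothetical, and the per-triangle time budget is a derived illustration that ignores everything else a frame has to do.

```python
def triangles_per_second(triangles_per_frame: int, fps: int = 30) -> int:
    """Triangles that must be drawn each second to sustain the frame rate."""
    return triangles_per_frame * fps


def time_budget_per_triangle_ns(triangles_per_frame: int, fps: int = 30) -> float:
    """Average time available per triangle, in nanoseconds."""
    return 1e9 / triangles_per_second(triangles_per_frame, fps)


# The slide's example: a 1 million triangle model at 30 fps.
print(triangles_per_second(1_000_000))
print(round(time_budget_per_triangle_ns(1_000_000), 1))
```

For the 1 million triangle model this leaves roughly 33 ns per triangle on average, which is why the work has to be spread across many processing units.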
21. GPU Processing Power
• Modern GPUs can process hundreds of millions of triangles
per second
o That’s a lot of vector math: DOT products, matrix multiplies, etc.
• High degree of parallel processing enables GPUs to handle
this workload
o Modern GPUs do thousands of operations in parallel in order to meet the demands
that graphics applications place on the hardware
22. GPU Processing Power
• So what does this mean?
o GPUs have a lot of processing power!
• CPU vs GPU computing power:
• CPU: ~200 GFLOPS [2]
• GPU: > 3 TFLOPS [1]
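To put the two figures side by side, the sketch below computes their ratio and the ideal time each processor would need for a fixed amount of arithmetic. It assumes the slide's numbers are peak rates and that a workload could actually reach them, which real applications rarely do; the 3 TFLOPS value is the slide's lower bound, and the workload size is invented for illustration.

```python
CPU_GFLOPS = 200    # slide figure: roughly 200 GFLOPS
GPU_GFLOPS = 3_000  # slide figure: more than 3 TFLOPS (lower bound used here)


def ideal_seconds(total_gflop: float, rate_gflops: float) -> float:
    """Best-case run time if the processor sustained its peak rate."""
    return total_gflop / rate_gflops


ratio = GPU_GFLOPS / CPU_GFLOPS
workload_gflop = 6_000  # hypothetical job: 6 trillion floating-point operations

print(f"GPU/CPU peak ratio: {ratio:.0f}x")
print(f"CPU: {ideal_seconds(workload_gflop, CPU_GFLOPS):.0f} s, "
      f"GPU: {ideal_seconds(workload_gflop, GPU_GFLOPS):.0f} s")
```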
23. So Why Should I Care if My GPU is used for Compute?
• Answer: Money & Performance!
• Not quite that simple, let’s take a look at how our
computer generally operates…
25. Compute Usage
• In most cases, the CPU and GPU aren’t that busy when the
other guy is working:
o GPU mostly idle while loading files, writing to disk, etc.
o CPU can be less busy when waiting for the GPU to complete graphics tasks
• These ebbs & flows of workloads create idle cycles, but also the opportunity to
move compute tasks to the available resource
o Try to make use of the idle time!
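The idea of moving work to whichever processor is free can be illustrated with a toy scheduler. This is a sketch under made-up assumptions: the device names, the task costs, and the greedy "earliest finish" rule are all hypothetical, and it does not model how an OpenCL runtime actually queues work.

```python
from typing import Dict, List


def makespan(tasks: List[Dict[str, float]], devices: List[str]) -> float:
    """Greedy scheduling: give each task to the device that finishes it soonest.

    Each task maps a device name to its run time (seconds) on that device.
    Returns the time at which the last device becomes idle.
    """
    free_at = {d: 0.0 for d in devices}
    for cost in tasks:
        best = min(devices, key=lambda d: free_at[d] + cost[d])
        free_at[best] += cost[best]
    return max(free_at.values())


# Three data-parallel tasks that suit the GPU, three serial tasks that suit the CPU.
tasks = [{"cpu": 4.0, "gpu": 1.0}] * 3 + [{"cpu": 1.0, "gpu": 5.0}] * 3

print("CPU only:", makespan(tasks, ["cpu"]))
print("CPU + GPU:", makespan(tasks, ["cpu", "gpu"]))
```

With these invented costs the job takes 15 seconds on the CPU alone and 3 seconds when both processors share it, because neither one sits idle while the other works.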
26. Compute Power
• Remember the CPU vs GPU performance comparison?
[Figure: CPU and GPU block diagrams showing ALUs, cache, and memory]
27. How do I get my $$$s Worth?
• Software developers are working on tapping into those
unused compute cycles and untapped compute power of
the GPU
o End user software demands ever increasing:
• Modern automotive models can contain up to 50,000 parts with 10 to 20 GB of data, and
can reach 40,000,000 triangles per model [6]
o OpenCL™ is a tool that will enable software developers to tap into the full power
available on the computer!
29. OpenCL™
• Industry standard programming language for parallel
computing
• Specification by Khronos
• Software using OpenCL runs on many enabled devices
o Runs on CPUs, GPUs, and ARM processors under Windows, Linux, Apple, and Android operating systems
o Supported by major hardware & software vendors including AMD, Intel, Nvidia,
Apple, ARM
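The programming model behind this portability is data-parallel: the developer writes a small kernel that handles one element, and the runtime runs it once per element on whatever device is available. The Python below only emulates that idea with an ordinary loop; `vadd_kernel` and `run_over_range` are hypothetical names, and the OpenCL C kernel in the comment is shown for comparison only and is not executed.

```python
# For comparison, a vector-add kernel in OpenCL C looks roughly like this:
#
#   __kernel void vadd(__global const float* a,
#                      __global const float* b,
#                      __global float* out) {
#       int i = get_global_id(0);
#       out[i] = a[i] + b[i];
#   }

from typing import Callable, List


def vadd_kernel(gid: int, a: List[float], b: List[float], out: List[float]) -> None:
    """Work done by a single work-item: add one pair of elements."""
    out[gid] = a[gid] + b[gid]


def run_over_range(kernel: Callable[..., None], global_size: int, *args) -> None:
    """Call the kernel once per index. A real device would run these in parallel."""
    for gid in range(global_size):
        kernel(gid, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)
run_over_range(vadd_kernel, len(a), a, b, out)
print(out)
```

Because each work-item touches only its own index, the same kernel can be spread over a handful of CPU cores or over a thousand GPU processing units without changing the code.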
30. Who is Khronos?
• Open consortium creating standards
o The Khronos Group is a not for profit industry consortium creating open standards
for the authoring and acceleration of parallel computing, graphics and dynamic
media on a wide variety of platforms and devices
o Commitment to royalty free standards
o Founded almost 10 years ago, over 100 members, any company welcome to join
o Standards include OpenGL®, OpenCL™, WebGL, OpenVG™, OpenWF™
33. OpenCL™ & CAD
• OpenCL™ is a powerful tool designed to unleash the power
of processors in a platform independent way
• Software developers are taking advantage of OpenCL™ to
provide significant increases in both computation and
graphics compute performance
34. OpenCL™ Based Solutions
• Engineering analysis
o Very high compute requirements, GPUs can help offload some of the compute
workload
o Solutions shipping today from Dassault Systèmes, OpenCASCADE
35. Performance Gains with OpenCL™
Abaqus/Standard 6.11: over 200% speed-up with OpenCL™ vs. CPU only [4]
[Chart: % speed-up with GPU & OpenCL™ for the S4B Benchmark and Customer Data #1; vertical axis from 220 to 230]
Job Name | Operations per Iteration | Solver Speed-up | Overall Speed-up using CPU plus GPU & OpenCL
S4B Benchmark dataset | 10.3 TFLOPS | 2.4X | 2.3X
Customer Data #1 | 5.75 TFLOPS | 2.4X | 2.2X
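The gap between the solver speed-up and the overall speed-up is what Amdahl's law predicts when only part of a job is accelerated. The sketch below inverts that law to estimate how much of the original run time was spent in the solver. It assumes the solver is the only accelerated part and everything else runs unchanged, which the slide does not state; the function names are illustrative.

```python
def overall_speedup(fraction: float, part_speedup: float) -> float:
    """Amdahl's law: speed-up of the whole job when `fraction` of it gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / part_speedup)


def accelerated_fraction(part_speedup: float, overall: float) -> float:
    """Invert Amdahl's law to recover the accelerated share of the run time."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / part_speedup)


# Solver and overall speed-ups as reported in the table.
for job, solver, overall in [("S4B Benchmark dataset", 2.4, 2.3),
                             ("Customer Data #1", 2.4, 2.2)]:
    share = accelerated_fraction(solver, overall)
    print(f"{job}: solver is about {share:.0%} of the original run time")
```

Under that assumption the solver accounts for well over 90% of the run time in both jobs, which is why accelerating it alone brings the overall figure so close to the solver figure.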
36. OpenCL™ & Design
• OpenCL™ will impact the design phase as well
o Ability to accelerate high-quality, photo-realistic rendering in real time
• Interact with realistic models in realistic environments in real time!
o Provide physically accurate rendering: lights, reflections, etc. are rendered physically
where they will show up on the actual product
37. OpenCL™ & Design
Chaos Group V-Ray render engine with
GPU acceleration provides interactive
rendering
Image courtesy of Chaos Group
OPTIS Theia simulates reality for
design review pipelines
Image courtesy of OPTIS
38. OpenCL ™ & CAD
• Just scratching the surface
o OpenCL only available for less than 2 years
• Already impacting analysis & design software
• Migration from other software verticals
o Digital Content Creation (DCC)
• Realistic cloth simulations, physics, particle simulation, AI already supported
39. What’s Next?
• Any computationally intensive task is a good candidate:
o CFD, FEA, particle simulation, etc.
• Most major ISVs already working with OpenCL™
o Many more software titles will implement OpenCL in 2012
42. Disclaimer
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical
errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and
roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing
manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise
this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without
obligation of AMD to notify any person of such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL
AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY
INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
43. Questions?
Design World
Joe Gorse
Email: jhgorse@wtwhmedia.com
Phone: 440.234.4531 ext. 111
Twitter: @DesignWorld_EE
AMD
Allen Bourgoyne
Email: Allen.Bourgoyne@amd.com
Phone: 512.602.4738
44. Thank You