2. INTRODUCING HETEROGENEOUS SYSTEM ARCHITECTURE (HSA)
HSA is a purpose-designed architecture that enables the software ecosystem to combine and exploit the complementary capabilities of sequential processing elements (CPUs) and parallel processing elements (such as GPUs), delivering new capabilities to users that go beyond traditional usage scenarios.
AMD is making HSA an open standard to jump-start the ecosystem.
2 | Heterogeneous System Architecture | June 2012
3. EFFECTIVE COMPUTE OFFLOAD IS MADE EASY BY HSA
[Diagram: APP-accelerated software (applications) running on an Accelerated Processing Unit (APU); graphics workloads and data-parallel workloads map to the GPU, while serial and task-parallel workloads map to the CPU.]
4. AMD HSA FEATURE ROADMAP
Physical Integration: Integrate CPU & GPU in silicon; Unified Memory Controller; Common Manufacturing Technology
Optimized Platforms: GPU Compute C++ support; HSA Memory Management Unit; Bi-Directional Power Mgmt between CPU and GPU
Architectural Integration: Unified Address Space for CPU and GPU; GPU uses pageable system memory via CPU pointers; Fully coherent memory between CPU & GPU
System Integration: GPU compute context switch; GPU graphics pre-emption; Quality of service
5. HSA COMPLIANT FEATURES
Optimized Platforms
GPU Compute C++ support: Supports OpenCL C++ directions and Microsoft's upcoming C++ AMP language. This eases programming of the CPU and GPU working together to process parallel workloads, such as computer vision and video encoding/transcoding.
HSA Memory Management Unit: CPU and GPU can share system memory, meaning all system memory is accessible by either CPU or GPU, depending on need. In today's world, only a subset of system memory can be used by the GPU.
Bi-Directional Power Mgmt between CPU and GPU: Enables "power sloshing", where CPU and GPU dynamically lower or raise their power and performance depending on activity and on which one is better suited to the task at hand.
6. HSA COMPLIANT FEATURES
Architectural Integration
Unified Address Space for CPU and GPU: The unified address space makes it easier for developers to create applications. On HSA platforms, a pointer is really a pointer and does not require separate memory pointers for CPU and GPU.
GPU uses pageable system memory via CPU pointers: The GPU can take advantage of the CPU virtual address space. With pageable system memory, the GPU can reference data directly in the CPU domain. In prior architectures, data had to be copied between the two spaces or page-locked prior to use.
Fully coherent memory between CPU & GPU: Allows data to be cached by both the CPU and the GPU, and referenced by either. In all previous generations, GPU caches had to be flushed at command-buffer boundaries prior to CPU access. And unlike discrete GPUs, the CPU and GPU in an APU share a high-speed coherent bus.
7. FULL HSA FEATURES
System Integration
GPU compute context switch: GPU tasks can be context-switched, making the GPU a multi-tasker. Context switching means faster interoperation between application, graphics, and compute work, giving users a snappier, more interactive experience.
GPU graphics pre-emption: As more applications enjoy the performance and features of the GPU, it is important that system interactivity stays good. This means low-latency access to the GPU from any process.
Quality of service: With context switching and pre-emption, time criticality is added to the tasks assigned to the processors. Access to the hardware by multiple users or multiple applications is either prioritized or equalized.
8. UNLEASHING DEVELOPER INNOVATION
PROBLEM: a wide range of GPU/HW blocks is hard to program; not all workloads accelerate; developers historically program CPUs.
HSA + SDKs = SOLUTION: productivity and performance with low power.
[Chart: developer return (differentiation in performance, power, features, time-to-market) versus developer investment (effort, time, new skills). CPU programming: ~4M+ coders, ~30+M apps, good user experiences. Niche GPU programming: ~100K coders, ~200+ significant apps. HSA target: a few M coders, a few K differentiated apps delivering differentiated experiences.]
9. HSA SOLUTION STACK
How we deliver the HSA value proposition. Overall vision:
Make the GPU easily accessible: support mainstream languages, expandable to domain-specific languages
Make compute offload efficient: direct path to the GPU (avoid overhead), eliminate memory copies, low-latency dispatch
Make it ubiquitous: drive HSA as a standard through the HSA Foundation, open-source key components
[Stack diagram: applications (from SW developers) use domain-specific libs and standard SW (Bolt, OpenCV, …), which sit on OpenCL, DirectX, and other runtimes; these target the HSA Runtime, alongside legacy user-mode graphics drivers. The HSA Runtime emits HSAIL, which a Finalizer and custom drivers (from HW vendors) translate to the GPU ISA for execution on CPU(s), GPU(s), and other differentiated HW accelerators.]
10. HSA INTERMEDIATE LAYER - HSAIL
HSAIL is a virtual ISA for parallel programs
Finalized to the native ISA by a JIT compiler, the "Finalizer"
Allows rapid innovation in native GPU architectures
HSAIL remains constant across implementations
Explicitly parallel
Designed for data-parallel programming
Support for exceptions, virtual functions, and other high-level language features
Syscall methods
GPU code can call directly into system services, I/O, printf, etc.
Debugging support
11. C++ AMP
C++ AMP: a data-parallel programming model for accelerators, initiated by Microsoft
First announced at AFDS 2011
A C++-based, higher-level programming model with advanced C++11 features
Single-source model that integrates host and device programming well
Implicit programming model that is "future-proofed" to enable HSA features, e.g. avoiding host-to-device copies
A C++ AMP implementation is available as a beta release in the Microsoft Visual Studio 11 suite
12. C++ AMP AND HSA
A compute-focused, efficient HSA implementation replaces a graphics-centric implementation of C++ AMP
E.g. low-latency dispatch, HSAIL enabled
The shared virtual memory in HSA eliminates the data copies between host and device in existing C++ AMP programs, without any source changes
Additional advanced C++ features on the GPU, e.g.:
More data types
Function calls
Virtual functions
Arbitrary control flow
Exception handling
Device and platform atomics
13. OPENCL™ AND HSA
HSA is an optimized platform architecture for OpenCL™
Not an alternative to OpenCL™
OpenCL™ on HSA will benefit from
Avoidance of wasteful copies
Low latency dispatch
Improved memory model
Pointers shared between CPU and GPU
HSA also exposes a lower-level programming interface for those who want the ultimate in control and performance
Optimized libraries may choose the lower-level interface
14. HSA: TAKING THE PLATFORM TO PROGRAMMERS
Balance between CPU and GPU for performance and power efficiency
Make GPUs accessible to a wider audience of programmers
Programming models close to today's CPU programming models
Enabling more advanced language features on the GPU
Shared virtual memory enables complex pointer-containing data structures (lists, trees, etc.) and hence more applications on the GPU
Kernels can enqueue work to any other device in the system (e.g. GPU->GPU, GPU->CPU), enabling task-graph-style algorithms, ray tracing, etc.
A clearly defined HSA memory model enables effective reasoning about parallel programs
HSA provides a compatible architecture across a wide range of programming models and HW implementations.
15. THE HSA FOUNDATION - BRINGING ABOUT THE NEXT-GENERATION PLATFORM
An open standardization body to bring about broad industry support for heterogeneous computing across the full value chain, from silicon IP to ISVs
Make GPU computing a first-class co-processor to the CPU through architecture definition
Architectural support for special-purpose hardware accelerators (rasterizers, security processors, DSPs, etc.)
Own and evolve the specifications and conformance suite
Bring to market strong development solutions to drive innovative, advanced content and applications
Cultivate programming talent via HSA developer training and academic programs
16. THANK YOU