THE PROGRAMMER’S GUIDE TO REACHING FOR THE CLOUD
PHIL ROGERS, CORPORATE FELLOW, AMD
NOV. 11, 2013
MODERN CLOUD WORKLOADS ARE HETEROGENEOUS
SCALAR CONTENT WITH A GROWING MIX OF PARALLEL CONTENT

 Video is expected to represent two thirds of mobile data traffic by 2017
‒ Video is continuously being captured, uploaded, transcoded and streamed
‒ Video processing is inherently parallel … and can be accelerated

 Big data growing exponentially with Exabytes of data crawled monthly
‒ Indexing the web and extracting high definition information
‒ Map reduce is a heterogeneous workload

 Natural User Interfaces are still in their infancy
‒ Accurate extraction of meaning from gesture and voice
‒ Getting to the fingertips and voice inflections

NEED TO SIMULTANEOUSLY
INCREASE PERFORMANCE AND
REDUCE POWER
2 | APU-13 KEYNOTE | NOVEMBER 11, 2013 | PUBLIC
FUTURE TECHNOLOGY GROWTH WILL ACCELERATE THE TREND
 Rapid growth of Sensor Networks
‒ Drives exponential increase in data

 Internet of Everything (IoE) results in explosion of data sources
‒ Another exponential growth in data at local and cloud level

 Context Aware Computing is a Huge Big-Data Problem
‒ Both local and cloud compute must get faster/lower power

[Chart: RAPID GROWTH OF THE NUMBER OF THINGS CONNECTED TO THE INTERNET. Timeline from 1995 to 2020 rising to 50B, through four eras: “Fixed” Computing (you go to the device); Mobility / BYOD (the device goes with you); Internet of Things (age of devices); Internet of Everything (people, process, data, things)]

[Chart: HOW MUCH VALUE IS AT STAKE IN THE IOE ECONOMY? $14.4 trillion in total: $9.5 trillion from industry-specific use cases and $4.9 trillion from cross-industry use cases]

DRIVING FUTURE DEMAND FOR LOCAL AND CLOUD PARALLEL EFFICIENCY

Source: Cisco IBSG, 2013
HSA APU PROCESSORS OPERATE HARMONIOUSLY AT LOW POWER
EXAMPLE: VIDEO ENHANCEMENT

 Techniques include:
‒ Image Stabilization, Super Resolution, Deblur, Deinterlace, Lighting & Contrast

 Enhancements examine pixels from a large number of video frames
‒ Super-resolution based on information from surrounding frames

 Algorithms can be run on multiple processors in the APU
‒ CPU, GPU, DSPs, Fixed Function Accelerators
‒ Convolutions, motion estimation, histograms,
format conversions, etc.
‒ Processing flows freely between processors
for best efficiency

HETEROGENEOUS PROCESSORS - EVERYWHERE
SMARTPHONES TO SUPER-COMPUTERS

Super computer
Dense Server
Tablet
Phone

Workstation
Notebook

A SINGLE SCALABLE ARCHITECTURE
FOR THE WORLD’S PROGRAMMERS
IS DEMANDED AT THIS POINT
HOW DOES HSA MAKE THIS ALL WORK?
 Enables acceleration of languages like Java, C++ AMP and Python
 All processors use the same addresses, and can share data structures in place
 Heterogeneous computing can use all of virtual and physical memory
 Extends multicore coherency to the GPU and other processors
 Pass work quickly between the processors
 Enables quality of service

HSA FOUNDATION – BUILDING
THE ECOSYSTEM

HSA in 2013
HSA FOUNDATION AT LAUNCH
BORN IN JUNE 2012

Founders

HSA FOUNDATION TODAY – NOVEMBER 2013
A GROWING AND POWERFUL FAMILY

Founders
Promoters
Supporters
Contributors

TBA at APU-13

Universities


NTHU Programming
Language Lab

NTHU System
Software Lab

COMPUTER SCIENCE
HSA FOUNDATION PROGRESS
WHAT AN AMAZING FIRST YEAR

 Membership growing rapidly
‒ 2-3 new members per month
‒ Universities enrolling

 Four working groups generating specifications
‒ HSA Programmers Reference Manual published
‒ HSA System Architecture spec going to ratification by the
end of the year
‒ Runtime WG and Tools WG will publish early next year

 HSA Development platforms to ship in early 2014

PROGRAMMING LANGUAGES PROLIFERATING ON HSA
[Diagram: HSA software stack, top to bottom]
Applications: OpenCL™ App | Java App | C++ AMP App | Python App
Language runtimes: OpenCL Runtime | Java JVM (Sumatra) | Various Runtimes | Fabric Engine RT
HSAIL
HSA Helper Libraries | HSA Core Runtime | HSA Finalizer
Kernel Fusion Driver (KFD)

Workloads
HIGH EFFICIENCY VIDEO CODEC – HEVC (H.265)
VALUE PROPOSITION

HEVC VISUAL QUALITY IS
SIGNIFICANTLY BETTER THAN
H.264 AT ANY GIVEN BIT RATE

30% TO 50% MORE EFFICIENT
THAN H.264 AT 1080P RESOLUTION

4K Ultra HDTV
Sony XBR
$4999

H.265 @ 500 kbps

H.264 @ 500 kbps


4K VIDEO BENEFITS ARE EVEN
MORE SIGNIFICANT WITH HEVC

30% to 50%

4K Video Cameras
GoPro
$399
HIGH EFFICIENCY VIDEO CODEC – HEVC (H.265)
WHY HEVC WILL PROLIFERATE
 The next generation MPEG video encoding standard
 Significantly higher efficiency (up to 50% lower bit
rates at given quality) than AVC (H.264)
 Highly beneficial for HD video (1080p or below)
 Especially beneficial for 4K video
 Scales to 8K Ultra High Definition video (up to
8192×4320)
 Computationally complex, but by design easier to
parallelize than H.264
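
The bit-rate claim above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the 30% to 50% figure is read as a bit-rate reduction at equal quality; the class name `HevcBitrateSketch`, the method `hevcKbps` and the 8000 kbps starting point are illustrative assumptions, not values from the deck.

```java
/** Hypothetical helper: estimates an HEVC bit rate from an H.264 bit rate. */
final class HevcBitrateSketch {

    /**
     * Returns the bit rate (kbps) left after applying a percentage saving.
     * savingPercent is an assumed bit-rate reduction between 0 and 100.
     */
    static int hevcKbps(int h264Kbps, int savingPercent) {
        if (savingPercent < 0 || savingPercent > 100) {
            throw new IllegalArgumentException("savingPercent must be in [0, 100]");
        }
        return h264Kbps * (100 - savingPercent) / 100;
    }

    public static void main(String[] args) {
        int h264Kbps = 8000; // assumed example stream, not a figure from the slides
        System.out.println("30% saving: " + hevcKbps(h264Kbps, 30) + " kbps");
        System.out.println("50% saving: " + hevcKbps(h264Kbps, 50) + " kbps");
    }
}
```

For a provider streaming many sessions at once, the saving applies to every session, which is why the deck ties compression efficiency to quality of service.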

[Chart: mobile data traffic in Exabytes Per Month (0 to 12), 2012 to 2017, broken down into Mobile Video, Mobile Web/Data, Mobile M2M and Mobile File Sharing. Traffic Share labels on the chart: 66.5%, 24.9%, 5.1%, 3.5%]

CLOUD VIDEO PROVIDERS NEED THE HIGHER
COMPRESSION FOR QUALITY OF SERVICE

Source: Cisco VNI Mobile Forecast, 2013
HEVC (H.265) ACCELERATION
EFFICIENT CLOUD DEPLOYMENT

ALL STAGES OF HEVC ARE
ACCELERATED ON THE APU

 Decrypt
 Decode and decompress
 Scaling and Enhancement
 Encode and compress
 Encrypt

ENCODE IS THE HEAVIEST
STAGE

 Leverage point for compression
 Highly parallel
 Algorithms improve monthly
 Must stay programmable

H.265 ENCODING IS 5 – 10X MORE
COMPUTATIONALLY COMPLEX THAN H.264

 Picture can be divided into Macroblock regions with a much wider range of sizes and shapes
 Intra prediction has 33 directional modes compared to 8 for H.264
OVERVIEW OF B+ TREES
 B+ Trees are a special case of B Trees

 A B+ Tree …
‒ is a dynamic, multi-level index
‒ Is efficient for retrieval of data, stored in a block-oriented
context

 Fundamental data structure used in several
popular database management systems
‒ SQLite
‒ CouchDB

 Order (b) of a B+ Tree measures the capacity of its nodes

[Figure: example B+ Tree. Index nodes hold keys from 2 to 7; the leaf level holds keys 1 to 8, each pointing to a data record d1 to d8]


APPLICATIONS THAT USE B/B+ TREES

primary data store on the clientside

multi-data center key-value store

Mail, Safari, iPhone, iPod, iTunes

market-data framework

Firefox and Thunderbird

large hadron collider

Android, Chrome


http://www.sqlite.org/famous.html

http://wiki.apache.org/couchdb/CouchDB_in_the_wild
HOW WE ACCELERATE
 Utilize coarse-grained parallelism in B+ Tree searches
‒ Perform many queries in parallel
‒ Increase memory bandwidth utilization with parallel reads
‒ Increase throughput (transactions per second for OLTP)

 B+ Tree searches on an HSA enabled APU
‒ Allows much larger B+ Trees to be searched, than traditional GPU compute
‒ Eliminates data-copies since CPU and GPU cores can access the same memory
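
The coarse-grained approach above (many independent queries, each walking the tree on its own) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the OpenCL implementation the deck measured: the class `BatchedTreeSearch`, its array-based static index and the use of a CPU parallel stream as a stand-in for GPU work-items are all hypothetical, and keys are assumed sorted and distinct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical static B+ Tree-like index searched by many queries in parallel. */
final class BatchedTreeSearch {
    private final int order;       // maximum keys scanned per node
    private final long[][] levels; // levels[0] = leaf keys; levels[i+1][j] = first key of node j in levels[i]

    BatchedTreeSearch(long[] sortedKeys, int order) {
        if (order < 2) throw new IllegalArgumentException("order must be at least 2");
        this.order = order;
        List<long[]> built = new ArrayList<>();
        built.add(sortedKeys);
        long[] cur = sortedKeys;
        // Build index levels bottom-up until one node covers the whole level.
        while (cur.length > order) {
            int nodes = (cur.length + order - 1) / order;
            long[] up = new long[nodes];
            for (int j = 0; j < nodes; j++) up[j] = cur[j * order];
            built.add(up);
            cur = up;
        }
        this.levels = built.toArray(new long[0][]);
    }

    /** Returns the leaf position of key, or -1 if the key is absent. */
    int search(long key) {
        if (levels[0].length == 0) return -1;
        int node = 0; // node to scan at the current level
        for (int lvl = levels.length - 1; lvl >= 0; lvl--) {
            long[] keys = levels[lvl];
            int start = node * order;
            int end = Math.min(start + order, keys.length);
            int pos = start;
            // Advance to the last slot in this node whose key is <= the query.
            while (pos + 1 < end && keys[pos + 1] <= key) pos++;
            if (lvl == 0) return keys[pos] == key ? pos : -1;
            node = pos; // slot j at this level names node j one level down
        }
        return -1;
    }

    /** Runs every query independently; results keep the order of the queries. */
    int[] searchAll(long[] queries) {
        return Arrays.stream(queries).parallel().mapToInt(this::search).toArray();
    }

    public static void main(String[] args) {
        long[] keys = new long[1000];
        for (int i = 0; i < keys.length; i++) keys[i] = 2L * i; // even keys only
        BatchedTreeSearch tree = new BatchedTreeSearch(keys, 16);
        System.out.println(Arrays.toString(tree.searchAll(new long[] {0, 10, 11, 1998})));
    }
}
```

Because each query only reads the shared index, no synchronization is needed between queries. That read-only sharing is the property the slide relies on when CPU and GPU cores access the same memory.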

RESULTS
1M search queries in parallel

 Input B+ Tree contains 112 million keys and uses 6GB of memory
 Software: OpenCL on HSA
 Hardware: AMD “Kaveri” APU with Quad Core CPU and 8 GCN Compute Units at 35W TDP

[Chart: Speedup (y-axis, 0 to 7) versus Order of B+ Tree (x-axis: 8, 16, 32, 64, 128)]

Baseline: 4-core OpenMP + hand-tuned SSE CPU implementation

Results measured in AMD Labs on “Kaveri” APU, 35W TDP, 16GB DRAM
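
To relate the tree order to the amount of work per query, the sketch below estimates how many levels a search must descend for the 112 million keys quoted above. It is a rough model only: it assumes every node is full and that the order equals the fan-out, and the class name `TreeDepthSketch` is hypothetical. It says nothing about the measured speedups.

```java
/** Hypothetical estimate of B+ Tree depth for a given key count and order. */
final class TreeDepthSketch {

    /** Smallest number of levels such that order^levels >= keys (full nodes assumed). */
    static int levels(long keys, int order) {
        if (keys < 1 || order < 2) {
            throw new IllegalArgumentException("need keys >= 1 and order >= 2");
        }
        int levels = 0;
        long capacity = 1;
        while (capacity < keys) {
            capacity *= order;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        long keys = 112_000_000L; // key count quoted on the results slide
        for (int order : new int[] {8, 16, 32, 64, 128}) {
            System.out.println("order " + order + ": " + levels(keys, order) + " levels");
        }
    }
}
```

A higher order means fewer dependent node reads per query but more keys to compare inside each node, which is one reason the order is a natural axis for the chart.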
REVERSE TIME MIGRATION (RTM)

 A technique for creating images based on sensor data to improve seismic interpretations done by geophysicists
 A memory-intensive and highly parallel algorithm
 RTM is run on massive data sets
 A natural scale out algorithm
 Often run today on 100K node CPU systems
 Bringing this to HSA and APU based supercomputing will increase performance for current sensor arrays, and allow more sensors and accuracy in the future.

[Images: Land crews; Marine crews]

HOWEVER, SPEED OF PROCESSING AND
INTERPRETATION IS A CRITICAL
BOTTLENECK IN MAKING FULL USE
OF ACQUISITION ASSETS
TEXT ANALYTICS – HADOOP TERASORT AND BIG DATA SEARCH
MINING BIG DATA
 Multi-stage pipeline or parallel processing stages
 Traditional GPU Compute is challenged by copies
 APU with HSA accelerates each stage in place
‒ Sort
‒ Compression
‒ Regular expression parsing
‒ CRC generation
 Acceleration of large data search scales out across the cluster of APU nodes

[Diagram: Hadoop data flow. Input HDFS (split 0, split 1, split 2) → map → sort → copy → merge → reduce → Output HDFS (part 0, part 1) with HDFS Replication]
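
One of the per-stage kernels named on this slide, CRC generation, is easy to sketch because each input split can be checksummed independently. The example below is a CPU-only illustration using the standard `java.util.zip.CRC32` class and a parallel stream. It is not Hadoop code and not an HSA dispatch; the class name `SplitChecksumSketch` and the in-memory `byte[][]` splits are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical per-split checksum stage; each split is processed independently. */
final class SplitChecksumSketch {

    /** CRC-32 of a single split. */
    static long crcOf(byte[] split) {
        CRC32 crc = new CRC32();
        crc.update(split, 0, split.length);
        return crc.getValue();
    }

    /** Checksums all splits in parallel; result i belongs to split i. */
    static long[] crcPerSplit(byte[][] splits) {
        return Arrays.stream(splits)
                     .parallel()
                     .mapToLong(SplitChecksumSketch::crcOf)
                     .toArray();
    }

    public static void main(String[] args) {
        byte[][] splits = {
            "split 0 data".getBytes(StandardCharsets.US_ASCII),
            "split 1 data".getBytes(StandardCharsets.US_ASCII),
            "split 2 data".getBytes(StandardCharsets.US_ASCII)
        };
        for (long crc : crcPerSplit(splits)) {
            System.out.println(Long.toHexString(crc));
        }
    }
}
```

The splits are read in place and never copied to a separate device buffer, which mirrors the "accelerates each stage in place" point, although here all the work stays on CPU threads.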
Programming
Languages
PROGRAMMING MODELS EMBRACING HSAIL AND HSA
THE RIGHT LEVEL OF ABSTRACTION

UNDER DEVELOPMENT
 Java: Project Sumatra OpenJDK 9
 OpenMP from SuSE
 C++ AMP, based on CLANG/LLVM
 Python and KL from Fabric Engine

NEXT
 DSLs: Halide, Julia, Rust
 Fortran
 JavaScript
 Open Shading Language
 R
HSA ENABLES DEVELOPERS TO LEVERAGE HC … EASILY & NATURALLY

PREFERRED PROGRAMMING LANGUAGES
 Java, C++, OpenMP, Python *
 SVM, Coherence, GPU Enqueue
 OpenJDK/Sumatra, Fabric Engine

TRANSPARENT CALLS TO POPULAR LIBRARIES
 OpenCV, SciPy, NumPy, ImageMagick, Bolt, …
 Arbitrary data structures, SVM, Coherence, User mode queueing
 OpenCV API, Bolt STL library

USING CONVENTIONAL METHODS
 Arbitrary data structures, malloc, function pointers, callbacks, recursion, semaphores, atomics
 SVM, Coherence, User-mode queueing, GPU Enqueue, HSAIL
 Linked-list/tree traversal + other complex shared host data structures

* Java 8, C++ AMP, OpenMP 4.0 next generation standards and extensions
C++ AMP ACCELERATION GOES MULTI-PLATFORM
 Herb Sutter Announced C++ AMP for the Windows® Platform at ADS 2011
 We very much liked the single source model of development, and decided to extend it
to be multi-platform
 Today we are announcing C++ AMP is moving beyond Microsoft® Windows to embrace
Linux. We will offer this acceleration on both our APUs and our discrete GPUs
 We are also bringing Bolt STL Library support to C++ AMP

[Diagram: C++AMP → CLANG Front-end → LLVM-IR or SPIR 1.2 → LLVM Compiler → HSAIL → Any HSA Implementation; SPIR 1.2 → Any OpenCL™+SPIR Implementation]

AVAILABLE IN OPEN SOURCE 1H-2014
HSA ENABLEMENT OF JAVA
JAVA 7 – OpenCL ENABLED APARAPI
 AMD initiated Open Source project
 APIs for data parallel algorithms
‒ GPU accelerate Java applications
‒ No need to learn OpenCL™
 Active community captured mindshare
‒ ~20 contributors
‒ >7000 downloads
‒ ~150 visits per day
[Diagram: Java Application → APARAPI API → OpenCL™ → OpenCL™ Compiler & Runtime; JVM; CPU ISA / GPU ISA → CPU / GPU]

JAVA 8 – HSA ENABLED APARAPI
 Java 8 brings Stream + Lambda API.
‒ More natural way of expressing data parallel algorithms
‒ Initially targeted at multi-core.
 APARAPI will:
‒ Support Java 8 Lambdas
‒ Dispatch code to HSA enabled devices at runtime via HSAIL
[Diagram: Java Application → APARAPI + Lambda API → HSAIL → HSA Finalizer & Runtime; JVM; CPU ISA / GPU ISA → CPU / GPU]

JAVA 9 – HSA ENABLED JAVA (SUMATRA)
 Adds native GPU acceleration to Java Virtual Machine (JVM)
 Developer uses JDK Lambda, Stream API
 JVM uses GRAAL compiler to generate HSAIL
 JVM decides at runtime to execute on either CPU or GPU depending on workload characteristics.
[Diagram: Java Application → Java JDK Stream + Lambda API → Java GRAAL JIT backend → HSAIL → HSA Finalizer & Runtime; JVM; CPU ISA / GPU ISA → CPU / GPU]

We will provide HSA Enabled Aparapi on Java 8 to bridge between Aparapi on Java 7 and HSA/Sumatra on Java 9
JAVA DEMO

WELCOME GARY FROST TO THE STAGE

NBODY REVISITED
 NBody problem:
‒ Calculate the position of ‘N’ bodies in 3D space by computing the gravitational effect each has on all
of the others and updating its position.

 A Java sequential NBody implementation would start with an Object for each Body.
public class Body {
    // State of object
    private float x, y, z, m, vx, vy, vz;
    // Method to update position relative to other bodies
    void updatePosition(Body[] bodies) { /* code omitted */ }
}

 Then we would iterate over all bodies updating the position of each
for (Body b : bodies) {
    b.updatePosition(bodies);
}

 A pre Java 8 Java ‘parallel’ version would not fit so nicely on this slide ;)
JAVA 8’S ‘PROJECT LAMBDA’ SIMPLIFIES PARALLEL PROGRAMMING
 Offers an alternate syntax for processing arrays/collections of data
for (Body b : bodies)
    b.updatePosition(bodies);

Arrays.stream(bodies) // wrap array in a stream
      .forEach(b -> b.updatePosition(bodies));

 To process a stream in parallel we just tag the stream with the parallel() modifier
Arrays.stream(bodies) // wrap the array in a stream
      .parallel()     // tag the stream as parallel
      .forEach(b -> b.updatePosition(bodies));

 In Java 8 a parallel stream executes across all CPU cores.
 In Java 9 (Sumatra) a parallel stream executes across all CPU and GPU cores
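
The fragments above omit the body of `updatePosition`. The following is a self-contained sketch of one parallel time step, written for Java 8 parallel streams on CPU cores only. Several details are assumptions rather than the presenter's code: bodies are immutable and each step builds a new array (so parallel readers never see half-updated state), the gravitational constant is taken as 1, a small softening term is added, and `updatePosition` takes an extra `dt` argument.

```java
import java.util.Arrays;

/** Hypothetical sketch: one N-body time step with Java 8 parallel streams. */
final class NBodySketch {

    /** Immutable body state; field names follow the slide (x, y, z, m, vx, vy, vz). */
    static final class Body {
        final float x, y, z, m, vx, vy, vz;

        Body(float x, float y, float z, float m, float vx, float vy, float vz) {
            this.x = x; this.y = y; this.z = z; this.m = m;
            this.vx = vx; this.vy = vy; this.vz = vz;
        }

        /** Returns this body's next state; the shared array is only read. */
        Body updatePosition(Body[] bodies, float dt) {
            float ax = 0f, ay = 0f, az = 0f;
            for (Body other : bodies) {
                if (other == this) continue;
                float dx = other.x - x, dy = other.y - y, dz = other.z - z;
                // Softening term (an assumption) avoids division by zero.
                float distSq = dx * dx + dy * dy + dz * dz + 1e-6f;
                float invDist = 1.0f / (float) Math.sqrt(distSq);
                float s = other.m * invDist * invDist * invDist; // G taken as 1
                ax += dx * s; ay += dy * s; az += dz * s;
            }
            float nvx = vx + ax * dt, nvy = vy + ay * dt, nvz = vz + az * dt;
            return new Body(x + nvx * dt, y + nvy * dt, z + nvz * dt, m, nvx, nvy, nvz);
        }
    }

    /** Computes the next generation of bodies across the available CPU cores. */
    static Body[] step(Body[] bodies, float dt) {
        return Arrays.stream(bodies)
                     .parallel()
                     .map(b -> b.updatePosition(bodies, dt))
                     .toArray(Body[]::new);
    }

    public static void main(String[] args) {
        Body[] bodies = {
            new Body(0f, 0f, 0f, 1f, 0f, 0f, 0f),
            new Body(1f, 0f, 0f, 1f, 0f, 0f, 0f)
        };
        Body[] next = step(bodies, 0.01f);
        System.out.println(next[0].x + " " + next[1].x);
    }
}
```

Using `map` to produce new bodies instead of `forEach` with in-place mutation is a design choice of this sketch: with in-place updates, one thread could read a neighbour's position while another thread is writing it.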

JAVA DEMO

JAVA AND THE CLOUD

THE RIGHT LANGUAGE WITH ACCELERATION ON CLOUD APUS

 Java 8 and Java 9 provide parallel acceleration
 Parallel workloads are proliferating in the cloud
 Hadoop framework for scale out
 HSA APUs provide workload acceleration

DON’T MISS THE KEYNOTE
TOMORROW FROM ORACLE’S
NANDINI RAMANI

“THE ROLE OF JAVA™ IN HETEROGENEOUS
COMPUTING, AND HOW YOU CAN HELP”
Programming Tools
ANNOUNCING AMD’S UNIFIED SDK
 Access to AMD APU and GPU programmable
components
 Component installer - choose just what you need

 Initial release includes:
‒ APP SDK v2.9
‒ Media SDK 1.0 Beta

AMD Unified SDK

APP SDK 2.9
 Web-based sample browser
 Supports programming standards: OpenCL™, C++ AMP
 Code samples for accelerated open source libraries:
‒ OpenCV, OpenNI, Bolt, Aparapi
 OpenCL™ source editing plug-in for Visual Studio
 Now supports CMake

MEDIA SDK 1.0 BETA
 GPU accelerated video pre/post processing library
 Leverage AMD's media encode/decode acceleration blocks
 Library for low latency video encoding
 Supports both Windows Store and Classic desktop
ANNOUNCING AMD CodeXL V1.3

 AMD’s comprehensive heterogeneous
developer tool suite including:
‒ CPU and GPU Profiling
‒ GPU kernel Debugging
‒ GPU kernel analysis

 New features in version 1.3:
‒ Supports Java
‒ Integrated static kernel analysis
‒ Remote debugging/profiling
‒ Supports latest AMD APU and GPU products

CPU PROFILER
 Time-based profiling
 Analyze call-chain relationships
 Java profiling with inline function support
 Cache-line utilization profiling
 Supports latest AMD processors

GPU PROFILER
 OpenCL™ Application Trace
 Profile OpenCL kernels
 Timeline visualization of GPU counter data
 Kernel Occupancy Viewer
 Remote GPU Profiling

GPU DEBUGGER
 Real-time OpenCL kernel debugging with stepping and variable display
 OpenCL and OpenGL API Statistics
 Object visualization
 Remote GPU debugging

STATIC KERNEL ANALYZER
 Compile, analyze and disassemble OpenCL Kernels
 View kernel compilation errors/warnings
 Estimate kernel performance
 View generated ISA code
 View registers
OPEN SOURCE LIBRARIES ACCELERATED BY AMD

OpenCV
 Most popular computer vision library
 Now with many OpenCL™ accelerated functions

Bolt
 C++ template library
 Provides GPU off-load for common data-parallel algorithms
 Now with cross-OS support and improved performance/functionality

clMath
 AMD released APPML as open source to create clMath
 Accelerated BLAS and FFT libraries
 Accessible from Fortran, C and C++

Aparapi
 OpenCL™ accelerated Java 7
 Java APIs for data parallel algorithms (no need to learn OpenCL™)
AMD APUS, HSA – CLIENT TO THE CLOUD
A CONVERGENCE AT THE RIGHT TIME

 Parallel workloads are booming
‒ Acceleration where the data is
‒ On the client for a snappy user experience
‒ In the cloud for scalable services

 HSA enabled APUs in the cloud
‒ Big data analytics
‒ Video processing
‒ Science, imaging, genomics
‒ Unleashing the Java development community

 Acceleration at all tiers of the cloud
‒ Data centers, media hubs, cloud periphery

A SPECIAL GUEST

Gary Campbell

Infrastructure Technology Strategy CTO
HP

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software
changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD
reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of
such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. OpenCL is a trademark of Apple Inc. and Microsoft and Windows are trademarks of Microsoft Corp. Other
names are for informational purposes only and may be trademarks of their respective owners.

Weitere ähnliche Inhalte

Was ist angesagt?

Optimizing High Performance Computing Applications for Energy
Optimizing High Performance Computing Applications for EnergyOptimizing High Performance Computing Applications for Energy
Optimizing High Performance Computing Applications for EnergyDavid Lecomber
 
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorGS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorAMD Developer Central
 
Deeper Look Into HSAIL And It's Runtime
Deeper Look Into HSAIL And It's Runtime Deeper Look Into HSAIL And It's Runtime
Deeper Look Into HSAIL And It's Runtime HSA Foundation
 
Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015
Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015
Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015Junli Gu
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...AMD Developer Central
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2Junli Gu
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLinaro
 
OpenCL caffe IWOCL 2016 presentation final
OpenCL caffe IWOCL 2016 presentation finalOpenCL caffe IWOCL 2016 presentation final
OpenCL caffe IWOCL 2016 presentation finalJunli Gu
 
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ..."New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...Edge AI and Vision Alliance
 
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengHC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengAMD Developer Central
 
Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsAnand Haridass
 
LEG Keynote: Linda Knippers - HP
LEG Keynote: Linda Knippers - HPLEG Keynote: Linda Knippers - HP
LEG Keynote: Linda Knippers - HPLinaro
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialGanesan Narayanasamy
 

Was ist angesagt? (20)

Optimizing High Performance Computing Applications for Energy
Optimizing High Performance Computing Applications for EnergyOptimizing High Performance Computing Applications for Energy
Optimizing High Performance Computing Applications for Energy
 
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorGS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
 
Deeper Look Into HSAIL And It's Runtime
Deeper Look Into HSAIL And It's Runtime Deeper Look Into HSAIL And It's Runtime
Deeper Look Into HSAIL And It's Runtime
 
AMD It's Time to ROC
AMD It's Time to ROCAMD It's Time to ROC
AMD It's Time to ROC
 
Deeplearningusingcloudpakfordata
DeeplearningusingcloudpakfordataDeeplearningusingcloudpakfordata
Deeplearningusingcloudpakfordata
 
Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015
Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015
Big_Data_Heterogeneous_Programming IEEE_Big_Data 2015
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience Report
 
OpenCL caffe IWOCL 2016 presentation final
OpenCL caffe IWOCL 2016 presentation finalOpenCL caffe IWOCL 2016 presentation final
OpenCL caffe IWOCL 2016 presentation final
 
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ..."New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
 
OpenPOWER Webinar
OpenPOWER Webinar OpenPOWER Webinar
OpenPOWER Webinar
 
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengHC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
 
IBM BOA for POWER
IBM BOA for POWER IBM BOA for POWER
IBM BOA for POWER
 
Ac922 cdac webinar
Ac922 cdac webinarAc922 cdac webinar
Ac922 cdac webinar
 
OpenPOWER Latest Updates
OpenPOWER Latest UpdatesOpenPOWER Latest Updates
OpenPOWER Latest Updates
 
Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of Systems
 
HP Moonshot system
HP Moonshot systemHP Moonshot system
HP Moonshot system
 
LEG Keynote: Linda Knippers - HP
LEG Keynote: Linda Knippers - HPLEG Keynote: Linda Knippers - HP
LEG Keynote: Linda Knippers - HP
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
 

Ähnlich wie Final apu13 phil-rogers-keynote-21

Carpe Datum: Building Big Data Analytical Applications with HP Haven
Carpe Datum: Building Big Data Analytical Applications with HP HavenCarpe Datum: Building Big Data Analytical Applications with HP Haven
Carpe Datum: Building Big Data Analytical Applications with HP HavenDataWorks Summit
 
OpenACC Monthly Highlights September 2020
OpenACC Monthly Highlights September 2020OpenACC Monthly Highlights September 2020
OpenACC Monthly Highlights September 2020OpenACC
 
OpenACC Monthly Highlights: November 2020
OpenACC Monthly Highlights: November 2020OpenACC Monthly Highlights: November 2020
OpenACC Monthly Highlights: November 2020OpenACC
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataLviv Startup Club
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Lviv Startup Club
 
Big Data Benchmarking with RDMA solutions
Big Data Benchmarking with RDMA solutions Big Data Benchmarking with RDMA solutions
Big Data Benchmarking with RDMA solutions Mellanox Technologies
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcsandit
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...DataWorks Summit/Hadoop Summit
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcscpconf
 
End to End Machine Learning Open Source Solution Presented in Cisco Developer...
End to End Machine Learning Open Source Solution Presented in Cisco Developer...End to End Machine Learning Open Source Solution Presented in Cisco Developer...
End to End Machine Learning Open Source Solution Presented in Cisco Developer...Manish Harsh
 
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled ArchitectureDM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled ArchitectureDATAVERSITY
 
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Sumeet Singh
 
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Sumeet Singh
 
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...Paul Hofmann
 
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...AMD Developer Central
 
Sig13 ce future_gfx
Sig13 ce future_gfxSig13 ce future_gfx
Sig13 ce future_gfxCass Everitt
 
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...IJCSES Journal
 
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...ijcses
 

Final apu13 phil-rogers-keynote-21

  • 1. THE PROGRAMMER’S GUIDE TO REACHING FOR THE CLOUD PHIL ROGERS, CORPORATE FELLOW, AMD NOV. 11, 2013
  • 2. MODERN CLOUD WORKLOADS ARE HETEROGENEOUS SCALAR CONTENT WITH A GROWING MIX OF PARALLEL CONTENT  Video is expected to represent two thirds of mobile data traffic by 2017 ‒ Video is continuously being captured, uploaded, transcoded and streamed ‒ Video processing is inherently parallel … and can be accelerated  Big data growing exponentially with Exabytes of data crawled monthly ‒ Indexing the web and extracting high definition information ‒ Map reduce is a heterogeneous workload  Natural User Interfaces are still in their infancy ‒ Accurate extraction of meaning from gesture and voice ‒ Getting to the fingertips and voice inflections NEED TO SIMULTANEOUSLY INCREASE PERFORMANCE AND REDUCE POWER 2 | APU-13 KEYNOTE | NOVEMBER 11, 2013 | PUBLIC
  • 3. FUTURE TECHNOLOGY GROWTH WILL ACCELERATE THE TREND  Rapid growth of Sensor Networks ‒ Drives exponential increase in data  Internet of Everything (IoE) results in explosion of data sources ‒ Another exponential growth in data at local and cloud level  Context Aware Computing is a Huge Big-Data Problem ‒ Both local and cloud compute must get faster/lower power [Chart: RAPID GROWTH OF THE NUMBER OF THINGS CONNECTED TO THE INTERNET, 1995–2020; eras: “Fixed” Computing (you go to the device), Mobility / BYOD (the device goes with you), Internet of Things (age of devices), Internet of Everything (people, process, data, things); data label: 50B] [Chart: HOW MUCH VALUE IS AT STAKE IN THE IOE ECONOMY? $14.4 trillion, of which $9.5 trillion from industry-specific use cases and $4.9 trillion from cross-industry use cases] DRIVING FUTURE DEMAND FOR LOCAL AND CLOUD PARALLEL EFFICIENCY Source: Cisco IBSG, 2013
  • 4. HSA APU PROCESSORS OPERATE HARMONIOUSLY AT LOW POWER EXAMPLE: VIDEO ENHANCEMENT  Techniques include: ‒ Image Stabilization, Super Resolution, Deblur, Deinterlace, Lighting & Contrast  Enhancements examine pixels from a large number of video frames ‒ Super-resolution based on information from surrounding frames  Algorithms can be run on multiple processors in the APU ‒ CPU, GPU, DSPs, Fixed Function Accelerators ‒ Convolutions, motion estimation, histograms, format conversions, etc. ‒ Processing flows freely between processors for best efficiency
  • 5. HETEROGENEOUS PROCESSORS - EVERYWHERE SMARTPHONES TO SUPER-COMPUTERS [Images: Phone, Tablet, Notebook, Workstation, Dense Server, Super computer] A SINGLE SCALABLE ARCHITECTURE FOR THE WORLD’S PROGRAMMERS IS DEMANDED AT THIS POINT
  • 6. HOW DOES HSA MAKE THIS ALL WORK?  Enables acceleration of languages like Java, C++ AMP and Python  All processors use the same addresses, and can share data structures in place  Heterogeneous computing can use all of virtual and physical memory  Extends multicore coherency to the GPU and other processors  Pass work quickly between the processors  Enables quality of service HSA FOUNDATION – BUILDING THE ECOSYSTEM
  • 8. HSA FOUNDATION AT LAUNCH BORN IN JUNE 2012 Founders
  • 9. HSA FOUNDATION TODAY – NOVEMBER 2013 A GROWING AND POWERFUL FAMILY Founders Promoters Supporters Contributors TBA at APU-13 Universities NTHU Programming Language Lab NTHU System Software Lab COMPUTER SCIENCE
  • 10. HSA FOUNDATION PROGRESS WHAT AN AMAZING FIRST YEAR  Membership growing rapidly ‒ 2-3 new members per month ‒ Universities enrolling  Four working groups generating specifications ‒ HSA Programmers Reference Manual published ‒ HSA System Architecture spec going to ratification by the end of the year ‒ Runtime WG and Tools WG will publish early next year  HSA Development platforms to ship in early 2014
  • 11. PROGRAMMING LANGUAGES PROLIFERATING ON HSA [Stack diagram, top to bottom: OpenCL™ App, Java App, C++ AMP App, Python App; OpenCL Runtime, Java JVM (Sumatra), Various Runtimes, Fabric Engine RT; HSAIL; HSA Helper Libraries, HSA Core Runtime, HSA Finalizer; Kernel Fusion Driver (KFD)]
  • 13. HIGH EFFICIENCY VIDEO CODEC – HEVC (H.265) VALUE PROPOSITION HEVC VISUAL QUALITY IS SIGNIFICANTLY BETTER THAN H.264 AT ANY GIVEN BIT RATE [Image comparison: H.265 @ 500 kbps vs. H.264 @ 500 kbps] 30% TO 50% MORE EFFICIENT THAN H.264 AT 1080P RESOLUTION 4K VIDEO BENEFITS ARE EVEN MORE SIGNIFICANT WITH HEVC [Product images: 4K Ultra HDTV, Sony XBR, $4999; 4K Video Cameras, GoPro, $399]
  • 14. HIGH EFFICIENCY VIDEO CODEC – HEVC (H.265) WHY HEVC WILL PROLIFERATE  The next generation MPEG video encoding standard  Significantly higher efficiency (up to 50% lower bit rates at given quality) than AVC (H.264)  Highly beneficial for HD video (1080p or below)  Especially beneficial for 4K video  Scales to 8K Ultra High Definition video (up to 8192×4320)  Computationally complex, but by design easier to parallelize than H.264 [Chart: mobile traffic in Exabytes Per Month (axis 0–12), 2012–2017, by Traffic Share: Mobile Video 66.5%, Mobile Web/Data 24.9%, Mobile M2M 5.1%, Mobile File Sharing 3.5%] CLOUD VIDEO PROVIDERS NEED THE HIGHER COMPRESSION FOR QUALITY OF SERVICE Source: Cisco VNI Mobile Forecast, 2013
  • 15. HEVC (H.265) ACCELERATION EFFICIENT CLOUD DEPLOYMENT ALL STAGES OF HEVC ARE ACCELERATED ON THE APU  Decrypt  Decode and decompress  Scaling and Enhancement  Encode and compress  Encrypt ENCODE IS THE HEAVIEST STAGE H.265 ENCODING IS 5 – 10X MORE COMPUTATIONALLY COMPLEX THAN H.264  Leverage point for compression  Highly parallel  Algorithms improve monthly  Must stay programmable  Picture can be divided into Macroblock regions with a much wider range of sizes and shapes  Intra prediction has 33 prediction directions compared to 8 for H.264
  • 16. OVERVIEW OF B+ TREES  B+ Trees are a special case of B Trees  A B+ Tree … ‒ is a dynamic, multi-level index ‒ is efficient for retrieval of data, stored in a block-oriented context  Fundamental data structure used in several popular database management systems ‒ SQLite ‒ CouchDB  Order (b) of a B+ Tree measures the capacity of its nodes [Diagram: example B+ Tree with internal-node keys 3, 2, 5, 4, 6, 7 above leaf keys 1–8, which point to data items d1–d8]
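The lookup that slide 16 describes can be sketched in a few lines of Java. This is a minimal illustration, not code from the talk: the class `BPlusTreeSketch`, its `Node` layout, and the hand-built `sample()` tree are all hypothetical, and the tree is read-only (no insertion or node splitting). Each internal node holds separator keys that route the search to a child; only the leaves hold data.

```java
final class BPlusTreeSketch {
    /** Internal nodes carry separator keys and children; leaves carry keys and values. */
    static final class Node {
        final int[] keys;
        final Node[] children;  // null in a leaf
        final String[] values;  // null in an internal node

        Node(int[] keys, Node[] children, String[] values) {
            this.keys = keys;
            this.children = children;
            this.values = values;
        }

        boolean isLeaf() { return children == null; }
    }

    static Node leaf(int k1, int k2) {
        return new Node(new int[] {k1, k2}, null, new String[] {"d" + k1, "d" + k2});
    }

    static Node internal(int separator, Node left, Node right) {
        return new Node(new int[] {separator}, new Node[] {left, right}, null);
    }

    /** Hand-built three-level tree over keys 1..8 (hypothetical layout, not the slide's). */
    static Node sample() {
        return internal(5,
                internal(3, leaf(1, 2), leaf(3, 4)),
                internal(7, leaf(5, 6), leaf(7, 8)));
    }

    /** Descend from the root to one leaf, then scan that leaf for the key. */
    static String search(Node root, int key) {
        Node node = root;
        while (!node.isLeaf()) {
            int i = 0;
            // Child i covers keys below keys[i]; the last child covers everything else.
            while (i < node.keys.length && key >= node.keys[i]) i++;
            node = node.children[i];
        }
        for (int i = 0; i < node.keys.length; i++) {
            if (node.keys[i] == key) return node.values[i];
        }
        return null; // key not present
    }

    public static void main(String[] args) {
        Node root = sample();
        System.out.println(search(root, 6)); // d6
        System.out.println(search(root, 9)); // null
    }
}
```

A higher order means more keys per node and therefore a shallower tree, which is the parameter varied on the results slide.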
  • 17. APPLICATIONS THAT USE B/B+ TREES primary data store on the client side; multi-data center key-value store; Mail, Safari, iPhone, iPod, iTunes; market-data framework; Firefox and Thunderbird; large hadron collider; Android, Chrome http://www.sqlite.org/famous.html http://wiki.apache.org/couchdb/CouchDB_in_the_wild
  • 18. HOW WE ACCELERATE  Utilize coarse-grained parallelism in B+ Tree searches ‒ Perform many queries in parallel ‒ Increase memory bandwidth utilization with parallel reads ‒ Increase throughput (transactions per second for OLTP)  B+ Tree searches on an HSA enabled APU ‒ Allows much larger B+ Trees to be searched than traditional GPU compute ‒ Eliminates data-copies since CPU and GPU cores can access the same memory
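The coarse-grained approach on slide 18 (one independent search per query, many queries in flight) can be approximated on CPU threads with a Java parallel stream. This is a sketch under stated assumptions: `java.util.TreeMap` (a red-black tree) stands in for the B+ Tree, the class and method names are hypothetical, and nothing here runs on a GPU or uses HSA. The index is only read during the batch, so concurrent lookups are safe.

```java
import java.util.Arrays;
import java.util.TreeMap;

final class ParallelLookupSketch {
    /** Build a read-only index; TreeMap is a stand-in for the B+ Tree. */
    static TreeMap<Integer, String> buildIndex(int keyCount) {
        TreeMap<Integer, String> index = new TreeMap<>();
        for (int k = 0; k < keyCount; k++) index.put(k, "d" + k);
        return index;
    }

    /** Run every query as an independent search; results keep the query order. */
    static String[] lookupAll(TreeMap<Integer, String> index, int[] queries) {
        return Arrays.stream(queries)
                .parallel()                   // spread queries over worker threads
                .mapToObj(k -> index.get(k))  // one read-only search per query
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> index = buildIndex(1000);
        String[] hits = lookupAll(index, new int[] {5, 999, 1000});
        System.out.println(Arrays.toString(hits)); // [d5, d999, null]
    }
}
```

Because each query touches only shared, unmodified data, no copies or locks are needed, which mirrors the slide's point about CPU and GPU cores accessing the same memory.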
  • 19. RESULTS 1M search queries in parallel  Input B+ Tree contains 112 million keys and uses 6GB of memory  Software: OpenCL on HSA  Hardware: AMD “Kaveri” APU with Quad Core CPU and 8 GCN Compute Units at 35W TDP [Chart: Speedup (axis 0–7) versus Order of B+ Tree (8, 16, 32, 64, 128)] Baseline: 4-core OpenMP + hand-tuned SSE CPU implementation Results measured in AMD Labs on “Kaveri” APU, 35W TDP, 16GB DRAM
  • 20. REVERSE TIME MIGRATION (RTM)  A technique for creating images based on sensor data to improve seismic interpretations done by geophysicists  A memory-intensive and highly parallel algorithm  RTM is run on massive data sets  A natural scale out algorithm  Often run today on 100K node CPU systems  Bringing this to HSA and APU based supercomputing will increase performance for current sensor arrays, and allow more sensors and accuracy in the future. [Images: Land crews; Marine crews] HOWEVER, SPEED OF PROCESSING AND INTERPRETATION IS A CRITICAL BOTTLENECK IN MAKING FULL USE OF ACQUISITION ASSETS
  • 21. TEXT ANALYTICS – HADOOP TERASORT AND BIG DATA SEARCH MINING BIG DATA  Multi-stage pipeline or parallel processing stages  Traditional GPU Compute is challenged by copies  APU with HSA accelerates each stage in place ‒ Sort ‒ Compression ‒ Regular expression parsing ‒ CRC generation  Acceleration of large data search scales out across the cluster of APU nodes [Diagram: Input HDFS (split 0, split 1, split 2) → map → sort → copy → merge → reduce → Output HDFS (part 0, part 1) → HDFS Replication]
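One of the stages slide 21 lists, CRC generation, is easy to show as a per-split parallel step. The sketch below is an assumption-laden illustration: it does not use Hadoop or HSA, the class `SplitChecksumSketch` and its `splitSize` parameter are hypothetical, and the work runs on CPU threads. It uses the JDK's `java.util.zip.CRC32` and computes each split's checksum directly on the shared input array, without copying the split out first.

```java
import java.util.stream.IntStream;
import java.util.zip.CRC32;

final class SplitChecksumSketch {
    /** Compute one CRC-32 per fixed-size split, reading the shared array in place. */
    static long[] crcPerSplit(byte[] data, int splitSize) {
        int splits = (data.length + splitSize - 1) / splitSize; // round up
        return IntStream.range(0, splits)
                .parallel()
                .mapToLong(i -> {
                    int from = i * splitSize;
                    int len = Math.min(splitSize, data.length - from);
                    CRC32 crc = new CRC32();      // one checksum object per split
                    crc.update(data, from, len);  // no copy of the split is made
                    return crc.getValue();
                })
                .toArray();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        // A single split over the standard CRC-32 check string.
        System.out.println(Long.toHexString(crcPerSplit(data, 9)[0])); // cbf43926
        // Three splits: "1234", "5678", "9".
        System.out.println(crcPerSplit(data, 4).length); // 3
    }
}
```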
  • 23. PROGRAMMING MODELS EMBRACING HSAIL AND HSA THE RIGHT LEVEL OF ABSTRACTION UNDER DEVELOPMENT  Java: Project Sumatra OpenJDK 9  OpenMP from SuSE  C++ AMP, based on CLANG/LLVM  Python and KL from Fabric Engine NEXT  DSLs: Halide, Julia, Rust  Fortran  JavaScript  Open Shading Language  R
  • 24. HSA ENABLES DEVELOPERS TO LEVERAGE HC … EASILY & NATURALLY PREFERRED PROGRAMMING LANGUAGES TRANSPARENT CALLS TO POPULAR LIBRARIES  Java, C++, OpenMP, Python *  OpenCV, SciPy, NumPy, ImageMagick, Bolt, …  SVM, Coherence, GPU Enqueue  OpenJDK/Sumatra, Fabric Engine  Arbitrary data structures, SVM, Coherence, User mode queueing  OpenCV API, Bolt STL library USING CONVENTIONAL METHODS  Arbitrary data structures, malloc, function pointers, callbacks, recursion, semaphores, atomics  SVM, Coherence, User-mode queueing, GPU Enqueue, HSAIL  Linked-list/tree traversal + other complex shared host data structures * Java 8, C++ AMP, OpenMP 4.0 next generation standards and extensions
  • 25. C++ AMP ACCELERATION GOES MULTI-PLATFORM  Herb Sutter Announced C++ AMP for the Windows® Platform at ADS 2011  We very much liked the single source model of development, and decided to extend it to be multi-platform  Today we are announcing C++ AMP is moving beyond Microsoft® Windows to embrace Linux. We will offer this acceleration on both our APUs and our discrete GPUs  We are also bringing Bolt STL Library support to C++ AMP [Diagram labels: C++ AMP; CLANG Front-end; LLVM-IR or SPIR 1.2; LLVM Compiler; HSAIL, Any HSA Implementation; SPIR 1.2, Any OpenCL™+SPIR Implementation] AVAILABLE IN OPEN SOURCE 1H-2014
  • 26. HSA ENABLEMENT OF JAVA JAVA 7 – OpenCL ENABLED APARAPI  AMD initiated Open Source project  APIs for data parallel algorithms ‒ GPU accelerate Java applications ‒ No need to learn OpenCL™  Active community captured mindshare ‒ ~20 contributors ‒ >7000 downloads ‒ ~150 visits per day [Stack: Java Application; APARAPI API; OpenCL™; OpenCL™ Compiler & Runtime; JVM; CPU ISA, CPU; GPU ISA, GPU] JAVA 8 – HSA ENABLED APARAPI  Java 8 brings Stream + Lambda API. ‒ More natural way of expressing data parallel algorithms ‒ Initially targeted at multi-core.  APARAPI will: ‒ Support Java 8 Lambdas ‒ Dispatch code to HSA enabled devices at runtime via HSAIL We will provide HSA Enabled Aparapi on Java 8 to bridge between Aparapi on Java 7 and HSA/Sumatra on Java 9 [Stack: Java Application; APARAPI + Lambda API; HSAIL; HSA Finalizer & Runtime; JVM; CPU ISA, CPU; GPU ISA, GPU] JAVA 9 – HSA ENABLED JAVA (SUMATRA)  Adds native GPU acceleration to Java Virtual Machine (JVM)  Developer uses JDK Lambda, Stream API  JVM uses GRAAL compiler to generate HSAIL  JVM decides at runtime to execute on either CPU or GPU depending on workload characteristics. [Stack: Java Application; Java JDK Stream + Lambda API; JVM with Java GRAAL JIT backend; HSAIL; HSA Finalizer & Runtime; CPU ISA, CPU; GPU ISA, GPU]
  • 27. JAVA DEMO WELCOME GARY FROST TO THE STAGE
  • 28. NBODY REVISITED
     NBody problem:
    ‒ Calculate the position of ‘N’ bodies in 3D space by computing the gravitational effect each has on all of the others and updating its position.
     A Java sequential NBody implementation would start with an Object for each Body.
        public class Body {
            // State of object
            private float x, y, z, m, vx, vy, vz;
            // Method to update position relative to other bodies
            void updatePosition(Body[] bodies) { /* code omitted */ }
        }
     Then we would iterate over all bodies updating the position of each
        for (Body b : bodies) { b.updatePosition(bodies); }
     A pre Java 8 Java ‘parallel’ version would not fit so nicely on this slide ;)
  • 29. JAVA 8’S ‘PROJECT LAMBDA’ SIMPLIFIES PARALLEL PROGRAMMING
     Offers an alternate syntax for processing arrays/collections of data
        for (Body b : bodies) b.updatePosition(bodies);
        Arrays.stream(bodies)                          // wrap array in a stream
              .forEach(b -> b.updatePosition(bodies));
     To process a stream in parallel we just tag the stream with the parallel() modifier
        Arrays.stream(bodies)                          // wrap an array in a stream
              .parallel()                              // tag the stream as parallel
              .forEach(b -> b.updatePosition(bodies));
     In Java 8 a parallel stream executes across all CPU cores.
     In Java 9 (Sumatra) a parallel stream executes across all CPU and GPU cores
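The slides omit the body of `updatePosition`, so the following is a self-contained sketch of how the parallel-stream version might be filled in, not the demo code from the talk. Assumptions are labeled in the comments: units with G = 1, a small softening term, and a two-phase step (`updateVelocity`, then `move`, both hypothetical names). The two phases are a deliberate change from the single `updatePosition` call: they keep one thread from moving a body while another thread is still reading its position.

```java
import java.util.Arrays;

final class NBodySketch {
    static final class Body {
        float x, y, z, m, vx, vy, vz;

        Body(float x, float y, float z, float m) {
            this.x = x; this.y = y; this.z = z; this.m = m;
        }

        /** Phase 1: read every position, write only this body's velocity. */
        void updateVelocity(Body[] bodies, float dt) {
            float ax = 0f, ay = 0f, az = 0f;
            for (Body o : bodies) {
                if (o == this) continue;
                float dx = o.x - x, dy = o.y - y, dz = o.z - z;
                // Assumption: G = 1; the softening term avoids division by zero.
                float d2 = dx * dx + dy * dy + dz * dz + 1e-6f;
                float inv = o.m / (d2 * (float) Math.sqrt(d2));
                ax += dx * inv; ay += dy * inv; az += dz * inv;
            }
            vx += ax * dt; vy += ay * dt; vz += az * dt;
        }

        /** Phase 2: write only this body's position. */
        void move(float dt) {
            x += vx * dt; y += vy * dt; z += vz * dt;
        }
    }

    /** One time step; each phase is a parallel stream as on the slide. */
    static void step(Body[] bodies, float dt) {
        Arrays.stream(bodies).parallel().forEach(b -> b.updateVelocity(bodies, dt));
        Arrays.stream(bodies).parallel().forEach(b -> b.move(dt));
    }

    public static void main(String[] args) {
        Body[] bodies = { new Body(-1f, 0f, 0f, 1f), new Body(1f, 0f, 0f, 1f) };
        step(bodies, 0.1f);
        // Two equal masses drift toward each other by the same amount.
        System.out.println(bodies[0].x + " " + bodies[1].x);
    }
}
```

In Java 8 both phases run on the common fork/join pool, that is, on CPU cores only.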
  • 30. JAVA DEMO
  • 31. JAVA AND THE CLOUD THE RIGHT LANGUAGE WITH ACCELERATION ON CLOUD APUS  Java 8 and Java 9 provide parallel acceleration  Parallel workloads are proliferating in the cloud  Hadoop framework for scale out  HSA APUs provide workload acceleration DON’T MISS THE KEYNOTE TOMORROW FROM ORACLE’S NANDINI RAMANI “THE ROLE OF JAVA™ IN HETEROGENEOUS COMPUTING, AND HOW YOU CAN HELP”
  • 33. ANNOUNCING AMD’S UNIFIED SDK  Access to AMD APU and GPU programmable components  Component installer - choose just what you need  Initial release includes: ‒ APP SDK v2.9 ‒ Media SDK 1.0 Beta APP SDK 2.9  Web-based sample browser  Supports programming standards: OpenCL™, C++ AMP  Code samples for accelerated open source libraries: ‒ OpenCV, OpenNI, Bolt, Aparapi  OpenCL™ source editing plug-in for Visual Studio  Now supports CMake MEDIA SDK 1.0 BETA  GPU accelerated video pre/post processing library  Leverage AMD's media encode/decode acceleration blocks  Library for low latency video encoding  Supports both Windows Store and Classic desktop
  • 34. ANNOUNCING AMD V1.3  AMD’s comprehensive heterogeneous developer tool suite including: ‒ CPU and GPU Profiling ‒ GPU kernel Debugging ‒ GPU kernel analysis  New features in version 1.3: ‒ Supports Java ‒ Integrated static kernel analysis ‒ Remote debugging/profiling ‒ Supports latest AMD APU and GPU products CPU PROFILER  Time-based profiling  Analyze call-chain relationships  Java profiling with inline function support  Cache-line utilization profiling  Supports latest AMD processors GPU PROFILER  OpenCL™ Application Trace  Profile OpenCL kernels  Timeline visualization of GPU counter data  Kernel Occupancy Viewer  Remote GPU Profiling GPU DEBUGGER  Real-time OpenCL kernel debugging with stepping and variable display  OpenCL and OpenGL API Statistics  Object visualization  Remote GPU debugging STATIC KERNEL ANALYZER  Compile, analyze and disassemble OpenCL Kernels  View kernel compilation errors/warnings  Estimate kernel performance  View generated ISA code  View registers
  • 35. OPEN SOURCE LIBRARIES ACCELERATED BY AMD OpenCV  Most popular computer vision library  Now with many OpenCL™ accelerated functions Bolt  C++ template library  Provides GPU off-load for common data-parallel algorithms  Now with cross-OS support and improved performance/functionality clMath  AMD released APPML as open source to create clMath  Accelerated BLAS and FFT libraries  Accessible from Fortran, C and C++ Aparapi  OpenCL™ accelerated Java 7  Java APIs for data parallel algorithms (no need to learn OpenCL™)
  • 36. AMD APUS, HSA – CLIENT TO THE CLOUD A CONVERGENCE AT THE RIGHT TIME  Parallel workloads are booming ‒ Acceleration where the data is ‒ On the client for a snappy user experience ‒ In the cloud for scalable services  HSA enabled APUs in the cloud ‒ Big data analytics ‒ Video processing ‒ Science, imaging, genomics ‒ Unleashing the Java development community  Acceleration at all tiers of the cloud ‒ Data centers, media hubs, cloud periphery
  • 37. A SPECIAL GUEST Gary Campbell, Infrastructure Technology Strategy CTO, HP
  • 38. DISCLAIMER & ATTRIBUTION The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. ATTRIBUTION © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. OpenCL is a trademark of Apple Inc. and Microsoft and Windows are trademarks of Microsoft Corp. Other names are for informational purposes only and may be trademarks of their respective owners.