DIRECT3D AND THE FUTURE OF
GRAPHICS APIS
Dave Oldcorn, AMD
Dan Baker, Oxide Games
Johan Andersson, EA / DICE
2 | AMD Direct3D Futures | March 20th, 2014
NITROUS AND DX12
Dan Baker
Partner, Oxide Games
HAVEN’T WE BEEN HERE BEFORE?
Goal of DX9
–Remember State blocks?
Goal of DX10
–Large state groups
Goal of DX11
–Deferred contexts
Are we actually getting faster, or are CPUs just faster?
–Quite possibly no perf improvements due to API features in 10 years
Maybe adding features isn’t the answer…
DEEPLY ROOTED PROBLEM
• Coding design philosophies clash with the real world
• OOP, data hiding, and polymorphic design clash with task-driven, data-parallel design
• Evident in language trends: a striking disconnect between what is considered good code and what is fast
• Gap has always been there, but has grown in recent years
– 15 years ago, processors were often bound by computation
– Now, usually bound by cache misses, serialization, pipeline stalls, etc.
– Multi-core CPUs are ineffectively utilized
• “Heavy Iron” (e.g. big objects, opaque memory) is a dead end for performance
• The revolt is beginning in high-performance graphics APIs, but will spread
BUT… HOW MUCH FASTER?
Biggest problem with industry today: Acceptance
Only 1 secret in API design: That it can be done.
–And isn’t that hard
–And our code isn’t that ugly
Star Swarm already demonstrating what is possible on a PC
D3D12 FEATURES THAT NITROUS USES
True de-coupled multi-core rendering
– Expecting near linear thread scheduling
Manual Hazard tracking
– Hazards have been resolved already
Memory Heaps
– Bigger chunks of memory pool grouping make management simpler
Descriptor Tables
– Table exposure allows a cheaper way of binding textures
– Allows texture bindings to be shared between non-adjacent batches
WHAT’S DIFFERENT NOW?
Then: Spec written → Spec reviewed → API implemented → Released to public → First engine use → Analysis done
WHAT’S DIFFERENT NOW?
Now (a loop): Create spec → Implement spec → Prototype on actual engines → Analyze → Discuss with IHVs, ISVs → repeat
Diagram labels: “Start here”; “If ready, exit here to prep for release”
IN THE SPIRIT OF CONTRIBUTING
Oxide proud to announce that we have a prototype of Nitrous running on D3D12
*PR DISCLAIMER* This is not an official announcement regarding D3D12 support
Porting from other modern APIs is much simpler than porting from D3D11 to D3D12
EXPECTED RESULTS
CPU driver overhead largely put to rest
Huge increases in driver reliability
Huge decreases in frame latency; expecting median frame latency to be 1.5 frames
–Increased perceptual responsiveness
Never a dropped frame or stall due to driver API issues
–*Other OS events could cause stalls
Driver should be far smaller and simpler to implement; IHVs can spend more time on optimizations
DIRECT3D12 AND THE FUTURE OF
GRAPHICS APIS
Dave Oldcorn, Direct3D12 Driver Architect, AMD
THE PROBLEM
• Mismatch between existing Direct3D and hardware capabilities
– Lots of CPU cores, but only one stream of data
– State communication in small chunks
– “Hidden” work
• Hard to predict from any one given call what the overhead might be
• Implicit memory management
– Hardware evolving away from classical register programming
API LANDSCAPE
• Gap between PC “raw” 3D APIs and the hardware has opened up
• Very high-level APIs now ubiquitous; easy to access even for casual developers, plenty of choice
• Where the PC APIs are is a middle ground
[Diagram: APIs arranged along an axis of capability, ease of use, and distance from the 3D engine, between “Application” and “Metal (register level access)”. Labels: Game Engines (Frostbite, Unity, Unreal, CryEngine, BlitzTech); Flash / Silverlight; D3D7/8; D3D9; OpenGL; D3D11; Console APIs; Opportunity.]
WHAT ARE THE CONSEQUENCES?
WHAT ARE THE SOLUTIONS?
SEQUENTIAL API
• Sequential API: state for a given draw comes from an arbitrary previous time
• Some states must be reconciled on the CPU (“delayed validation”)
– All contributing state needs to be visible
• GPU isn’t like this; it uses command buffers
– Must save and restore state at start and end
[Diagram: API input stream: ..., Draw, Set PS CB, Draw x 5, Set VS CB, Draw x 3, Set Blend, Set PS, Set RT state, Draw, Set VS VB, Draw, ... State contributing to a draw: (more, earlier), PS CB, VS CB, Blend state, PS, RT state.]
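The “state comes from an arbitrary previous time” point can be pictured with a tiny state machine. The sketch below is an illustration only: `Command`, `State`, and `resolve_draw_states` are hypothetical names, not Direct3D types. It shows that the state a draw sees depends on every set call that came before it.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical command stream entry: either a state set or a draw.
struct Command {
    bool is_draw;
    std::string slot;  // e.g. "PS CB", "Blend"; unused for draws
    int value;         // opaque state value; unused for draws
};

using State = std::map<std::string, int>;

// Replays a sequential stream and returns the full state snapshot seen by each draw.
std::vector<State> resolve_draw_states(const std::vector<Command>& stream) {
    State current;
    std::vector<State> per_draw;
    for (const Command& c : stream) {
        if (c.is_draw) {
            per_draw.push_back(current);  // all contributing state is gathered at draw time
        } else {
            current[c.slot] = c.value;    // may have been set long before the draw
        }
    }
    return per_draw;
}
```

Because each snapshot depends on the whole prefix of the stream, the replay cannot simply be cut in the middle and handed to two threads without first knowing the state at the cut.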
THREADING A SEQUENTIAL API
• Sequential API threading
– Simple producer / consumer model
• Extra latency
• Buffering has a cost
• More threading would mean dividing tasks on a finer grain
– Bottlenecked on application or driver thread
• Difficult to extract parallelism (Amdahl’s Law)
[Diagram: Application side: application simulation, prebuild threads 0 and 1, application render thread. Runtime / driver side: driver thread and GPU execution queue holding queued buffers 0, 1, 2, ...]
COMMAND BUFFER API
• GPUs only listen to command buffers
• Let the app build them
– Command Lists, at the API level
• Solves sequential API CPU issues
[Diagram: Application side: application simulation, threads 0 and 1 each building a command buffer. Runtime / driver side: GPU execution queue holding queued buffers 0, 1, ...]
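A rough sketch of the idea, using standard C++ threads rather than any real graphics API: each worker records into its own command list, and the application then submits the finished lists in an order it chooses. `CommandList`, `record_chunk`, and `build_and_submit` are hypothetical stand-ins.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an API command list: just the recorded commands.
using CommandList = std::vector<std::string>;

// Each worker records its own list; nothing is shared, so no locks are needed.
CommandList record_chunk(int first_draw, int draw_count) {
    CommandList list;
    for (int i = 0; i < draw_count; ++i) {
        list.push_back("draw " + std::to_string(first_draw + i));
    }
    return list;
}

// Builds `threads` command lists in parallel, then "submits" them in a fixed order.
std::vector<std::string> build_and_submit(int threads, int draws_per_thread) {
    std::vector<CommandList> lists(threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        // Each worker writes only to its own slot of `lists`.
        workers.emplace_back([&lists, t, draws_per_thread] {
            lists[t] = record_chunk(t * draws_per_thread, draws_per_thread);
        });
    }
    for (std::thread& w : workers) w.join();

    std::vector<std::string> gpu_queue;  // submission order is chosen by the app
    for (const CommandList& list : lists) {
        gpu_queue.insert(gpu_queue.end(), list.begin(), list.end());
    }
    return gpu_queue;
}
```

The recording step has no shared state, so the result is the same regardless of how the threads are scheduled; only the final submission is ordered.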
BETTER SCHEDULING
• App has much more control over scheduling work
– Both CPU side and GPU
• Threads don’t really share much resource
• Many more options for streaming assets
[Diagram: thread timelines. D3D11: CB building threads tend to interfere; GPU load still added but only after queuing. D3D12: CB building threads more independent. Labels: driver thread, create thread, build threads, render work, create work, GPU executes.]
PIPELINE OBJECTS
• Pipeline objects get rid of JIT and enable LTCG for GPUs
• Decouple interface and implementation
• We’re aware that this is a hairpin bend for many graphics engines to negotiate.
– Many engines don’t think in terms of predicting state up front
– The benefits are worth it
[Diagram: simplified dataflow through the pipeline, with stages Index Process, VS, Primitive Generation, Rasteriser, PS, Rendertarget Output, and three “?” marks.]
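“Predicting state up front” can be sketched as a cache that is filled at load time and only read at draw time. Everything here is hypothetical (`PipelineDesc`, `Pipeline`, `PipelineCache`); a real pipeline object would hold linked GPU code rather than an integer id.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical description of everything a pipeline object bakes in.
struct PipelineDesc {
    std::string vertex_shader;
    std::string pixel_shader;
    int blend_mode;
    bool operator<(const PipelineDesc& o) const {
        return std::tie(vertex_shader, pixel_shader, blend_mode) <
               std::tie(o.vertex_shader, o.pixel_shader, o.blend_mode);
    }
};

// Hypothetical compiled pipeline.
struct Pipeline { int id; };

class PipelineCache {
public:
    // Load time: the expensive compile/link step runs once per unique description.
    const Pipeline& create(const PipelineDesc& desc) {
        auto it = cache_.find(desc);
        if (it == cache_.end()) {
            ++compiles_;
            it = cache_.emplace(desc, Pipeline{compiles_}).first;
        }
        return it->second;
    }
    // Draw time: lookup only, never compiles. Returns nullptr if not prebuilt.
    const Pipeline* find(const PipelineDesc& desc) const {
        auto it = cache_.find(desc);
        return it == cache_.end() ? nullptr : &it->second;
    }
    int compiles() const { return compiles_; }

private:
    std::map<PipelineDesc, Pipeline> cache_;
    int compiles_ = 0;
};
```

A `nullptr` from `find` is the signal that the engine failed to predict a state combination, which is the adjustment the slide describes as a hairpin bend.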
RENDER OBJECT BINDING MISMATCH
• Hardware uses tables in video memory
• BUT still programmed like a register solution
– So one bind becomes:
• Allocate a new chunk of video memory
• Create a new copy of the entire table
• Update the one entry
• Write the register with the new table base address
[Diagram: on-chip root table (1 per stage) with SR and CB entries. Each entry is a pointer to a table in GPU memory (here, textures; constant buffers). Each SRD table entry is a pointer to (+ params of) a resource in GPU memory.]
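The four steps above can be simulated in a few lines to make the cost visible. This is a sketch with made-up types (`RegisterStyleBinder`, `BindStats`), where a vector of tables stands in for video memory; it is not how any particular driver is written.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical descriptor: stands in for a pointer to (plus params of) a resource.
using Descriptor = int;
using Table = std::vector<Descriptor>;

struct BindStats {
    std::size_t tables_allocated = 0;
    std::size_t descriptors_copied = 0;
};

// Simulated "video memory": old tables stay alive because earlier draws may still read them.
struct RegisterStyleBinder {
    std::vector<Table> memory;
    std::size_t table_base = 0;  // the "register": index of the live table
    BindStats stats;

    explicit RegisterStyleBinder(std::size_t slots) : memory(1, Table(slots, 0)) {}

    // One bind: allocate, copy the whole table, update one entry, repoint the register.
    void bind(std::size_t slot, Descriptor d) {
        Table copy = memory[table_base];  // allocate + copy the entire table
        stats.tables_allocated += 1;
        stats.descriptors_copied += copy.size();
        copy[slot] = d;                   // update the one entry
        memory.push_back(copy);
        table_base = memory.size() - 1;   // write the new table base address
    }
};
```

With a 16-entry table, two single-slot binds copy 32 descriptors to change two of them.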
DESCRIPTOR TABLES
• Several tables of each type of resource
– Easy to divide up by frequency
• Tables can be of arbitrary size; dynamically indexed to provide bindless textures
• Changing a table pointer is cheap
• Updating a descriptor in a table is not
[Diagram: on-chip table with entries SR.T[0] to SR.T[3], UAV, CB.T[0], CB.T[1], Samp. SR.T[0] is a pointer to an SRD table in GPU memory (textures table 0) holding SR.T[0][0] to SR.T[0][2]; CB.T[1] is a pointer to constbuf table 1 holding CB.T[1][0] and CB.T[1][1].]
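A minimal sketch of why changing a table pointer is cheap: tables are filled once, and the per-draw binding is only an index that selects one of them. The names `DescriptorHeap`, `BoundState`, and `fetch` are assumptions for illustration and do not correspond to real API calls.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Descriptor = int;  // hypothetical handle to a resource
using Table = std::vector<Descriptor>;

// Tables are filled once, grouped by how often their contents change.
struct DescriptorHeap {
    std::vector<Table> tables;
    std::size_t add_table(Table t) {
        tables.push_back(std::move(t));
        return tables.size() - 1;
    }
};

// The per-draw binding is just a table index (the "pointer to table").
struct BoundState {
    std::size_t texture_table = 0;
};

// Shader-side lookup: table pointer plus a dynamic index, as in bindless texturing.
Descriptor fetch(const DescriptorHeap& heap, const BoundState& s, std::size_t index) {
    return heap.tables[s.texture_table][index];
}
```

Switching materials between draws is then a single assignment to `texture_table`; no table contents are copied or rewritten.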
KEY INNOVATIONS
Innovation: CPU-side and GPU-side wins
Command buffers: Build on many threads; Control of scheduling; Lower latency; Simplified state tracking
Pipeline state objects: Link at create time; No JIT shader compiles; Efficient batched updates; Cheaper state updates; Enables LTCG
Bind objects in groups: Cheap to change group (CPU side); Cheap to change group (GPU side); Fits hardware paradigm
Move work to Create: Predictability (CPU side); Enables optimisations (GPU side)
KEY INNOVATIONS
Innovation: CPU-side and GPU-side wins
Explicit Synchronisation: Efficiency; Required for bindless textures; Less overhead
Explicit Memory Management: Efficiency; Predictability; Application flexibility; Zero copy; Control over placement
Do less: Predictability, Efficiency; Enables aggressive schedule
FEWER BUGS
NEW PROBLEMS
(AND TIPS TO SOLVE THEM)
NEW VISIBLE LIMITS
• More draws in does not automatically mean more triangles out
– You will not see full rendering rates with triangles averaging 1 pixel each.
– Wireframe mode should look different to filled rendering
NEW VISIBLE LIMITS
• Feeding the GPU much more efficiently means exploring interesting new limits that weren’t visible before
• 10k/frame of anything is ~1µs per thing.
• GPU pipeline depth is likely to be 1-10µs (1k-10k cycles).
• Specific limit: context registers
– Shader tables are NOT in the context
– Compute doesn’t bottleneck on context
APPLICATION IN CHARGE
• Application is arbiter of correct rendering
– This is a serious responsibility
– The benefits of D3D12 aren’t readily available without this condition
• Applications must be warning-free on the debug layer
• Different opportunities for driver intervention
APPLICATION IN CHARGE
• No driver thread in play
– App can target much lower latency
– BUT implies app has to be ready with new GPU work
[Diagram: timelines for App Render, Driver, and GPU across frames 1 to 3. Captions: “D3D11: No dead GPU time after 1st frame (but extra latency)”; “First work sent to driver”; “Driver buffers Present; no future dead time”; “No buffered present reveals dead time on GPU”; “Dead Time”.]
USE COMMAND BUFFERS SPARINGLY
• Each API command list maps to a single hardware command buffer
• Starting / ending a command list has an overhead
– Writes full 3D state, may flush caches or idle GPU
• We think a good rule of thumb will be to target around 100 command buffers/frame
– Use the multiple submission API where possible
[Diagram: multiple applications running on the system. Application 0 queue: CB0, CB1, CB2. Application 1 queue: CB0. GPU executes: CB0, CB1, CB2, CB0.]
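One simple way to stay near a target command list count is to split a frame's draws into a fixed number of near-equal batches. The sketch below assumes an even split is acceptable, which the slides do not claim; `plan_command_lists` is a hypothetical helper, and the figure of 100 is the slide's rule of thumb.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Splits `draw_count` draws into at most `target_lists` contiguous batches of
// near-equal size. Returns the number of draws in each batch.
std::vector<std::size_t> plan_command_lists(std::size_t draw_count, std::size_t target_lists) {
    std::vector<std::size_t> batches;
    if (draw_count == 0 || target_lists == 0) return batches;
    const std::size_t lists = std::min(draw_count, target_lists);
    const std::size_t base = draw_count / lists;
    const std::size_t extra = draw_count % lists;  // the first `extra` batches get one more draw
    for (std::size_t i = 0; i < lists; ++i) {
        batches.push_back(base + (i < extra ? 1 : 0));
    }
    return batches;
}
```

For 10,000 draws and a target of 100, each command list carries 100 draws, so the fixed start/end cost of a list is spread over many draws.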
ROUND-UP
ALL-NEW
• There’s a learning curve here for all of us
• In the main it’s a shallow one
– Compared at least to the general problem of multithreaded rendering
• Multithread is always hard.
– Simpler design means fewer bugs and more predictable performance
WHAT AMD PLAN TO DELIVER
• An early preview driver “soon”
• Release driver for Direct3D12 launch
• Continuous engagement
– With Microsoft
– With ISVs
• Bring your opinions to us and to Microsoft.
DX12 AND FROSTBITE
Johan Andersson
Technical Director
DX12 AND FROSTBITE
• PC is very important for EA and we’ve been pushing hard to improve graphics capabilities on Windows
• Excited to be working with Microsoft and the IHVs on Direct3D again!
• Good & very healthy collaboration between Microsoft, the IHVs and us game/engine developers
• DX12 is a really big step forward from DX11 or GL4
DX12 FEATURES AND FROSTBITE
• Key DX12 features that are a great fit for Frostbite:
– Efficient parallel command buffers
– Descriptor tables
– Pipeline objects
– Explicit resource synchronization
– Explicit memory management
• DX12 is still in development, so we are actively working with Microsoft & the IHVs to help make sure all of it fits together and is efficient
DX12 PLATFORMS
• DX12 support on Windows 7 & most existing PC hardware is critical for us
– Huge user base still on Windows 7
– Gamers would see major benefits without upgrading
• DX12 support on Xbox One is critical for us
– Will lead to improved performance & quality for future Xbox One titles
– Almost all of our games are cross-platform Gen4/PC
– Easier development – renderer is shared between Windows & Xbox One
• Looking forward to DX12 on mobile/tablets
– Power efficiency & low overhead are really key
– Need larger user base to target on Windows for mobile
DX12 AND FROSTBITE
• We are building a DX12 renderer for Frostbite!
– Will work on GPUs from all vendors – benefits a wide set of gamers
• Expected benefits over DX11:
– More stable and consistent performance
– Higher overall performance
– Move our design target – richer & more detailed game worlds
– Thinner drivers – easier to work with / less of a black box
– More control for us developers – new techniques & optimizations
• Really happy that the full Windows & Xbox ecosystems are moving to a low-level graphics API!
QUESTIONS

Weitere ähnliche Inhalte

Was ist angesagt?

Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...AMD Developer Central
 
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed HinkelCE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed HinkelAMD Developer Central
 
GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...
GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...
GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...AMD Developer Central
 
IS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe ClavelIS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe ClavelAMD Developer Central
 
Final lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tbFinal lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tbr Skip
 
HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...
HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...
HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...AMD Developer Central
 
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben SanderPT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben SanderAMD Developer Central
 
MM-4099, Adapting game content to the viewing environment, by Noman Hashim
MM-4099, Adapting game content to the viewing environment, by Noman HashimMM-4099, Adapting game content to the viewing environment, by Noman Hashim
MM-4099, Adapting game content to the viewing environment, by Noman HashimAMD Developer Central
 
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorGS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorAMD Developer Central
 
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...CC-4005, Performance analysis of 3D Finite Difference computational stencils ...
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...AMD Developer Central
 
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...AMD Developer Central
 
PL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil HenningPL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil HenningAMD Developer Central
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellAMD Developer Central
 
HSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterHSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterAMD Developer Central
 
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...AMD Developer Central
 
CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...
CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...
CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...AMD Developer Central
 
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...AMD Developer Central
 
MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...
MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...
MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...AMD Developer Central
 
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...AMD Developer Central
 

Was ist angesagt? (20)

Media SDK Webinar 2014
Media SDK Webinar 2014Media SDK Webinar 2014
Media SDK Webinar 2014
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
 
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed HinkelCE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
 
GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...
GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...
GS-4136, Optimizing Game Development using AMD’s GPU PerfStudio 2, by Gordon ...
 
IS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe ClavelIS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
 
Final lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tbFinal lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tb
 
HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...
HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...
HC-4019, "Exploiting Coarse-grained Parallelism in B+ Tree Searches on an APU...
 
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben SanderPT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
 
MM-4099, Adapting game content to the viewing environment, by Noman Hashim
MM-4099, Adapting game content to the viewing environment, by Noman HashimMM-4099, Adapting game content to the viewing environment, by Noman Hashim
MM-4099, Adapting game content to the viewing environment, by Noman Hashim
 
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael MantorGS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
GS-4152, AMD’s Radeon R9-290X, One Big dGPU, by Michael Mantor
 
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...CC-4005, Performance analysis of 3D Finite Difference computational stencils ...
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...
 
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
 
PL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil HenningPL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
 
HSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterHSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben Gaster
 
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...
WT-4072, Rendering Web Content at 60fps, by Vangelis Kokkevis, Antoine Labour...
 
CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...
CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...
CE-4028, Miracast with AMD Wireless Display technology – Kickass gaming and o...
 
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
 
MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...
MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...
MM-4104, Smart Sharpen using OpenCL in Adobe Photoshop CC – Challenges and Ac...
 
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
 

Ähnlich wie Direct3D and the Future of Graphics APIs - AMD at GDC14

GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahAMD Developer Central
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio Owen Wu
 
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...AMD Developer Central
 
Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14AMD Developer Central
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Rob Gillen
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14AMD Developer Central
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing PerformanceBrendan Gregg
 
PT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon Selley
PT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon SelleyPT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon Selley
PT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon SelleyAMD Developer Central
 
Optimizing Direct X On Multi Core Architectures
Optimizing Direct X On Multi Core ArchitecturesOptimizing Direct X On Multi Core Architectures
Optimizing Direct X On Multi Core Architecturespsteinb
 
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Stefano Di Carlo
 
Create Amazing VFX with the Visual Effect Graph
Create Amazing VFX with the Visual Effect GraphCreate Amazing VFX with the Visual Effect Graph
Create Amazing VFX with the Visual Effect GraphUnity Technologies
 
PT-4055, Optimizing Raytracing on GCN with AMD Development Tools, by Tzachi C...
PT-4055, Optimizing Raytracing on GCN with AMD Development Tools, by Tzachi C...PT-4055, Optimizing Raytracing on GCN with AMD Development Tools, by Tzachi C...
PT-4055, Optimizing Raytracing on GCN with AMD Development Tools, by Tzachi C...AMD Developer Central
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Brendan Gregg
 
Sig13 ce future_gfx
Sig13 ce future_gfxSig13 ce future_gfx
Sig13 ce future_gfxCass Everitt
 
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019Unity Technologies
 
A beginner’s guide to programming GPUs with CUDA
A beginner’s guide to programming GPUs with CUDAA beginner’s guide to programming GPUs with CUDA
A beginner’s guide to programming GPUs with CUDAPiyush Mittal
 
Amd accelerated computing -ufrj
Amd   accelerated computing -ufrjAmd   accelerated computing -ufrj
Amd accelerated computing -ufrjRoberto Brandao
 

Ähnlich wie Direct3D and the Future of Graphics APIs - AMD at GDC14 (20)

GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
 
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
 
APU in nepal 2
APU in nepal 2APU in nepal 2
APU in nepal 2
 
Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14Mantle - Introducing a new API for Graphics - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14

  • 1. DIRECT3D AND THE FUTURE OF GRAPHICS APIS Dave Oldcorn, AMD Dan Baker, Oxide Games Johan Andersson, EA / DICE
  • 2. 2 | AMD Direct3D Futures | March 20th, 2014 NITROUS AND DX12 Dan Baker Partner, Oxide Games
  • 3. HAVEN’T WE BEEN HERE BEFORE?
      • Goal of DX9
          – Remember state blocks?
      • Goal of DX10
          – Large state groups
      • Goal of DX11
          – Deferred contexts
      • Are we actually getting faster, or are CPUs just faster?
          – Quite possibly no performance improvements due to API features in 10 years
      • Maybe adding features isn’t the answer…
  • 4. DEEPLY ROOTED PROBLEM
      • Coding design philosophies clash with the real world
      • OOP, data hiding, and polymorphic design clash with task-driven, data-parallel design
      • Evident in language trends: a striking disconnect between what is considered good code and what is fast
      • The gap has always been there, but has grown in recent years
          – 15 years ago, processors were often bound by computation
          – Now, they are usually bound by cache misses, serialization, pipeline stalls, etc.
          – Multi-core CPUs are ineffectively utilized
      • ‘Heavy Iron’ (e.g. big objects, opaque memory) is a dead end for performance
      • The revolt is beginning in high-performance graphics APIs, but will spread
  • 5. BUT… HOW MUCH FASTER?
      • Biggest problem with the industry today: acceptance
      • Only one secret in API design: that it can be done
          – And it isn’t that hard
          – And our code isn’t that ugly
      • Star Swarm is already demonstrating what is possible on a PC
  • 6. D3D12 FEATURES THAT NITROUS USES
      • True de-coupled multi-core rendering
          – Expecting near-linear thread scheduling
      • Manual hazard tracking
          – Hazards have been resolved already
      • Memory heaps
          – Bigger chunks of memory pool grouping make management simpler
      • Descriptor tables
          – Table exposure allows a cheaper way of binding textures
          – Allows texture bindings to be shared between non-adjacent batches
  • 7. WHAT’S DIFFERENT NOW?
      • Then (stages as listed on the slide): spec written, spec reviewed, API implemented, released to public, first engine use, analysis done
  • 8. WHAT’S DIFFERENT NOW?
      • Now (a cycle of stages): create spec, implement spec, prototype on actual engines, analyze, discuss with IHVs and ISVs
      • Diagram labels: “Start here”; “If ready, exit here to prep for release”
  • 9. IN THE SPIRIT OF CONTRIBUTING
      • Oxide is proud to announce that we have a prototype of Nitrous running on D3D12
      • *PR DISCLAIMER* This is not an official announcement regarding D3D12 support
      • Porting from other modern APIs is much simpler than porting from D3D11 to D3D12
  • 10. EXPECTED RESULTS
      • CPU driver overhead largely put to rest
      • Huge increases in driver reliability
      • Huge decreases in frame latency; expecting median frame latency to be 1.5 frames
          – Increased perceptual responsiveness
      • Never a dropped frame or stall due to driver API issues
          – *Other OS events could cause stalls
      • Drivers should be far smaller and simpler to implement, so IHVs can spend more time on optimizations
  • 11. DIRECT3D12 AND THE FUTURE OF GRAPHICS APIS Dave Oldcorn, Direct3D12 Driver Architect, AMD
  • 12. THE PROBLEM
  • 13. THE PROBLEM
      • Mismatch between existing Direct3D and hardware capabilities
          – Lots of CPU cores, but only one stream of data
          – State communication in small chunks
          – “Hidden” work
      • Hard to predict from any one given call what the overhead might be
      • Implicit memory management
          – Hardware evolving away from classical register programming
  • 14. API LANDSCAPE
      • Gap between PC ‘raw’ 3D APIs and the hardware has opened up
      • Very high-level APIs now ubiquitous; easy to access even for casual developers, plenty of choice
      • Where the PC APIs are is a middle ground
      • [Figure: API layers along an axis labelled “Capability, ease of use, distance from 3D engine”. Labels: Application; Game Engines (Frostbite, Unity, Unreal, CryEngine, BlitzTech); Flash / Silverlight; D3D7/8; D3D9; D3D11; OpenGL; Console APIs; Opportunity; Metal (register-level access)]
  • 15. WHAT ARE THE CONSEQUENCES? WHAT ARE THE SOLUTIONS?
  • 16. SEQUENTIAL API
      • Sequential API: state for a given draw comes from an arbitrary previous time
      • Some states must be reconciled on the CPU (“delayed validation”)
          – All contributing state needs to be visible
      • GPU isn’t like this; it uses command buffers
          – Must save and restore state at start and end
      • [Figure: “API input” stream: ..., Draw, Set PS CB, Draw x 5, Set VS CB, Draw x 3, Set Blend, Set PS, Set RT state, Draw, Set VS VB, Draw, ... “State contributing to draw”: (more, earlier), PS CB, VS CB, Blend state, PS, RT state]
  • 17. THREADING A SEQUENTIAL API
      • Sequential API threading
          – Simple producer / consumer model
      • Extra latency
      • Buffering has a cost
      • More threading would mean dividing tasks on a finer grain
          – Bottlenecked on application or driver thread
      • Difficult to extract parallelism (Amdahl’s Law)
      • [Figure labels: Application simulation; Prebuild Thread 0; Prebuild Thread 1; Application Render Thread; Driver Thread; Application; Runtime / Driver; GPU Execution Queue; Queued Buffer 0, Queued Buffer 1, Queued Buffer 2, ...]
  • 18. COMMAND BUFFER API
      • GPUs only listen to command buffers
      • Let the app build them
          – Command Lists, at the API level
      • Solves sequential API CPU issues
      • [Figure labels: Application simulation; Thread 0; Thread 1; Build Cmd Buffer (twice); Application; Runtime / Driver; GPU Execution Queue; Queued Buffer 0, Queued Buffer 1, ...]
  • 19. BETTER SCHEDULING
      • App has much more control over scheduling work
          – Both CPU side and GPU
      • Threads don’t really share much resource
      • Many more options for streaming assets
      • [Figure: “D3D11: CB building threads tend to interfere”; “GPU load still added but only after queuing”; “D3D12: CB building threads more independent”. Labels: Driver thread; Create thread; Build threads; Render work; Create work; GPU executes]
  • 20. PIPELINE OBJECTS
      • Pipeline objects get rid of JIT and enable LTCG for GPUs
      • Decouple interface and implementation
      • We’re aware that this is a hairpin bend for many graphics engines to negotiate
          – Many engines don’t think in terms of predicting state up front
          – The benefits are worth it
      • [Figure: “Simplified dataflow through pipeline”. Labels: VS; PS; Index Process; Primitive Generation; Rasteriser; Rendertarget Output; three “?” markers]
  • 21. RENDER OBJECT BINDING MISMATCH
      • Hardware uses tables in video memory
      • BUT still programmed like a register solution
          – So one bind becomes:
              • Allocate a new chunk of video memory
              • Create a new copy of the entire table
              • Update the one entry
              • Write the register with the new table base address
      • [Figure labels: On-chip root table (1 per stage), with SR and CB entries; Pointer to table (here, textures); Pointer to table (constant buffers); GPU Memory SRD table; Pointer to (+ params of) resource; GPU Memory resource]
  • 22. DESCRIPTOR TABLES
      • Several tables of each type of resource
          – Easy to divide up by frequency
      • Tables can be of arbitrary size; dynamically indexed to provide bindless textures
      • Changing a table pointer is cheap
      • Updating a descriptor in a table is not
      • [Figure labels: On-chip table with entries SR.T[0], SR.T[1], SR.T[2], SR.T[3], UAV, CB.T[0], CB.T[1], Samp; Pointer to table (textures table 0); Pointer to table (constbuf table 1); GPU Memory SRD table with entries SR.T[0][0], SR.T[0][1], SR.T[0][2], CB.T[1][0], CB.T[1][1]]
  • 23. KEY INNOVATIONS (table columns: Innovation, CPU-side win, GPU-side win)
      • Command buffers: build on many threads; control of scheduling; lower latency; simplified state tracking
      • Pipeline state objects: link at create time; no JIT shader compiles; efficient batched updates; cheaper state updates; enables LTCG
      • Bind objects in groups: cheap to change group (listed for both CPU and GPU); fits hardware paradigm
      • Move work to Create: predictability; enables optimisations
  • 24. KEY INNOVATIONS (table columns: Innovation, CPU-side win, GPU-side win)
      • Explicit synchronisation: efficiency; required for bindless textures; less overhead
      • Explicit memory management: efficiency; predictability; application flexibility; zero copy; control over placement
      • Do less: predictability, efficiency; enables aggressive schedule
      • FEWER BUGS
  • 25. NEW PROBLEMS (AND TIPS TO SOLVE THEM)
  • 26. NEW VISIBLE LIMITS
      • More draws in does not automatically mean more triangles out
          – You will not see full rendering rates with triangles averaging 1 pixel each
          – Wireframe mode should look different to filled rendering
  • 27. NEW VISIBLE LIMITS
      • Feeding the GPU much more efficiently means exploring interesting new limits that weren’t visible before
      • 10k/frame of anything is ~1µs per thing
      • GPU pipeline depth is likely to be 1-10µs (1k-10k cycles)
      • Specific limit: context registers
          – Shader tables are NOT in the context
          – Compute doesn’t bottleneck on context
  • 28. APPLICATION IN CHARGE
      • Application is the arbiter of correct rendering
          – This is a serious responsibility
          – The benefits of D3D12 aren’t readily available without this condition
      • Applications must be warning-free on the debug layer
      • Different opportunities for driver intervention
  • 29. APPLICATION IN CHARGE
      • No driver thread in play
          – App can target much lower latency
          – BUT implies app has to be ready with new GPU work
      • [Figure: timeline with rows App Render, Driver, and GPU across Frames 1 to 3 (F1, F2, F3). Annotations: “D3D11: No dead GPU time after 1st frame (but extra latency)”; “Dead Time”; “First work sent to driver”; “Driver buffers Present; no future dead time”; “No buffered present reveals dead time on GPU”]
  • 30. USE COMMAND BUFFERS SPARINGLY
      • Each API command list maps to a single hardware command buffer
      • Starting / ending a command list has an overhead
          – Writes full 3D state, may flush caches or idle GPU
      • We think a good rule of thumb will be to target around 100 command buffers/frame
          – Use the multiple submission API where possible
      • [Figure: “Multiple applications running on system”. Labels: Application 0 queue; Application 1 queue; GPU executes; command buffers CB0, CB1, CB2]
  • 31. ROUND-UP
  • 32. ALL-NEW
      • There’s a learning curve here for all of us
      • In the main it’s a shallow one
          – Compared at least to the general problem of multithreaded rendering
      • Multithreading is always hard
          – Simpler design means fewer bugs and more predictable performance
  • 33. WHAT AMD PLAN TO DELIVER
      • An early preview driver “soon”
      • Release driver for Direct3D12 launch
      • Continuous engagement
          – With Microsoft
          – With ISVs
      • Bring your opinions to us and to Microsoft
  • 34. DX12 AND FROSTBITE Johan Andersson, Technical Director
  • 35. DX12 AND FROSTBITE
      • PC is very important for EA and we’ve been pushing hard to improve graphics capabilities on Windows
      • Excited to be working with Microsoft and the IHVs on Direct3D again!
      • Good & very healthy collaboration between Microsoft, the IHVs and us game/engine developers
      • DX12 is a really big step forward from DX11 or GL4
  • 36. DX12 FEATURES AND FROSTBITE
      • Key DX12 features that are a great fit for Frostbite:
          – Efficient parallel command buffers
          – Descriptor tables
          – Pipeline objects
          – Explicit resource synchronization
          – Explicit memory management
      • DX12 is still in development, so we are actively working with Microsoft & the IHVs to help make sure all of it fits together and is efficient
  • 37. DX12 PLATFORMS
      • DX12 support on Windows 7 & most existing PC hardware is critical for us
          – Huge user base still on Windows 7
          – Gamers would see major benefits without upgrading
      • DX12 support on Xbox One is critical for us
          – Will lead to improved performance & quality for future Xbox One titles
          – Almost all of our games are cross-platform Gen4/PC
          – Easier development: the renderer is shared between Windows & Xbox One
      • Looking forward to DX12 on mobile/tablets
          – Power efficiency & low overhead are really key
          – Need a larger user base to target on Windows for mobile
  • 38. DX12 AND FROSTBITE
      • We are building a DX12 renderer for Frostbite!
          – Will work on GPUs from all vendors, which benefits a wide set of gamers
      • Expected benefits over DX11:
          – More stable and consistent performance
          – Higher overall performance
          – Move our design target: richer & more detailed game worlds
          – Thinner drivers: easier to work with, less of a black box
          – More control for us developers: new techniques & optimizations
      • Really happy that the full Windows & Xbox ecosystems are moving to low-level graphics APIs!
  • 39. QUESTIONS
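
The command buffer model from slides 18 and 30 (the application records command lists on several threads, then hands them to the GPU queue in one submission) can be sketched roughly as below. This is a toy model in Python, not Direct3D 12 code: `CommandList`, `record_chunk`, `build_frame`, and `submit` are hypothetical names invented for illustration, and the "GPU queue" is just a list. It only aims to show that each thread owns its list outright, that each list sets its own state rather than inheriting it from earlier calls, and that the number of lists per frame is a knob the application chooses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class CommandList:
    """Hypothetical stand-in for an API-level command list (not a real D3D12 type)."""
    name: str
    commands: list = field(default_factory=list)

    def set_pipeline(self, pipeline_id):
        self.commands.append(("set_pipeline", pipeline_id))

    def draw(self, mesh_id):
        self.commands.append(("draw", mesh_id))


def record_chunk(chunk_index, draws):
    """Record one chunk of (pipeline_id, mesh_id) draws into its own command list.

    The list is private to the calling worker, so recording needs no locks.
    State starts unset in every list: nothing is inherited from earlier lists.
    """
    cl = CommandList(name=f"cl{chunk_index}")
    current_pipeline = None
    for pipeline_id, mesh_id in draws:
        if pipeline_id != current_pipeline:
            cl.set_pipeline(pipeline_id)
            current_pipeline = pipeline_id
        cl.draw(mesh_id)
    return cl


def build_frame(draws, num_lists):
    """Split a frame's draws into contiguous chunks and record them in parallel."""
    chunk = max(1, -(-len(draws) // num_lists))  # ceiling division, at least 1
    chunks = [draws[i:i + chunk] for i in range(0, len(draws), chunk)]
    with ThreadPoolExecutor(max_workers=num_lists) as pool:
        # map() yields results in input order, so submission order stays deterministic
        return list(pool.map(record_chunk, range(len(chunks)), chunks))


def submit(queue, command_lists):
    """Hypothetical single submission call that takes many lists at once."""
    queue.extend(command_lists)


if __name__ == "__main__":
    frame = [(i % 3, i) for i in range(10)]  # (pipeline_id, mesh_id) pairs
    gpu_queue = []
    submit(gpu_queue, build_frame(frame, num_lists=4))
    for cl in gpu_queue:
        print(cl.name, cl.commands)
```

In this sketch `num_lists` plays the role of the "around 100 command buffers/frame" rule of thumb from slide 30: more lists give more recording parallelism, but every list repeats its state setup at the start, which stands in for the per-list overhead the slide warns about.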