DIRECTGMA ON AMD’S FIREPRO™ GPUS
BRUNO STEFANIZZI 
SEP 2014
2 | DIRECTGMA ON AMD’S FIREPRO™ GPUS | SEPTEMBER 8, 2014 | 
INTRODUCTION TO DIRECTGMA
• Applications that need low-latency data exchange between the GPU and other
devices have long wanted GPU memory exposed directly to those devices. AMD
introduced DirectGMA (Direct Graphics Memory Access) for this purpose. DirectGMA:
‒ Makes a portion of the GPU memory accessible to other devices
‒ Allows devices on the bus to write directly into this area of GPU memory
‒ Allows GPUs to write directly into the memory of remote devices on the bus
that support DirectGMA
‒ Provides a driver interface that lets third-party hardware vendors support data
exchange with an AMD GPU using DirectGMA
‒ Supported APIs: OpenGL, OpenCL™, DirectX®
‒ Supported operating systems: Windows® 7 64-bit and Linux® 64-bit
‒ Supported cards: AMD FirePro™ W5x00 and above, as well as all AMD FirePro™
S series
INTRODUCTION TO DIRECTGMA
• Peer-to-Peer Transfers between GPUs
Use high-speed DMA transfers to copy data between the memories of two
GPUs on the same system/PCIe bus.
• Peer-to-Peer Transfers between GPUs and FPGAs
Use high-speed DMA transfers to copy data between GPU memory and FPGA
memory.
• DirectGMA for Video
Optimized pipeline for frame-based devices such as frame grabbers, video
switchers, HD-SDI capture, and CameraLink devices. See our SDI webpage.
AMD’S DIRECTGMA P2P 
• Direct communication between PCI cards
• Bidirectional DirectGMA P2P requires memory on both cards
[Figure: block diagram; surviving labels: CPU, PCI Bus]
DIRECTGMA IN OPENGL
• The OpenGL extension AMD_BUS_ADDRESSABLE_MEMORY provides access to
DirectGMA
• The functions are:
void glMakeBuffersResidentAMD(sizei n, uint* buffers, uint64* baddr, uint64* maddr);
void glBufferBusAddressAMD(enum target, sizeiptr size, uint64 surfbusaddress, uint64 markerbusaddress);
void glWaitMarkerAMD(uint buf, uint value);
void glWriteMarkerAMD(uint buf, uint value, uint64 offset);
• The new tokens are:
GL_BUS_ADDRESSABLE_MEMORY_AMD
GL_EXTERNAL_PHYSICAL_MEMORY_AMD
DIRECTGMA IN OPENGL | CREATING A BUFFER TO RECEIVE DATA
• To receive data, create a buffer that other devices on the bus can access
• The remote device needs to know the physical address of this buffer in order
to write to it
glGenBuffers(m_uiNumBuffers, m_pBuffer);
m_pBufferBusAddress = new unsigned long long[m_uiNumBuffers];
m_pMarkerBusAddress = new unsigned long long[m_uiNumBuffers];
for (unsigned int i = 0; i < m_uiNumBuffers; i++)
{
    glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_pBuffer[i]);
    glBufferData(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_uiBufferSize, 0, GL_DYNAMIC_DRAW);
}
// Call makeResident when all BufferData calls were submitted.
glMakeBuffersResidentAMD(m_uiNumBuffers, m_pBuffer, m_pBufferBusAddress, m_pMarkerBusAddress);
// Make sure that the buffer creation really succeeded
if (glGetError() != GL_NO_ERROR)
    return false;
glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, 0);
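The snippet above keeps the buffer and marker bus addresses in two arrays allocated with `new[]`. As a sketch of one alternative, the arrays could live in `std::vector`s that clean up automatically and still provide the raw pointers that `glMakeBuffersResidentAMD` expects. `BusAddressTable` and `BusAddressPair` are hypothetical names, and treating a zero address as "not filled in by the driver" is an assumption of this sketch, not documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// What a remote device needs per buffer: where to write the data and where
// to write the marker that signals completion.
struct BusAddressPair {
    std::uint64_t surface;  // bus address of the buffer's data store
    std::uint64_t marker;   // bus address of the buffer's marker
};

// Hypothetical owner of the two address arrays.
class BusAddressTable {
public:
    explicit BusAddressTable(std::size_t numBuffers)
        : surface_(numBuffers, 0), marker_(numBuffers, 0) {}

    // Raw pointers that could be passed as the baddr / maddr arguments.
    std::uint64_t* surfaceData() { return surface_.data(); }
    std::uint64_t* markerData()  { return marker_.data(); }

    std::size_t size() const { return surface_.size(); }

    // Assumption of this sketch: a zero entry means the address was never set.
    bool allFilled() const {
        for (std::size_t i = 0; i < size(); ++i) {
            if (surface_[i] == 0 || marker_[i] == 0) return false;
        }
        return true;
    }

    // Addresses for buffer i, for example to hand to a frame grabber driver.
    BusAddressPair at(std::size_t i) const {
        assert(i < size());
        return BusAddressPair{surface_[i], marker_[i]};
    }

private:
    std::vector<std::uint64_t> surface_;
    std::vector<std::uint64_t> marker_;
};
```

With such a table, the call would read `glMakeBuffersResidentAMD(n, buffers, table.surfaceData(), table.markerData())`, provided the GL 64-bit type and `std::uint64_t` match on the target platform.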
DIRECTGMA IN OPENGL | USING A BUFFER ON A REMOTE DEVICE
• To write into a buffer on a remote device, create an OpenGL buffer and assign
it the physical addresses of the memory on the remote device
glGenBuffers(m_uiNumBuffers, m_pBuffer);
for (unsigned int i = 0; i < m_uiNumBuffers; i++)
{
    glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_pBuffer[i]);
    glBufferBusAddressAMD(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_uiBufferSize, m_pBufferBusAddress[i], m_pMarkerBusAddress[i]);
    if (glGetError() != GL_NO_ERROR)
        return false;
}
glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, 0);
DIRECTGMA IN OPENGL | GPU TO GPU COPY
• Create one thread per GPU. Each thread creates its own context. One thread
acts as the data sink, the other as the source.
• On the sink GPU, a GL_BUS_ADDRESSABLE_MEMORY_AMD buffer is created
• On the source GPU, a GL_EXTERNAL_PHYSICAL_MEMORY_AMD buffer is created
GPU 0: Sink
glGenBuffers(m_uiNumBuffers, m_pSinkBuffer);
for (unsigned int i = 0; i < m_uiNumBuffers; i++)
{
    glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_pSinkBuffer[i]);
    glBufferData(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_uiBufferSize, 0, GL_DYNAMIC_DRAW);
}
// Call makeResident when all BufferData calls were submitted.
glMakeBuffersResidentAMD(m_uiNumBuffers, m_pSinkBuffer, m_pBufferBusAddress,
                         m_pMarkerBusAddress);
GPU 1: Source
glGenBuffers(m_uiNumBuffers, m_pSourceBuffer);
for (unsigned int i = 0; i < m_uiNumBuffers; i++)
{
    glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_pSourceBuffer[i]);
    glBufferBusAddressAMD(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_uiBufferSize,
                          m_pBufferBusAddress[i], m_pMarkerBusAddress[i]);
}
DIRECTGMA IN OPENGL | GPU TO GPU COPY
• The source creates data and copies it into the
GL_EXTERNAL_PHYSICAL_MEMORY_AMD buffer that has its data store on the sink
device
• The sink device receives the data and copies it into a texture to be displayed
GPU 0: Sink
// Submit draw calls that do not require data sent by the source
…
glBindTexture(GL_TEXTURE_2D, m_uiTexture);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, uiBufferId);
// Indicate that the following commands will need the data transferred by the source
glWaitMarkerAMD(uiBufferId, uiTransferId);
// Copy buffer into texture
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, m_uiTextureWidth, m_uiTextureHeight, m_nExtFormat,
                m_nType, NULL);
// Draw using received texture
GPU 1: Source
// Draw
…
++uiTransferId;
// Bind buffer that has its data store on the sink GPU
glBindBuffer(GL_PIXEL_PACK_BUFFER, uiBufferId);
// Copy local buffer into remote buffer
glReadPixels(0, 0, m_uiBufferWidth, m_uiBufferHeight, m_nExtFormat, m_nType, NULL);
// Write marker
glWriteMarkerAMD(uiBufferId, uiTransferId, ullMarkerBusAddress);
glFlush();
DIRECTGMA IN OPENGL | OVERLAPPING EXECUTION 
[Figure: execution timeline with the stages "GPU 1 render", "GPU 1 transfer", "GPU 0 render", "GPU 0 wait" and "GPU 0 use buffer"]
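The overlap between the two GPUs can be illustrated with a small CPU-only model. This is a sketch, not driver behaviour: it involves no OpenGL, `Command` and `simulate` are hypothetical names, and it assumes that a wait is satisfied once the shared marker holds at least the requested value. Each queue executes one command per tick, and a queue whose head is an unsatisfied wait simply stalls while the other queue keeps running.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// One entry in a GPU's command queue (hypothetical model type).
struct Command {
    enum Kind { Work, WriteMarker, WaitMarker } kind;
    std::string   label;  // description, used for Work commands
    std::uint32_t value;  // marker value, used for the marker commands
};

// Runs the sink (GPU 0) and source (GPU 1) queues in lock step and returns
// the order in which commands actually executed.
std::vector<std::string> simulate(std::deque<Command> source,
                                  std::deque<Command> sink) {
    std::uint32_t marker = 0;  // models the marker stored with the sink buffer
    std::vector<std::string> trace;

    // Executes the head of queue q unless it is a wait that is not yet satisfied.
    auto step = [&](std::deque<Command>& q, const std::string& who) -> bool {
        if (q.empty()) return false;
        const Command c = q.front();
        if (c.kind == Command::WaitMarker && marker < c.value) return false;
        q.pop_front();
        switch (c.kind) {
            case Command::Work:
                trace.push_back(who + ": " + c.label);
                break;
            case Command::WriteMarker:
                marker = c.value;
                trace.push_back(who + ": write marker " + std::to_string(c.value));
                break;
            case Command::WaitMarker:
                trace.push_back(who + ": marker " + std::to_string(c.value) + " reached");
                break;
        }
        return true;
    };

    while (!source.empty() || !sink.empty()) {
        const bool sinkRan   = step(sink, "GPU 0");
        const bool sourceRan = step(source, "GPU 1");
        if (!sinkRan && !sourceRan) {
            assert(false && "deadlock: a wait has no matching marker write");
            break;
        }
    }
    return trace;
}
```

Feeding it a source queue of render, transfer, write marker 1 and a sink queue of render, wait marker 1, use buffer shows the sink rendering while the source is still producing data, and stalling only at the point where it needs the transferred buffer.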
DIRECTGMA IN OPENCL
• The OpenCL extension CL_AMD_BUS_ADDRESSABLE_MEMORY provides access
to DirectGMA
• The functions are:
cl_int clEnqueueWaitSignalAMD(cl_command_queue command_queue, cl_mem mem_object, cl_uint value, cl_uint num_events, …
cl_int clEnqueueWriteSignalAMD(cl_command_queue command_queue, cl_mem mem_object, cl_uint value, cl_ulong offset, …
cl_int clEnqueueMakeBuffersResidentAMD(cl_command_queue command_queue, cl_uint num_mem_objects, cl_mem* mem_objects,
    cl_bool blocking_make_resident, cl_bus_address_amd* bus_addresses, cl_uint num_events, …
• The new tokens are:
CL_BUS_ADDRESSABLE_MEMORY_AMD
CL_EXTERNAL_PHYSICAL_MEMORY_AMD
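Before using these entry points, an application would typically confirm that the device reports the extension. OpenCL returns device extensions as one space-separated string (queried with `clGetDeviceInfo` and `CL_DEVICE_EXTENSIONS`), so a plain substring search can give false positives. The helper below is a small sketch of a whole-token check; `hasExtensionToken` is a hypothetical name, and the lower-case spelling `cl_amd_bus_addressable_memory` used in the usage note is an assumption, since the slide gives the name in upper case.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if `wanted` occurs as a complete, whitespace-delimited token
// in `extensionList` (the format used for OpenCL extension strings).
bool hasExtensionToken(const std::string& extensionList,
                       const std::string& wanted) {
    assert(!wanted.empty());
    std::istringstream in(extensionList);
    std::string token;
    while (in >> token) {
        if (token == wanted) return true;
    }
    return false;
}
```

A call such as `hasExtensionToken(extensions, "cl_amd_bus_addressable_memory")` would then gate the DirectGMA code path, with `extensions` holding the string the runtime returned.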
DIRECTGMA | DX9 
• The DirectGMA functionality in DX9 is made available through a so-called
communication surface
• The process for using it is as follows:
‒ Create a 1x1 offscreen plain surface of format FOURCC_SDIF
‒ Lock the surface. On lock, the driver will allocate and return a pointer to an
AMDDX9SDICOMMPACKET structure. This structure is the communication surface.
‒ Cast the pBits pointer and assign it to a locally created AMDDX9SDICOMMPACKET
pointer.
• The most essential commands are:
AMD_SDI_CMD_GET_CAPS_DATA
AMD_SDI_CMD_CREATE_SURFACE_LOCAL_BEGIN
AMD_SDI_CMD_CREATE_SURFACE_LOCAL_END
AMD_SDI_CMD_CREATE_SURFACE_REMOTE_BEGIN
AMD_SDI_CMD_CREATE_SURFACE_REMOTE_END
AMD_SDI_CMD_QUERY_PHY_ADDRESS_LOCAL
AMD_SDI_CMD_SYNC_WAIT_MARKER
AMD_SDI_CMD_SYNC_WRITE_MARKER
DIRECTGMA | DX9 
• Running a DirectGMA command:
HRESULT RunSDICommand(IN LPDIRECT3DDEVICE9 pd3dDevice, IN AMDDX9SDICMD sdiCmd, IN PBYTE pInBuf, IN DWORD dwInBufSize, IN PBYTE pOutBuf, IN DWORD dwOutBufSize)
{
    HRESULT hr;
    PAMDDX9SDICOMMPACKET pCommPacket;
    D3DLOCKED_RECT lockedRect;
    LPDIRECT3DSURFACE9 pCommSurf = NULL;
    // Create the 1x1 communication surface
    hr = pd3dDevice->CreateOffscreenPlainSurface(1, 1, (D3DFORMAT) FOURCC_SDIF, D3DPOOL_DEFAULT, &pCommSurf, NULL);
    if (FAILED(hr))
        return hr;
    // On lock, the driver returns a pointer to the communication packet
    hr = pCommSurf->LockRect(&lockedRect, NULL, 0);
    if (FAILED(hr))
    {
        pCommSurf->Release();
        return hr;
    }
    pCommPacket = (PAMDDX9SDICOMMPACKET)(lockedRect.pBits);
    pCommPacket->dwSign = 'SDIF';
    // The command result is reported back through this pointer
    pCommPacket->pResult = &hr;
    pCommPacket->sdiCmd = sdiCmd;
    pCommPacket->pOutBuf = pOutBuf;
    pCommPacket->dwOutBufSize = dwOutBufSize;
    pCommPacket->pInBuf = pInBuf;
    pCommPacket->dwInBufSize = dwInBufSize;
    pCommSurf->UnlockRect();
    pCommSurf->Release();
    return hr;
}
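The signature above is written as the multi-character literal `'SDIF'`, whose numeric value is implementation-defined in C++. The sketch below makes the two common byte orders explicit so they can be compared against whatever the driver header defines. `makeFourCC` follows the packing of the Windows SDK `MAKEFOURCC` macro (first character in the lowest byte); `packHighFirst` is a hypothetical helper for the opposite order, which is what MSVC and GCC typically produce for a four-character literal. Which of the two values the driver expects is not stated on the slides.

```cpp
#include <cstdint>

// Same byte order as MAKEFOURCC: the first character lands in the low byte.
constexpr std::uint32_t makeFourCC(char a, char b, char c, char d) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Opposite order: the first character lands in the high byte.
constexpr std::uint32_t packHighFirst(char a, char b, char c, char d) {
    return makeFourCC(d, c, b, a);
}

// 'S' = 0x53, 'D' = 0x44, 'I' = 0x49, 'F' = 0x46
static_assert(makeFourCC('S', 'D', 'I', 'F') == 0x46494453u,
              "low-byte-first packing");
static_assert(packHighFirst('S', 'D', 'I', 'F') == 0x53444946u,
              "high-byte-first packing");
```

Using a named constant from the vendor header, where one exists, avoids depending on how a particular compiler evaluates the literal.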
DIRECTGMA | DX9 
• Create a local surface that can be accessed by a remote device
hr = RunSDICommand(pd3dDevice, AMD_SDI_CMD_CREATE_SURFACE_LOCAL_BEGIN, NULL, 0, NULL, 0);
if (SUCCEEDED(hr))
{
    // Create SDI_LOCAL resources here
    hr = pd3dDevice->CreateTexture(width, height, 1, usage, format, D3DPOOL_DEFAULT, ppTex, NULL);
    if (SUCCEEDED(hr))
    {
        hr = MakeAllocDoneViaDumpDraw(pd3dDevice, *ppTex);
        hr = RunSDICommand(pd3dDevice, AMD_SDI_CMD_CREATE_SURFACE_LOCAL_END, NULL, 0, (PBYTE)pAttrib, sizeof(AMDDX9SDISURFACEATTRIBUTES));
        if (SUCCEEDED(hr))
        {
            // On success, pAttrib describes the new surface:
            //   pAttrib->surfaceHandle
            //   pAttrib->surfaceAddr.surfaceBusAddr
            //   pAttrib->surfaceAddr.markerBusAddr
        }
    }
}
return hr;
DIRECTGMA | DX10 / DX11
• AMD’s DirectGMA extension is accessed by way of the IAmdDxExt interface.
To create this interface, the extension client must do the following:
‒ Include the “AmdDxExtSDIApi.h” file
‒ Get the exported function AmdDxExtCreate() from the DXX driver using
GetProcAddress()
‒ Call AmdDxExtCreate() to create an IAmdDxExt interface
‒ Get and use the desired specific extension interfaces
‒ Close the AMD DirectX extension interface IAmdDxExt once it is no longer needed:
  ‒ Release the SDI interface IAmdDxExtSDI
  ‒ Release the extension interface IAmdDxExt
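The last two steps imply an order: the SDI interface is released before the extension interface it was obtained from. One way to make that order hard to get wrong is to let C++ destruction order enforce it. The sketch below uses a mock type in place of the real interfaces, which come from AmdDxExtSDIApi.h and are not reproduced here; `MockInterface`, `Releaser`, `ReleasePtr` and `DirectGmaSession` are all hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Mock of a COM-style interface: Release() records the call and frees the object.
struct MockInterface {
    std::string name;
    std::vector<std::string>* log;
    void Release() {
        log->push_back("Release " + name);
        delete this;
    }
};

// Deleter that calls Release() instead of delete, as COM-style objects require.
struct Releaser {
    template <class T>
    void operator()(T* p) const { p->Release(); }
};

template <class T>
using ReleasePtr = std::unique_ptr<T, Releaser>;

// Members are destroyed in reverse declaration order, so `sdi` is released
// first and `ext` second, matching the order given on the slide.
struct DirectGmaSession {
    ReleasePtr<MockInterface> ext;  // stands in for IAmdDxExt
    ReleasePtr<MockInterface> sdi;  // stands in for IAmdDxExtSDI
};

// Creates a session, lets it go out of scope, and returns the release order.
std::vector<std::string> runSession() {
    std::vector<std::string> log;
    {
        DirectGmaSession session;
        session.ext.reset(new MockInterface{"IAmdDxExt", &log});
        session.sdi.reset(new MockInterface{"IAmdDxExtSDI", &log});
        assert(session.ext && session.sdi);
    }  // session destroyed here: sdi first, then ext
    return log;
}
```

The same effect can be had with explicit `Release()` calls; the wrapper only helps when early returns or exceptions would otherwise skip them.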
DIRECTGMA | DX10 / DX11
• The following DirectGMA functions are provided:
HRESULT CreateSDIAdapterSurfaces(AmdDxRemoteSDISurfaceList *pList);
HRESULT QuerySDIAllocationAddress(AmdDxSDIQueryAllocInfo *pInfo);
HRESULT MakeResidentSDISurfaces(AmdDxLocalSDISurfaceList *pList);
BOOL WriteMarker(ID3D10Resource *pResource, AmdDxMarkerInfo *pMarkerInfo);
BOOL WaitMarker(ID3D10Resource *pResource, UINT val);
BOOL WriteMarker11(ID3D11Resource *pResource, AmdDxMarkerInfo *pMarkerInfo);
BOOL WaitMarker11(ID3D11Resource *pResource, UINT val);
DISCLAIMER & ATTRIBUTION 
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and 
typographical errors. 
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to 
product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences 
between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or 
otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to 
time to the content hereof without obligation of AMD to notify any person of such revisions or changes. 
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR 
ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. 
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO 
EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM 
THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
ATTRIBUTION 
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of 
Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance 
Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.

Introduction to cuda   geek camp singapore 2011Introduction to cuda   geek camp singapore 2011
Introduction to cuda geek camp singapore 2011
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Ctrl-C redesign for gcc cauldron in 2022 in prague
Ctrl-C redesign for gcc cauldron in 2022 in pragueCtrl-C redesign for gcc cauldron in 2022 in prague
Ctrl-C redesign for gcc cauldron in 2022 in prague
 
CUDA by Example : Graphics Interoperability : Notes
CUDA by Example : Graphics Interoperability : NotesCUDA by Example : Graphics Interoperability : Notes
CUDA by Example : Graphics Interoperability : Notes
 
Introduction to GPUs in HPC
Introduction to GPUs in HPCIntroduction to GPUs in HPC
Introduction to GPUs in HPC
 
Provision Intel® Optane™ DC Persistent Memory in Linux*
Provision Intel® Optane™ DC Persistent Memory in Linux*Provision Intel® Optane™ DC Persistent Memory in Linux*
Provision Intel® Optane™ DC Persistent Memory in Linux*
 
Anatomy of ROCgdb presentation at gcc cauldron 2022
Anatomy of ROCgdb presentation at gcc cauldron 2022Anatomy of ROCgdb presentation at gcc cauldron 2022
Anatomy of ROCgdb presentation at gcc cauldron 2022
 
Nagios Conference 2013 - Troy Lea - Leveraging and Understanding Performance ...
Nagios Conference 2013 - Troy Lea - Leveraging and Understanding Performance ...Nagios Conference 2013 - Troy Lea - Leveraging and Understanding Performance ...
Nagios Conference 2013 - Troy Lea - Leveraging and Understanding Performance ...
 
CUDA Deep Dive
CUDA Deep DiveCUDA Deep Dive
CUDA Deep Dive
 

Mehr von AMD Developer Central

RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14AMD Developer Central
 
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...AMD Developer Central
 
Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14AMD Developer Central
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14AMD Developer Central
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...AMD Developer Central
 
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...AMD Developer Central
 
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...AMD Developer Central
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...AMD Developer Central
 

Mehr von AMD Developer Central (8)

RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
 
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
 
Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
 
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
Keynote (Nandini Ramani) - The Role of Java in Heterogeneous Computing & How ...
 
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
Keynote (Dr. Lisa Su) - Developers: The Heart of AMD Innovation - by Dr. Lisa...
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
 

Kürzlich hochgeladen

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 

Kürzlich hochgeladen (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

DirectGMA on AMD’S FirePro™ GPUS

  • 1. DIRECTGMA ON AMD’S FIREPRO™ GPUS BRUNO STEFANIZZI SEP 2014
  • 2. 2 | DIRECTGMA ON AMD’S FIREPRO™ GPUS | SEPTEMBER 8, 2014 | INTRODUCTION TO DIRECTGMA
     Exposing the graphics memory of a GPU to other devices has long been a goal for applications that need low-latency data exchange between those devices and the GPU. AMD introduced DirectGMA (Direct Graphics Memory Access) for this purpose. DirectGMA:
     ‒ Makes a portion of the GPU memory accessible to other devices
     ‒ Allows devices on the bus to write directly into this area of GPU memory
     ‒ Allows GPUs to write directly into the memory of remote devices on the bus that support DirectGMA
     ‒ Provides a driver interface that allows 3rd party hardware vendors to support data exchange with an AMD GPU using DirectGMA
     Supported APIs: OpenGL, OpenCL™, DirectX®
     Supported operating systems: Windows® 7 64 Bit and Linux® 64 Bit
     Supported cards: AMD FirePro™ W series (W5x00 and above) and all AMD FirePro™ S series
  • 3. INTRODUCTION TO DIRECTGMA
     ‒ Peer-to-Peer Transfers between GPUs
        Use high-speed DMA transfers to copy data between the memories of two GPUs on the same system/PCIe bus.
     ‒ Peer-to-Peer Transfers between GPU and FPGAs
        Use high-speed DMA transfers to copy data between the memories of the GPU and the FPGA memory.
     ‒ DirectGMA for Video
        Optimized pipeline for frame-based devices such as frame grabbers, video switchers, HD-SDI capture, and CameraLink devices. See our SDI webpage.
  • 4. AMD’S DIRECTGMA P2P
     ‒ Direct communication between PCI cards
     ‒ Bidirectional DirectGMA P2P requires memory on both cards
     [Diagram: CPU and PCI bus]
  • 5. DIRECTGMA IN OPENGL
     ‒ The OpenGL extension AMD_BUS_ADDRESSABLE_MEMORY provides access to DirectGMA
     ‒ The functions are:
          void glMakeBuffersResidentAMD(sizei n, uint* buffers, uint64* baddr, uint64* maddr);
          void glBufferBusAddressAMD(enum target, sizeiptr size, uint64 surfbusaddress, uint64 markerbusaddress);
          void glWaitMarkerAMD(uint buf, uint value);
          void glWriteMarkerAMD(uint buf, uint value, uint64 offset);
     ‒ The new tokens are:
          GL_BUS_ADDRESSABLE_MEMORY_AMD
          GL_EXTERNAL_PHYSICAL_MEMORY_AMD
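Before using these entry points, an application would normally confirm that the driver advertises the extension. The following is a minimal C++ sketch of such a check, not code from the slides. `hasExtension` is a hypothetical helper, and the extension string `GL_AMD_bus_addressable_memory` is an assumption based on the usual OpenGL naming convention, since the slides give only the upper-case name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` appears as a whole token in a
// space-separated extension list. Whole-token matching avoids false positives
// when one extension name is a prefix of another.
bool hasExtension(const std::string& extensionList, const std::string& name)
{
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token) {
        if (token == name)
            return true;
    }
    return false;
}
```

In a compatibility context the list could come from `glGetString(GL_EXTENSIONS)`. In a core profile the names are queried one at a time with `glGetStringi(GL_EXTENSIONS, i)` and can be compared directly.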
  • 6. DIRECTGMA IN OPENGL | CREATING A BUFFER TO RECEIVE DATA
     ‒ To receive data, a buffer must be created that can be accessed by other devices on the bus
     ‒ The physical address of this buffer must be known so that a remote device can write to it

          glGenBuffers(m_uiNumBuffers, m_pBuffer);

          m_pBufferBusAddress = new unsigned long long[m_uiNumBuffers];
          m_pMarkerBusAddress = new unsigned long long[m_uiNumBuffers];

          for (unsigned int i = 0; i < m_uiNumBuffers; i++)
          {
              glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_pBuffer[i]);
              glBufferData(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_uiBufferSize, 0, GL_DYNAMIC_DRAW);
          }

          // Call makeResident when all BufferData calls were submitted.
          glMakeBuffersResidentAMD(m_uiNumBuffers, m_pBuffer, m_pBufferBusAddress, m_pMarkerBusAddress);

          // Make sure that the buffer creation really succeeded
          if (glGetError() != GL_NO_ERROR)
              return false;

          glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, 0);
  • 7. DIRECTGMA IN OPENGL | USING A BUFFER ON A REMOTE DEVICE
     ‒ To write into a buffer on a remote device, create an OpenGL buffer and assign it the physical addresses of the memory on the remote device

          glGenBuffers(m_uiNumBuffers, m_pBuffer);

          for (unsigned int i = 0; i < m_uiNumBuffers; i++)
          {
              glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_pBuffer[i]);
              glBufferBusAddressAMD(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_uiBufferSize,
                                    m_pBufferBusAddress[i], m_pMarkerBusAddress[i]);

              if (glGetError() != GL_NO_ERROR)
                  return false;
          }

          glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, 0);
  • 8. DIRECTGMA IN OPENGL | GPU TO GPU COPY
     ‒ Create one thread per GPU. Each thread creates its own context. One thread acts as the data sink, the other as the source.
     ‒ On the sink GPU a GL_BUS_ADDRESSABLE_MEMORY_AMD buffer is created
     ‒ On the source GPU a buffer is created.

     GPU 0: Sink

          glGenBuffers(m_uiNumBuffers, m_pSinkBuffer);

          for (unsigned int i = 0; i < m_uiNumBuffers; i++)
          {
              glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_pSinkBuffer[i]);
              glBufferData(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_uiBufferSize, 0, GL_DYNAMIC_DRAW);
          }

          // Call makeResident when all BufferData calls were submitted.
          // (The slide passes m_pBuffer here; the sink buffers created above are m_pSinkBuffer.)
          glMakeBuffersResidentAMD(m_uiNumBuffers, m_pSinkBuffer, m_pBufferBusAddress, m_pMarkerBusAddress);

     GPU 1: Source

          glGenBuffers(m_uiNumBuffers, m_pSourceBuffer);

          for (unsigned int i = 0; i < m_uiNumBuffers; i++)
          {
              glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_pSourceBuffer[i]);
              glBufferBusAddressAMD(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_uiBufferSize,
                                    m_pBufferBusAddress[i], m_pMarkerBusAddress[i]);
          }
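The source thread can only call glBufferBusAddressAMD once the sink thread has made its buffers resident and obtained the bus addresses. The slides do not show how the two threads exchange those addresses. The sketch below is one possible host-side handoff using `std::promise` and `std::future`. All names (`BusAddresses`, `sinkThread`, `sourceThread`, `exchangeAddresses`) are hypothetical, and the addresses are made-up placeholders standing in for the values glMakeBuffersResidentAMD would return.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Bus addresses the sink obtains for its buffers and their markers.
struct BusAddresses {
    std::vector<std::uint64_t> buffer;
    std::vector<std::uint64_t> marker;
};

// Stand-in for the sink thread. A real sink would fill these arrays via
// glMakeBuffersResidentAMD; the values here are placeholders (assumption).
void sinkThread(std::promise<BusAddresses> published, unsigned numBuffers)
{
    BusAddresses addr;
    for (unsigned i = 0; i < numBuffers; ++i) {
        addr.buffer.push_back(0x100000000ull + i * 0x10000ull);
        addr.marker.push_back(0x200000000ull + i * 0x1000ull);
    }
    published.set_value(std::move(addr));  // publish once, after all buffers are resident
}

// Stand-in for the source thread: blocks until the sink has published its
// addresses. A real source would then pass them to glBufferBusAddressAMD.
BusAddresses sourceThread(std::future<BusAddresses> incoming)
{
    return incoming.get();
}

// Runs the sink on its own thread and the source on the calling thread.
BusAddresses exchangeAddresses(unsigned numBuffers)
{
    std::promise<BusAddresses> promise;
    std::future<BusAddresses> future = promise.get_future();
    std::thread sink(sinkThread, std::move(promise), numBuffers);
    BusAddresses received = sourceThread(std::move(future));
    sink.join();
    return received;
}
```

A one-shot future fits here because the addresses are produced once at setup time. The per-frame synchronization is handled separately by the markers.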
  • 9. DIRECTGMA IN OPENGL | GPU TO GPU COPY
     ‒ The source creates data and copies it into the GL_EXTERNAL_PHYSICAL_MEMORY_AMD buffer that has its data store on the sink device
     ‒ The sink device receives the data and copies it into a texture to be displayed

     GPU 0: Sink

          // Submit draw calls that do not require data sent by the source
          …

          glBindTexture(GL_TEXTURE_2D, m_uiTexture);
          glBindBuffer(GL_PIXEL_UNPACK_BUFFER, uiBufferId);

          // Indicate that the following commands will need the data transferred by the source
          glWaitMarkerAMD(uiBufferId, uiTransferId);

          // Copy buffer into texture
          glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, m_uiTextureWidth, m_uiTextureHeight, m_nExtFormat, m_nType, NULL);

          // Draw using received texture
          // Draw
          …

          ++uiTransferId;

     GPU 1: Source

          // Bind buffer that has its data store on the sink GPU
          glBindBuffer(GL_PIXEL_PACK_BUFFER, uiBufferId);

          // Copy local buffer into remote buffer
          glReadPixels(0, 0, m_uiBufferWidth, m_uiBufferHeight, m_nExtFormat, m_nType, NULL);

          // Write marker
          glWriteMarkerAMD(uiBufferId, uiTransferId, ullMarkerBusAddress);
          glFlush();
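The wait marker sits in the sink's command stream: work queued before it can run at once, while everything queued after it is held back until the source has written the marker. The sketch below is a small host-side model of that ordering in plain C++. It is not the driver's implementation. `Command` and `runUntilBlocked` are hypothetical names, and the model assumes a wait is satisfied once the marker has reached the requested value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry in the modelled sink command stream.
struct Command {
    bool isWait;          // true: wait for a marker value; false: ordinary work
    std::uint32_t value;  // marker value to wait for (used only when isWait)
    std::string label;    // name of the work item (used only when !isWait)
};

// Executes queued commands in order, starting at `next`, until the stream
// ends or a wait cannot be satisfied yet. Returns the labels of the work
// items that ran and leaves `next` at the first command not yet executed.
std::vector<std::string> runUntilBlocked(const std::vector<Command>& stream,
                                         std::size_t& next,
                                         std::uint32_t marker)
{
    std::vector<std::string> executed;
    while (next < stream.size()) {
        const Command& cmd = stream[next];
        if (cmd.isWait && marker < cmd.value)
            break;  // stalled: the transfer for this id has not arrived (assumed >= semantics)
        if (!cmd.isWait)
            executed.push_back(cmd.label);
        ++next;
    }
    return executed;
}
```

With a stream of "independent draw, wait for 1, copy buffer to texture, draw with texture", a first call with marker 0 runs only the independent draw. A second call after the marker becomes 1 runs the remaining two items.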
  • 10. DIRECTGMA IN OPENGL | OVERLAPPING EXECUTION
     [Timeline diagram: GPU 1 render, GPU 1 transfer, GPU 0 render, GPU 0 use buffer, GPU 0 wait]
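The benefit of overlapping can be estimated with a simple single-frame timing model. This is an illustrative sketch, not a measurement or a description of the driver. It assumes both GPUs start at the same time, that the sink's independent rendering needs no transferred data, and that the sink stalls only in the marker wait. `FrameTimes` and the three functions are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// All durations use the same arbitrary time unit.
struct FrameTimes {
    int sourceRender;   // GPU 1 render
    int transfer;       // GPU 1 transfer into the sink's buffer
    int sinkRender;     // GPU 0 work that does not need the transferred data
    int sinkUseBuffer;  // GPU 0 work that consumes the buffer
};

// Time the sink spends stalled in the marker wait.
int sinkWait(const FrameTimes& t)
{
    return std::max(0, t.sourceRender + t.transfer - t.sinkRender);
}

// Frame time when the sink starts its independent work immediately.
int overlappedFrameTime(const FrameTimes& t)
{
    return t.sinkRender + sinkWait(t) + t.sinkUseBuffer;
}

// Frame time when every stage runs strictly one after another.
int serialFrameTime(const FrameTimes& t)
{
    return t.sourceRender + t.transfer + t.sinkRender + t.sinkUseBuffer;
}
```

Under this model the wait disappears entirely whenever the sink's independent work takes at least as long as the source's render plus transfer.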
  • 11. DIRECTGMA IN OPENCL
     ‒ The OpenCL extension CL_AMD_BUS_ADDRESSABLE_MEMORY provides access to DirectGMA
     ‒ The functions are:
          cl_int clEnqueueWaitSignalAMD(cl_command_queue command_queue, cl_mem mem_object, uint value, cl_uint num_events, …
          cl_int clEnqueueWriteSignalAMD(cl_command_queue command_queue, cl_mem mem_object, uint value, cl_ulong offset, …
          cl_int clEnqueueMakeBuffersResidentAMD(cl_command_queue command_queue, cl_uint num_mem_objects, cl_mem* mem_objects, cl_bool blocking_make_resident, cl_bus_address_amd* bus_addresses, cl_uint num_events, …
     ‒ The new tokens are:
          CL_BUS_ADDRESSABLE_MEMORY_AMD
          CL_EXTERNAL_PHYSICAL_MEMORY_AMD
  • 12. DIRECTGMA | DX9
     ‒ The DirectGMA functionality in DX9 is made available through a so-called communication surface
     ‒ The process for using it is as follows:
        ‒ Create a 1x1 offscreen plain surface of format FOURCC_SDIF
        ‒ Lock the surface. On lock, the driver will allocate and return a pointer to an AMDDX9SDICOMMPACKET structure. This structure is the communication surface.
        ‒ Assign and cast the pBits pointer to a locally created AMDDX9SDICOMMPACKET pointer.
     ‒ The most essential commands are:
          AMD_SDI_CMD_GET_CAPS_DATA
          AMD_SDI_CMD_CREATE_SURFACE_LOCAL_BEGIN
          AMD_SDI_CMD_CREATE_SURFACE_LOCAL_END
          AMD_SDI_CMD_CREATE_SURFACE_REMOTE_BEGIN
          AMD_SDI_CMD_CREATE_SURFACE_REMOTE_END
          AMD_SDI_CMD_QUERY_PHY_ADDRESS_LOCAL
          AMD_SDI_CMD_SYNC_WAIT_MARKER
          AMD_SDI_CMD_SYNC_WRITE_MARKER
  • 13. DIRECTGMA | DX9
     ‒ Running a DirectGMA command:

          HRESULT RunSDICommand(IN LPDIRECT3DDEVICE9 pd3dDevice, IN AMDDX9SDICMD sdiCmd,
                                IN PBYTE pInBuf, IN DWORD dwInBufSize,
                                IN PBYTE pOutBuf, IN DWORD dwOutBufSize)
          {
              HRESULT hr;
              PAMDDX9SDICOMMPACKET pCommPacket;
              D3DLOCKED_RECT lockedRect;
              LPDIRECT3DSURFACE9 pCommSurf = NULL;

              // Create the 1x1 communication surface
              hr = pd3dDevice->CreateOffscreenPlainSurface(1, 1, (D3DFORMAT)FOURCC_SDIF,
                                                           D3DPOOL_DEFAULT, &pCommSurf, NULL);
              if (FAILED(hr))
                  return hr;

              // Lock the surface to obtain the communication packet
              hr = pCommSurf->LockRect(&lockedRect, NULL, 0);
              if (FAILED(hr))
              {
                  REL(pCommSurf);
                  return hr;
              }

              // Fill in the command, its buffers and where to store the result
              pCommPacket = (PAMDDX9SDICOMMPACKET)(lockedRect.pBits);
              pCommPacket->dwSign = 'SDIF';
              pCommPacket->pResult = &hr;
              pCommPacket->sdiCmd = sdiCmd;
              pCommPacket->pOutBuf = pOutBuf;
              pCommPacket->dwOutBufSize = dwOutBufSize;
              pCommPacket->pInBuf = pInBuf;
              pCommPacket->dwInBufSize = dwInBufSize;

              // Unlocking presumably hands the packet to the driver, which reports
              // the command result through pResult (i.e. into hr)
              pCommSurf->UnlockRect();
              REL(pCommSurf);

              return hr;
          }
  • 14. DIRECTGMA | DX9
     ‒ Create a local surface that can be accessed by a remote device

          hr = RunSDICommand(pd3dDevice, AMD_SDI_CMD_CREATE_SURFACE_LOCAL_BEGIN, NULL, 0, NULL, 0);

          if (SUCCEEDED(hr))
          {
              // Create SDI_LOCAL resources here
              hr = pd3dDevice->CreateTexture(width, height, 1, usage, format, D3DPOOL_DEFAULT, ppTex, NULL);

              if (SUCCEEDED(hr))
              {
                  hr = MakeAllocDoneViaDumpDraw(pd3dDevice, *ppTex);

                  hr = RunSDICommand(pd3dDevice, AMD_SDI_CMD_CREATE_SURFACE_LOCAL_END, NULL, 0,
                                     (PBYTE)pAttrib, sizeof(AMDDX9SDISURFACEATTRIBUTES));

                  if (SUCCEEDED(hr))
                  {
                      // (The call here is truncated in the source.) Its arguments are:
                      // pAttrib->surfaceHandle,
                      // pAttrib->surfaceAddr.surfaceBusAddr,
                      // pAttrib->surfaceAddr.markerBusAddr
                  }
              }
          }

          return hr;
  • 15. DIRECTGMA | DX10 DX11
     ‒ AMD’s DirectGMA extension is accessed through the IAmdDxExt interface. To create this interface, the extension client must do the following:
        ‒ Include the “AmdDxExtSDIApi.h” file
        ‒ Get the exported function AmdDxExtCreate() from the DXX driver using GetProcAddress()
        ‒ Call AmdDxExtCreate() to create an IAmdDxExt interface
        ‒ Get and use the desired specific extension interfaces
        ‒ Close the AMD DirectX extension interface IAmdDxExt once it is no longer needed:
           ‒ Release the SDI interface IAmdDxExtSDI
           ‒ Release the extension interface IAmdDxExt
  • 16. DIRECTGMA | DX10 DX11
     ‒ The following DirectGMA functions are provided:
          HRESULT CreateSDIAdapterSurfaces(AmdDxRemoteSDISurfaceList *pList);
          HRESULT QuerySDIAllocationAddress(AmdDxSDIQueryAllocInfo *pInfo);
          HRESULT MakeResidentSDISurfaces(AmdDxLocalSDISurfaceList *pList);
          BOOL WriteMarker(ID3D10Resource *pResource, AmdDxMarkerInfo *pMarkerInfo);
          BOOL WaitMarker(ID3D10Resource *pResource, UINT val);
          BOOL WriteMarker11(ID3D11Resource *pResource, AmdDxMarkerInfo *pMarkerInfo);
          BOOL WaitMarker11(ID3D11Resource *pResource, UINT val);
  • 17. DISCLAIMER & ATTRIBUTION
     The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
     AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
     ATTRIBUTION
     © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.