Preliminary XSX fact finding
(this is not about 100% correctness;
it is about the journey of connecting the dots, so don't use this as fact!)
plus RDNA1 slides and cache size data
by @blueisviolet
AMD Slide / Navi Fact
In Navi, two Compute Units (CUs) form a Workgroup Processor (WGP), and five WGPs form a
shader array, which these notes pair with one Asynchronous Compute Engine (ACE)
Important links for RDNA1 (Navi 10):
https://www.amd.com/system/files/documents/rdna-whitepaper.pdf
https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf
https://gpuopen.com/rdna-shader-instruction-set-architecture-document-now-available/
AMD Slide / Navi Fact
Some interesting figures per SIMD32 block:
- VGPR (128KB)
- SGPR (10KB)
The ALUs also appear to provide: 32x ALU + 2x DP unit + 8x transcendental(?) unit
32KB I$ shared by 4 SIMD units (1 WGP = 2 CUs)
16KB K$ (scalar data cache) per WGP
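As a rough sanity check on the lane counts above, the sketch below expresses the DP and transcendental lane counts as a fraction of the 32 ALU lanes. It assumes throughput scales linearly with lane count, which the slides do not state; the dictionary keys and the function name are illustrative and not AMD terminology.

```python
from fractions import Fraction

# Lane counts per SIMD32 as read off the slide (the transcendental
# count is marked with a "?" in the notes, so treat it as tentative).
LANES_PER_SIMD32 = {
    "alu": 32,
    "dp": 2,
    "transcendental": 8,
}


def rate_vs_alu(unit: str) -> Fraction:
    """Return the lane count of `unit` relative to the 32 ALU lanes.

    Assumption: relative throughput is proportional to lane count.
    """
    return Fraction(LANES_PER_SIMD32[unit], LANES_PER_SIMD32["alu"])


if __name__ == "__main__":
    for unit in LANES_PER_SIMD32:
        print(f"{unit}: {rate_vs_alu(unit)} of ALU rate")
```

Under that assumption, 2 DP lanes out of 32 would correspond to a 1/16 rate and 8 transcendental lanes to a 1/4 rate.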
AMD Slide / Navi Fact
In Navi, two Compute Units (CUs) form a Workgroup Processor (WGP), and five WGPs form
a shader array, which these notes pair with one Asynchronous Compute Engine (ACE)
AMD Slide / Navi Fact
1 WGP (2 CUs) = 2x L0 (2 x 16KB) + I$ (32KB) + K$ (16KB) + VGPR (512KB) + SGPR (40KB) + LDS (128KB)
5 WGPs (1 shader array, 1 ACE) = connected to a 128KB L1
1 CU = 256KB VGPR + 20KB SGPR
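The per-CU and per-WGP totals above follow from the per-SIMD32 figures quoted earlier (128KB VGPR, 10KB SGPR). A minimal sketch of that arithmetic, assuming 2 SIMD32 units per CU (which is what the slide totals imply); all constant and function names are illustrative:

```python
# Per-SIMD32 register file sizes in KB, as quoted on the slides.
VGPR_PER_SIMD_KB = 128
SGPR_PER_SIMD_KB = 10

# Grouping factors. SIMDS_PER_CU = 2 is an assumption inferred from
# the slide totals (256KB VGPR per CU / 128KB per SIMD32).
SIMDS_PER_CU = 2
CUS_PER_WGP = 2
WGPS_PER_SHADER_ARRAY = 5


def register_totals(simd_count: int) -> dict:
    """Total VGPR/SGPR capacity in KB for a group of SIMD32 units."""
    return {
        "vgpr_kb": simd_count * VGPR_PER_SIMD_KB,
        "sgpr_kb": simd_count * SGPR_PER_SIMD_KB,
    }


cu = register_totals(SIMDS_PER_CU)
wgp = register_totals(SIMDS_PER_CU * CUS_PER_WGP)
shader_array = register_totals(
    SIMDS_PER_CU * CUS_PER_WGP * WGPS_PER_SHADER_ARRAY
)

# Sum of every per-WGP storage block listed on the slide, in KB:
# 2x L0 (16KB each) + I$ + K$ + VGPR + SGPR + LDS.
wgp_storage_kb = 2 * 16 + 32 + 16 + wgp["vgpr_kb"] + wgp["sgpr_kb"] + 128

if __name__ == "__main__":
    print("CU:", cu)
    print("WGP:", wgp)
    print("Shader array (5 WGPs):", shader_array)
    print("Per-WGP storage total (KB):", wgp_storage_kb)
```

The CU and WGP results reproduce the slide figures (256KB/20KB and 512KB/40KB); the shader-array and per-WGP storage totals are derived here and do not appear on the slides.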
So why? The areas marked E, F and G seem to have a similar structure.
Zoomed in: in a normal CU, the parts marked 4, 5, 6 and 7 are usually different.
Here it looks like 10 groups, and instead of 4 as in Navi, it is 6. From what we know,
it seems each WGP can be customized. Also remember the X1 Hot Chips presentation:
its higher-level diagram showed groupings of 6 CUs.
Another data point is from GitHub, which is said to list 3 GDS and 6
LDS per group. The X1 Hot Chips slide is also provided (per 6).
