SlideShare ist ein Scribd-Unternehmen logo
1 von 28
Downloaden Sie, um offline zu lesen
A 2.5D CULLING FOR FORWARD+
AMD
Takahiro Harada
2 | A 2.5D culling for Forward+ | Takahiro Harada
AGENDA
ï‚§â€ŻForward+
–  Forward, Deferred, Forward+
–  Problem description
ï‚§â€Ż2.5D culling
ï‚§â€ŻResults
3 | A 2.5D culling for Forward+ | Takahiro Harada
FORWARD+
4 | A 2.5D culling for Forward+ | Takahiro Harada
REAL-TIME SOLUTION COMPARISON
ï‚§â€ŻRendering equation
ï‚§â€ŻForward
ï‚§â€ŻDeferred
ï‚§â€ŻForward+
5 | A 2.5D culling for Forward+ | Takahiro Harada
FORWARD RENDERING PIPELINE
ï‚§â€ŻDepth prepass
–  Fills z buffer
ï‚§â€ŻPrevent overdraw for shading
ï‚§â€ŻShading
–  Geometry is rendered
–  Pixel shader
ï‚§â€ŻIterate through light list set for each object
ï‚§â€ŻEvaluates materials for the lights
6 | A 2.5D culling for Forward+ | Takahiro Harada
FORWARD+ RENDERING PIPELINE
ï‚§â€ŻDepth prepass
–  Fills z buffer
ï‚§â€ŻPrevent overdraw for shading
ï‚§â€ŻUsed for pixel position reconstruction for light culling
ï‚§â€ŻLight culling
–  Culls light per tile basis
–  Input: z buffer, light buffer
–  Output: light list per tile
ï‚§â€ŻShading
–  Geometry is rendered
–  Pixel shader
ï‚§â€ŻIterate through light list calculated in light culling
ï‚§â€ŻEvaluates materials for the lights
1
2
3
[1,2,3][1] [2,3]
7 | A 2.5D culling for Forward+ | Takahiro Harada
CREATING A FRUSTUM FOR A TILE
ï‚§â€ŻAn edge @SS == A plane @VS
ï‚§â€ŻA tile (4 edges) @SS == 4 planes @VS
–  Open frustum (no bound in Z direction)
ï‚§â€ŻMax and min Z is used to cap
8 | A 2.5D culling for Forward+ | Takahiro Harada
LONG FRUSTUM
ï‚§â€ŻScreen space culling is not always sufficient
–  Create a frustum from max and min depth values
–  Edge of objects
–  Captures a lot of unnecessary lights
9 | A 2.5D culling for Forward+ | Takahiro Harada
LONG FRUSTUM
ï‚§â€ŻScreen space culling is not always sufficient
–  Create a frustum from max and min depth values
–  Edge of objects
–  Captures a lot of unnecessary lights
0 lights
25 lights
50 lights
10 | A 2.5D culling for Forward+ | Takahiro Harada
GET WORSE IN A COMPLEX SCENE
0 lights
100 lights
200 lights
11 | A 2.5D culling for Forward+ | Takahiro Harada
QUESTION
ï‚§â€ŻWant to reduce false positives
ï‚§â€ŻCan we improve the culling without adding much overhead?
–  Computation time, memory
–  Culling itself is an optimization
–  Spending a lot of resources for it does not make sense
ï‚§â€ŻUsing a 3D grid is a natural extension
–  Uses too much memory
12 | A 2.5D culling for Forward+ | Takahiro Harada
2.5D CULLING
13 | A 2.5D culling for Forward+ | Takahiro Harada
2.5D CULLING
ï‚§â€ŻAdditional memory usage
–  0B global memory
–  4B local memory per WG (can compress more if you want)
ï‚§â€ŻAdditional computation complexity
–  A few bit and arithmetic instructions
–  A few lines of codes for light culling
–  No changes for other stages
ï‚§â€ŻAdditional runtime overhead
–  < 10% compared to the original light culling
14 | A 2.5D culling for Forward+ | Takahiro Harada
IDEA
ï‚§â€ŻSplit frustum in z direction
–  Uniform split for a frustum
–  Varying split among frustums
(a) (b)
15 | A 2.5D culling for Forward+ | Takahiro Harada
FRUSTUM CONSTRUCTION
ï‚§â€ŻCalculate depth bound
–  max and min values of depth
ï‚§â€ŻSplit depth direction into 32 cells
–  Min value and cell size
ï‚§â€ŻFlag occupied cell
ï‚§â€ŻA 32bit depth mask per work group
A tile
16 | A 2.5D culling for Forward+ | Takahiro Harada
FRUSTUM CONSTRUCTION
ï‚§â€ŻCalculate depth bound
–  max and min values of depth
ï‚§â€ŻSplit depth direction into 32 cells
–  Min value and cell size
ï‚§â€ŻFlag occupied cell
ï‚§â€ŻA 32bit depth mask per work group
7 7 7 7
7 7 7 2
7 7 2 1
7 2 1 0
Depth mask = 11100001
A tile
0 1 2 3 4 5 6 7
17 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
ï‚§â€ŻIf a light overlaps to the frustum
–  Calculate depth mask for the light
–  Check overlap using the depth mask of the frustum
ï‚§â€ŻDepth mask & Depth mask
–  11100001 & 00011000 = 00000000
Depth mask = 11100001
Depth mask = 00011000
18 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
ï‚§â€ŻIf a light overlaps to the frustum
–  Calculate depth mask for the light
–  Check overlap using the depth mask of the frustum
ï‚§â€ŻDepth mask & Depth mask
–  11100001 & 00110000 = 00100000
Depth mask = 11100001
Depth mask = 00110000
19 | A 2.5D culling for Forward+ | Takahiro Harada
CODE
Original With 2.5D culling
20 | A 2.5D culling for Forward+ | Takahiro Harada
RESULTS
21 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
22 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING + 2.5D CULLING
23 | A 2.5D culling for Forward+ | Takahiro Harada
COMPARISON
1"
10"
100"
1000"
10000"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17" 18" 19" 20" 21" 22" 23"
Number'of'*les'
Number'of'lights'(x10)'
With"2.5D"culling"
Without"2.5D"culling"
220 lights/frustum -> 120 lights/frustum
24 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING
25 | A 2.5D culling for Forward+ | Takahiro Harada
LIGHT CULLING + 2.5D CULLING
26 | A 2.5D culling for Forward+ | Takahiro Harada
COMPARISON
1"
10"
100"
1000"
10000"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17"
Number'of'*les'
Number'of'lights'(x10)'
With"2.5D"culling"
Without"2.5D"culling"
27 | A 2.5D culling for Forward+ | Takahiro Harada
PERFORMANCE
0"
1"
2"
3"
4"
5"
6"
1024" 2048" 3072" 4096"
!me$(ms)$
Number$of$lights$
Forward+"w."frustum"culling"
Forward+"w."2.5D"
Deferred"
28 | A 2.5D culling for Forward+ | Takahiro Harada
CONCLUSION
ï‚§â€ŻProposed 2.5D culling which
–  Additional memory usage
ï‚§â€Ż 0B global memory
ï‚§â€Ż 4B local memory per WG (can compress more if you want)
–  Additional compute complexity
ï‚§â€Ż 3 lines of pseudo codes for light culling
ï‚§â€Ż No changes for other stages
–  Additional runtime overhead
ï‚§â€Ż < 10% compared to the original light culling
ï‚§â€ŻShowed that 2.5D culling reduces false positives

Weitere Àhnliche Inhalte

Was ist angesagt?

Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Tiago Sousa
 
Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Philip Hammer
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3Electronic Arts / DICE
 
Beyond porting
Beyond portingBeyond porting
Beyond portingCass Everitt
 
A Real-time Radiosity Architecture
A Real-time Radiosity ArchitectureA Real-time Radiosity Architecture
A Real-time Radiosity ArchitectureElectronic Arts / DICE
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility bufferWolfgang Engel
 
Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Tiago Sousa
 
Checkerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOCCheckerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOCQLOC
 
Screen Space Reflections in The Surge
Screen Space Reflections in The SurgeScreen Space Reflections in The Surge
Screen Space Reflections in The SurgeMichele Giacalone
 
A Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemA Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemBo Li
 
Oit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsOit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsHolger Gruen
 
Decima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnDecima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnGuerrilla
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingElectronic Arts / DICE
 
Approaching zero driver overhead
Approaching zero driver overheadApproaching zero driver overhead
Approaching zero driver overheadCass Everitt
 
Real-time lightmap baking
Real-time lightmap bakingReal-time lightmap baking
Real-time lightmap bakingRosario Leonardi
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Johan Andersson
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesNarann29
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14AMD Developer Central
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloadedmistercteam
 

Was ist angesagt? (20)

Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666
 
Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 
Beyond porting
Beyond portingBeyond porting
Beyond porting
 
A Real-time Radiosity Architecture
A Real-time Radiosity ArchitectureA Real-time Radiosity Architecture
A Real-time Radiosity Architecture
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility buffer
 
Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)Rendering Technologies from Crysis 3 (GDC 2013)
Rendering Technologies from Crysis 3 (GDC 2013)
 
Checkerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOCCheckerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOC
 
Screen Space Reflections in The Surge
Screen Space Reflections in The SurgeScreen Space Reflections in The Surge
Screen Space Reflections in The Surge
 
A Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemA Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering System
 
Oit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsOit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked Lists
 
Decima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnDecima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero Dawn
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based Rendering
 
Approaching zero driver overhead
Approaching zero driver overheadApproaching zero driver overhead
Approaching zero driver overhead
 
Real-time lightmap baking
Real-time lightmap bakingReal-time lightmap baking
Real-time lightmap baking
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering Techniques
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloaded
 

Mehr von Takahiro Harada

201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019Takahiro Harada
 
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...Takahiro Harada
 
Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)
Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)
Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)Takahiro Harada
 
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering WorkflowTakahiro Harada
 
çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)
çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)
çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)Takahiro Harada
 
Introducing Firerender for 3DS Max
Introducing Firerender for 3DS MaxIntroducing Firerender for 3DS Max
Introducing Firerender for 3DS MaxTakahiro Harada
 
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRaysTakahiro Harada
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Takahiro Harada
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Takahiro Harada
 
Foveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUsFoveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUsTakahiro Harada
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Takahiro Harada
 
Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)Takahiro Harada
 
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...Takahiro Harada
 
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)Takahiro Harada
 
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)Takahiro Harada
 
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)Takahiro Harada
 

Mehr von Takahiro Harada (16)

201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019201907 Radeon ProRender2.0@Siggraph2019
201907 Radeon ProRender2.0@Siggraph2019
 
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
[2018 GDC] Real-Time Ray-Tracing Techniques for Integration into Existing Ren...
 
Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)
Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)
Introduction to OpenCL (Japanese, OpenCLたćŸș瀎)
 
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
[2017 GDC] Radeon ProRender and Radeon Rays in a Gaming Rendering Workflow
 
çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)
çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)
çąșçŽ‡çš„ăƒ©ă‚€ăƒˆă‚«ăƒȘăƒłă‚°ă€€ç†è«–ăšćźŸèŁ… (CEDEC2016)
 
Introducing Firerender for 3DS Max
Introducing Firerender for 3DS MaxIntroducing Firerender for 3DS Max
Introducing Firerender for 3DS Max
 
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
[2016 GDC] Multiplatform GPU Ray-Tracing Solutions With FireRender and FireRays
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
 
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
Introduction to Bidirectional Path Tracing (BDPT) & Implementation using Open...
 
Foveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUsFoveated Ray Tracing for VR on Multiple GPUs
Foveated Ray Tracing for VR on Multiple GPUs
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
 
Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)Physics Tutorial, GPU Physics (GDC2010)
Physics Tutorial, GPU Physics (GDC2010)
 
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
Using GPUs for Collision detection, Recent Advances in Real-Time Collision an...
 
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
Heterogeneous Particle based Simulation (SIGGRAPH ASIA 2011)
 
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
A Parallel Constraint Solver for a Rigid Body Simulation (SIGGRAPH ASIA 2011)
 
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
Introduction to Monte Carlo Ray Tracing (CEDEC 2013)
 

KĂŒrzlich hochgeladen

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel AraĂșjo
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

KĂŒrzlich hochgeladen (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

A 2.5D Culling for Forward+ (SIGGRAPH ASIA 2012)

  • 1. A 2.5D CULLING FOR FORWARD+ AMD Takahiro Harada
  • 2. 2 | A 2.5D culling for Forward+ | Takahiro Harada AGENDA ï‚§â€ŻForward+ –  Forward, Deferred, Forward+ –  Problem description ï‚§â€Ż2.5D culling ï‚§â€ŻResults
  • 3. 3 | A 2.5D culling for Forward+ | Takahiro Harada FORWARD+
  • 4. 4 | A 2.5D culling for Forward+ | Takahiro Harada REAL-TIME SOLUTION COMPARISON ï‚§â€ŻRendering equation ï‚§â€ŻForward ï‚§â€ŻDeferred ï‚§â€ŻForward+
  • 5. 5 | A 2.5D culling for Forward+ | Takahiro Harada FORWARD RENDERING PIPELINE ï‚§â€ŻDepth prepass –  Fills z buffer ï‚§â€ŻPrevent overdraw for shading ï‚§â€ŻShading –  Geometry is rendered –  Pixel shader ï‚§â€ŻIterate through light list set for each object ï‚§â€ŻEvaluates materials for the lights
  • 6. 6 | A 2.5D culling for Forward+ | Takahiro Harada FORWARD+ RENDERING PIPELINE ï‚§â€ŻDepth prepass –  Fills z buffer ï‚§â€ŻPrevent overdraw for shading ï‚§â€ŻUsed for pixel position reconstruction for light culling ï‚§â€ŻLight culling –  Culls light per tile basis –  Input: z buffer, light buffer –  Output: light list per tile ï‚§â€ŻShading –  Geometry is rendered –  Pixel shader ï‚§â€ŻIterate through light list calculated in light culling ï‚§â€ŻEvaluates materials for the lights 1 2 3 [1,2,3][1] [2,3]
  • 7. 7 | A 2.5D culling for Forward+ | Takahiro Harada CREATING A FRUSTUM FOR A TILE ï‚§â€ŻAn edge @SS == A plane @VS ï‚§â€ŻA tile (4 edges) @SS == 4 planes @VS –  Open frustum (no bound in Z direction) ï‚§â€ŻMax and min Z is used to cap
  • 8. 8 | A 2.5D culling for Forward+ | Takahiro Harada LONG FRUSTUM ï‚§â€ŻScreen space culling is not always sufficient –  Create a frustum from max and min depth values –  Edge of objects –  Captures a lot of unnecessary lights
  • 9. 9 | A 2.5D culling for Forward+ | Takahiro Harada LONG FRUSTUM ï‚§â€ŻScreen space culling is not always sufficient –  Create a frustum from max and min depth values –  Edge of objects –  Captures a lot of unnecessary lights 0 lights 25 lights 50 lights
  • 10. 10 | A 2.5D culling for Forward+ | Takahiro Harada GET WORSE IN A COMPLEX SCENE 0 lights 100 lights 200 lights
  • 11. 11 | A 2.5D culling for Forward+ | Takahiro Harada QUESTION ï‚§â€ŻWant to reduce false positives ï‚§â€ŻCan we improve the culling without adding much overhead? –  Computation time, memory –  Culling itself is an optimization –  Spending a lot of resources for it does not make sense ï‚§â€ŻUsing a 3D grid is a natural extension –  Uses too much memory
  • 12. 12 | A 2.5D culling for Forward+ | Takahiro Harada 2.5D CULLING
  • 13. 13 | A 2.5D culling for Forward+ | Takahiro Harada 2.5D CULLING ï‚§â€ŻAdditional memory usage –  0B global memory –  4B local memory per WG (can compress more if you want) ï‚§â€ŻAdditional computation complexity –  A few bit and arithmetic instructions –  A few lines of codes for light culling –  No changes for other stages ï‚§â€ŻAdditional runtime overhead –  < 10% compared to the original light culling
  • 14. 14 | A 2.5D culling for Forward+ | Takahiro Harada IDEA ï‚§â€ŻSplit frustum in z direction –  Uniform split for a frustum –  Varying split among frustums (a) (b)
  • 15. 15 | A 2.5D culling for Forward+ | Takahiro Harada FRUSTUM CONSTRUCTION ï‚§â€ŻCalculate depth bound –  max and min values of depth ï‚§â€ŻSplit depth direction into 32 cells –  Min value and cell size ï‚§â€ŻFlag occupied cell ï‚§â€ŻA 32bit depth mask per work group A tile
  • 16. 16 | A 2.5D culling for Forward+ | Takahiro Harada FRUSTUM CONSTRUCTION ï‚§â€ŻCalculate depth bound –  max and min values of depth ï‚§â€ŻSplit depth direction into 32 cells –  Min value and cell size ï‚§â€ŻFlag occupied cell ï‚§â€ŻA 32bit depth mask per work group 7 7 7 7 7 7 7 2 7 7 2 1 7 2 1 0 Depth mask = 11100001 A tile 0 1 2 3 4 5 6 7
  • 17. 17 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING ï‚§â€ŻIf a light overlaps to the frustum –  Calculate depth mask for the light –  Check overlap using the depth mask of the frustum ï‚§â€ŻDepth mask & Depth mask –  11100001 & 00011000 = 00000000 Depth mask = 11100001 Depth mask = 00011000
  • 18. 18 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING ï‚§â€ŻIf a light overlaps to the frustum –  Calculate depth mask for the light –  Check overlap using the depth mask of the frustum ï‚§â€ŻDepth mask & Depth mask –  11100001 & 00110000 = 00100000 Depth mask = 11100001 Depth mask = 00110000
  • 19. 19 | A 2.5D culling for Forward+ | Takahiro Harada CODE Original With 2.5D culling
  • 20. 20 | A 2.5D culling for Forward+ | Takahiro Harada RESULTS
  • 21. 21 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING
  • 22. 22 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING + 2.5D CULLING
  • 23. 23 | A 2.5D culling for Forward+ | Takahiro Harada COMPARISON 1" 10" 100" 1000" 10000" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17" 18" 19" 20" 21" 22" 23" Number'of'*les' Number'of'lights'(x10)' With"2.5D"culling" Without"2.5D"culling" 220 lights/frustum -> 120 lights/frustum
  • 24. 24 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING
  • 25. 25 | A 2.5D culling for Forward+ | Takahiro Harada LIGHT CULLING + 2.5D CULLING
  • 26. 26 | A 2.5D culling for Forward+ | Takahiro Harada COMPARISON 1" 10" 100" 1000" 10000" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" 17" Number'of'*les' Number'of'lights'(x10)' With"2.5D"culling" Without"2.5D"culling"
  • 27. 27 | A 2.5D culling for Forward+ | Takahiro Harada PERFORMANCE 0" 1" 2" 3" 4" 5" 6" 1024" 2048" 3072" 4096" !me$(ms)$ Number$of$lights$ Forward+"w."frustum"culling" Forward+"w."2.5D" Deferred"
  • 28. 28 | A 2.5D culling for Forward+ | Takahiro Harada CONCLUSION ï‚§â€ŻProposed 2.5D culling which –  Additional memory usage ï‚§â€Ż 0B global memory ï‚§â€Ż 4B local memory per WG (can compress more if you want) –  Additional compute complexity ï‚§â€Ż 3 lines of pseudo codes for light culling ï‚§â€Ż No changes for other stages –  Additional runtime overhead ï‚§â€Ż < 10% compared to the original light culling ï‚§â€ŻShowed that 2.5D culling reduces false positives