Magnus Andersson
2
Occlusion Culling
Stanford Bunny in the Crytek Sponza Atrium
Eye
View frustum
3
Occlusion Culling
Stanford Bunny in the Crytek Sponza Atrium
 Fully occluded
4
Occlusion Culling
Stanford Bunny in the Crytek Sponza Atrium
 Partially occluded
Pixel processing
Geometry processing
Draw call
5
Hardware Fixed-function Occlusion Culling
 Handled automatically under the hood
 Per-tile culling granularity
– Semi-occluded triangles can be
partially culled
 Very late in the pipeline
Upload frame data
Game logic
Z Tile Culling
CPU side / GPU side
CPU side / GPU side
Game logic +
Pixel processing
Geometry processing
Draw call
Upload frame data
Z Tile Culling
SW culling
6
Software Occlusion Culling
 Cull very early in the pipeline
– Cull both CPU and GPU work
 Short delay
– Can be integrated with scene traversal
7
 Binary Space Partitioning (BSP)
trees & portals
 Precomputed – very efficient
 Scene (occluders) must be static
 Difficult to handle general
scenes
Potentially Visible Sets (PVS)
Quake II, id Software, 1997
Half-Life 2, Valve Corporation, 2004
8
Potentially Visible Sets (PVS)
Quake II, id Software, 1997
Half-Life 2, Valve Corporation, 2004
Player
Not part of PVS
Leaf boundaries
9
 Increasingly popular
 Modern games have more
complex and dynamic worlds
 No complex pre-computation
– Simpler content pipeline
Dynamic Occlusion Culling
Assassin’s Creed Unity, Ubisoft, 2014
Battlefield 4, EA DICE, 2013
[HA15]
[Col11]
10
Hierarchical Z Buffer (HiZ) [Greene93]
 Rasterize to full resolution z buffer
 Create HiZ buffer
– Find the maximum depth in each NxN tile
 Perform occlusion query with HiZ buffer
 General algorithm works for both SW
and HW occlusion culling
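The HiZ steps listed above can be sketched in scalar C++. This is only an illustration under stated assumptions: the `HiZBuffer` type and both function names are hypothetical, depth grows with distance, depth values are non-negative, and the buffer size is a multiple of the tile size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration; not taken from any library.
struct HiZBuffer {
    int tilesX = 0, tilesY = 0, tileSize = 0;
    std::vector<float> zMax;  // one maximum (farthest) depth per tile
};

// Build a one-level HiZ buffer: each NxN tile stores the maximum depth of its pixels.
HiZBuffer BuildHiZ(const std::vector<float>& depth, int width, int height, int tileSize) {
    assert(width % tileSize == 0 && height % tileSize == 0);
    HiZBuffer hiz;
    hiz.tileSize = tileSize;
    hiz.tilesX = width / tileSize;
    hiz.tilesY = height / tileSize;
    hiz.zMax.assign(static_cast<std::size_t>(hiz.tilesX) * hiz.tilesY, 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float& tileMax = hiz.zMax[(y / tileSize) * hiz.tilesX + (x / tileSize)];
            tileMax = std::max(tileMax, depth[static_cast<std::size_t>(y) * width + x]);
        }
    }
    return hiz;
}

// Conservative query for a pixel rectangle [x0, x1] x [y0, y1] (inclusive, inside the buffer)
// whose nearest depth is zMin. The rectangle is reported as occluded only if it lies behind
// the farthest occluder depth of every tile it touches.
bool IsRectOccluded(const HiZBuffer& hiz, int x0, int y0, int x1, int y1, float zMin) {
    for (int ty = y0 / hiz.tileSize; ty <= y1 / hiz.tileSize; ++ty)
        for (int tx = x0 / hiz.tileSize; tx <= x1 / hiz.tileSize; ++tx)
            if (zMin <= hiz.zMax[ty * hiz.tilesX + tx])
                return false;  // possibly visible in this tile
    return true;
}
```

Because a tile keeps only its maximum depth, a single uncovered pixel keeps the whole tile at the far value, which is what makes the query conservative.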
Z-buffer Based Culling
Full resolution depth buffer
HiZ buffer
Complex
object
Bounding
shape
Dragon model courtesy of Stanford University Computer Graphics Laboratory
11
Intel Software Occlusion Culling Framework
[CMK16]
Algorithm phases:
1. Rasterize a few designated occluder
objects to z buffer
– Heavily SSE/AVX optimized
– Parallel triangle setup
– Parallel pixel depth computation
2. Compute 1-level HiZ buffer (and throw
away z buffer)
3. Perform queries and render surviving
objects
12
 Rendering to z-buffer per pixel
 Updating HiZ tile needs all pixels within
the tile
 Occlusion Query per tile
 Wouldn’t it be nice to compute HiZ
directly?
– Being conservative is the only requirement
 Idea: use alternative HiZ representation
Z-buffer Based Culling
Full resolution depth buffer
HiZ buffer
13
Alternative HiZ buffer representation
Masked Occlusion Culling for Graphics Hardware [AHAM15]
 Two depth values per tile
 Per-pixel selection mask
z_max^0
z_max^1
Layer selection mask
0 0 0 1
0 0 1 1
0 0 1 1
0 1 1 1
0 0 0 0
0 0 0 0
0 0 0 1
0 0 0 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
0 0 0 1
0 0 1 1
0 0 0 1
0 0 0 1
14
Masked Occlusion Culling [AHAM15]
15
Masked Occlusion Culling [AHAM15]
16
Masked Occlusion Culling [AHAM15]
17
Masked Occlusion Culling [AHAM15]
18
Masked Occlusion Culling [AHAM15]
19
Masked Occlusion Culling [AHAM15]
20
Masked Occlusion Culling [AHAM15]
Merge
?
21
Masked Occlusion Culling [AHAM15]
22
Masked Occlusion Culling [AHAM15]
Culled / Not culled
23
Masked Occlusion Culling [AHAM15]
Triangle meshes
24
 Originally designed for graphics hardware
 Directly update HiZ buffer without
computing a full res z buffer
 Decouples coverage sampling
(rasterization) and depth computation
Masked Occlusion Culling [AHAM15]
Approximate, conservative HiZ buffer
Depth buffer
25
Masked Software Occlusion Culling
Could Masked Occlusion Culling [AHAM15] be really fast for software
occlusion culling?
 Much less memory to read/write than full res z-buffer
 Updates use bitmasks – can process many pixels in parallel (i.e. SSE/AVX)
 No need to compute per-pixel depths
– Would need a fast SW rasterizer to compute coverage
Turns out it can :)
 Paper presented at High Performance Graphics this year [HAAM16]
 Source code available!
26
Single Instruction, Multiple Data (SIMD)
[Figure: one add instruction on x86 vs. AVX; lane order as extracted]
x86 (one 32-bit value): A = 2, B = 5, A + B = 7
AVX (256 bits, eight 32-bit lanes):
A = 3 3 5 6 | 4 1 4 10
B = 5 5 7 3 | 5 11 4 5
A + B = 8 8 12 9 | 9 12 8 15
27
Single Instruction, Multiple Data (SIMD)
x86 (32 bits):
A = 0xAC1DBA5E
B = 0x51CAFE37
A & B = 0x0008BA16
AVX (256 bits):
A = 0xAC1DBA5EAC1DBA5EAC1DBA5EAC1DBA5E51CAFE3751CAFE3751CAFE3751CAFE37
B = 0x51CAFE3751CAFE3751CAFE3751CAFE37AC1DBA5EAC1DBA5EAC1DBA5EAC1DBA5E
A & B = 0x0008BA160008BA160008BA160008BA160008BA160008BA160008BA160008BA16
New algorithm target architecture
Supported in our library code
Easily extended to AVX-512
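The lane-wise add and AND from the two SIMD slides can be tried out with real intrinsics. The sketch below is an illustration, not the library's code: it uses the 128-bit SSE2 forms (`_mm_add_epi32`, `_mm_and_si128`) on four of the slide's lanes so that it builds on any x86-64 compiler without extra flags, and falls back to a scalar loop elsewhere. The 256-bit AVX2 counterparts are `_mm256_add_epi32` and `_mm256_and_si256`. `Lanes4`, `AddLanes`, `AndLanes` and the `SKETCH_HAS_SSE2` macro are names made up for this example.

```cpp
#include <array>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1
#endif

using Lanes4 = std::array<std::uint32_t, 4>;  // four 32-bit lanes = 128 bits

// Adds all four lanes with a single instruction when SSE2 is available.
Lanes4 AddLanes(const Lanes4& a, const Lanes4& b) {
    Lanes4 r{};
#ifdef SKETCH_HAS_SSE2
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a.data()));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b.data()));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(r.data()), _mm_add_epi32(va, vb));
#else
    for (int i = 0; i < 4; ++i) r[i] = a[i] + b[i];  // scalar fallback, one lane at a time
#endif
    return r;
}

// Bitwise AND of all 128 bits at once; the lane boundaries do not matter here.
Lanes4 AndLanes(const Lanes4& a, const Lanes4& b) {
    Lanes4 r{};
#ifdef SKETCH_HAS_SSE2
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a.data()));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b.data()));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(r.data()), _mm_and_si128(va, vb));
#else
    for (int i = 0; i < 4; ++i) r[i] = a[i] & b[i];
#endif
    return r;
}
```

The bitwise case is the one that matters for the masked algorithm: a coverage mask stores one pixel per bit, so a single AND on a 256-bit register touches 256 pixels.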
28
An abridged history of Intel’s SIMD instruction sets
SSE, 1999
128b wide
SSE2, 2001
SSE4, 2006
Intel® microarchitecture code name Nehalem
AVX, 2011
256b wide
2nd Gen Intel® Core™ Processors
AVX2, 2013
4th Gen Intel® Core™ Processors
AVX-512, 2016
512b wide
1998 2017
Masked software occlusion culling
30
Algorithm Overview
Update
Depth test
Compute coverage
Traversal setup
Depth plane
Compute bounds
Clip
Transform
8-wide triangle setup
8 scanlines
256 pixels (8 tiles with 8x4 pixels)
Tile
traversal
Triangle
setup
31
Transform and Clip
32
Compute Bounding Box
 Padded to 32x8 pixel supertiles
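Padding a bounding box to 32x8 supertiles amounts to rounding its minimum down and its maximum up to the supertile grid. A minimal sketch, assuming a half-open pixel rectangle and a render target whose size is a multiple of the supertile size; `PixelRect` and `PadToSuperTiles` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>

struct PixelRect { int x0, y0, x1, y1; };  // half-open: [x0, x1) x [y0, y1)

constexpr int kSuperTileW = 32;  // supertile width in pixels
constexpr int kSuperTileH = 8;   // supertile height in pixels

// Clamp the box to the screen, then snap it outwards to supertile boundaries.
PixelRect PadToSuperTiles(PixelRect r, int width, int height) {
    assert(width % kSuperTileW == 0 && height % kSuperTileH == 0);
    r.x0 = std::max(r.x0, 0);
    r.y0 = std::max(r.y0, 0);
    r.x1 = std::min(r.x1, width);
    r.y1 = std::min(r.y1, height);
    // Both sizes are powers of two, so rounding is a bit mask on non-negative values.
    r.x0 &= ~(kSuperTileW - 1);
    r.y0 &= ~(kSuperTileH - 1);
    r.x1 = (r.x1 + kSuperTileW - 1) & ~(kSuperTileW - 1);
    r.y1 = (r.y1 + kSuperTileH - 1) & ~(kSuperTileH - 1);
    return r;
}
```

With the box aligned this way, traversal can always process whole supertiles, one 256-bit register at a time.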
33
Compute Depth Plane
 Depth = ax + by + c
– Conservative tile depth: Check sign of a and b
– Can be incrementally updated
-, - +, -
-, + +, +
Clamp to vertex depths
+ a
+ b
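The sign check on a and b can be sketched as follows: the largest plane depth inside a tile is found at the corner that lies in the direction of increasing depth, and the result is clamped to the triangle's vertex depths. This is an illustrative sketch, assuming depth grows with distance; `DepthPlane` and `TileMaxDepth` are hypothetical names.

```cpp
#include <algorithm>

struct DepthPlane { float a, b, c; };  // depth(x, y) = a*x + b*y + c

// Conservative (maximum) depth of the plane over the tile
// [tileX, tileX + tileW] x [tileY, tileY + tileH].
float TileMaxDepth(const DepthPlane& p, float tileX, float tileY,
                   float tileW, float tileH, float triZMin, float triZMax) {
    // The signs of a and b select the corner where the plane is farthest.
    const float x = tileX + (p.a > 0.0f ? tileW : 0.0f);
    const float y = tileY + (p.b > 0.0f ? tileH : 0.0f);
    const float z = p.a * x + p.b * y + p.c;
    // The plane extends beyond the triangle, so clamp to the vertex depth range.
    return std::min(std::max(z, triZMin), triZMax);
}
```

Since the plane is linear, stepping one tile to the right changes the unclamped value by exactly `a * tileW`, which is what allows the incremental update mentioned on the slide.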
34
Supertile Traversal Order
35
AVX Register Layout
36
AVX Register Layout
 One scanline per SIMD lane
 Compute slopes (∆y/∆x) once
– Similar to regular scanline rasterizers
37
Edge Slopes
38
Compute Intersections
 Compute intersections for each scanline
– Eight scanlines in parallel using AVX
39
Compute Coverage Mask
 Start with full coverage mask
40
Compute Coverage Mask
>>
>>
>>
>>
>>
>>
>>
>>
 Start with full coverage mask
– Shift each lane (scanline) to intersection
– AVX2 and later have a per-lane shift instruction
41
Compute Coverage Mask
 Repeat the same process for the next edge
Left edge
Right edge
Right edge
42
Compute Coverage Mask
 Repeat the same process for the next edge
– Edge is facing right → invert mask
43
Compute Coverage Mask
 Combine masks of all overlapping edges
44
Compute Coverage Mask
 Combine masks of all overlapping edges
– Using bitwise AND
45
Compute Coverage Mask
 Combine masks of all overlapping edges
– Using bitwise AND
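The coverage steps from the preceding slides (start from a full mask, shift each scanline to its edge intersection, invert for right-facing edges, AND the edges together) can be modelled in scalar code with one 32-bit word per scanline. This is a sketch under assumptions, not the library's rasterizer: the leftmost pixel sits in the most significant bit so that the shift is a right shift as drawn on the slide, intersections are truncated to whole pixels, and `Edge`, `EdgeMask` and `CoverageMask` are hypothetical names. With AVX2 the inner loop could map to one per-lane shift such as `_mm256_srlv_epi32`.

```cpp
#include <array>
#include <cstdint>
#include <vector>

constexpr int kLanes = 8;  // eight scanlines, one 32-pixel scanline per SIMD lane
using LaneMask = std::array<std::uint32_t, kLanes>;

// Full mask shifted to column x: covers the pixels at columns >= x.
std::uint32_t MaskFromColumn(int x) {
    if (x <= 0) return 0xFFFFFFFFu;
    if (x >= 32) return 0u;
    return 0xFFFFFFFFu >> x;
}

struct Edge {
    float x0;             // intersection with the first scanline, in pixels from the left
    float dxdy;           // change of the intersection per scanline (edge slope)
    bool interiorOnLeft;  // true for an edge facing right: the mask is inverted
};

// One mask per scanline for a single edge.
LaneMask EdgeMask(const Edge& e) {
    LaneMask m{};
    for (int lane = 0; lane < kLanes; ++lane) {
        const int x = static_cast<int>(e.x0 + e.dxdy * static_cast<float>(lane));
        const std::uint32_t right = MaskFromColumn(x);
        m[lane] = e.interiorOnLeft ? ~right : right;
    }
    return m;
}

// Triangle coverage: bitwise AND of the masks of all overlapping edges.
LaneMask CoverageMask(const std::vector<Edge>& edges) {
    LaneMask cov;
    cov.fill(0xFFFFFFFFu);
    for (const Edge& e : edges) {
        const LaneMask m = EdgeMask(e);
        for (int lane = 0; lane < kLanes; ++lane) cov[lane] &= m[lane];
    }
    return cov;
}
```

No per-pixel inside test is needed: each edge costs one shift (and possibly one inversion) per register, and combining edges costs one AND.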
46
Shuffle Mask
 Shuffle mask to form better shaped tiles
– Before: each SIMD lane is a scanline
47
Shuffle Mask
 Shuffle mask to form better shaped tiles
– Before: each SIMD lane is a scanline
– After: each SIMD lane is an 8x4 tile
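The shuffle is essentially a byte transpose: a 32x8 supertile stored as eight 32-pixel scanlines becomes eight 8x4 tiles (four across, two down). The scalar sketch below illustrates the idea under assumed bit conventions (leftmost pixel in the most significant bit of a scanline, tile row r stored in byte r of the tile word); the library's actual layout may differ. A SIMD version could plausibly do the same rearrangement with a byte shuffle such as `_mm256_shuffle_epi8`, since each tile only gathers bytes from four scanlines in the same 128-bit half.

```cpp
#include <array>
#include <cstdint>

using Lanes8 = std::array<std::uint32_t, 8>;

// Input: scanlines[y] holds 32 pixels of row y (y = 0..7).
// Output: tiles[ty * 4 + tx] holds the 8x4 tile at tile column tx (0..3), tile row ty (0..1).
Lanes8 ShuffleToTiles(const Lanes8& scanlines) {
    Lanes8 tiles{};
    for (int ty = 0; ty < 2; ++ty) {
        for (int tx = 0; tx < 4; ++tx) {
            std::uint32_t tile = 0;
            for (int row = 0; row < 4; ++row) {
                // Pick the 8 pixels of this tile column from the scanline...
                const std::uint32_t byte = (scanlines[ty * 4 + row] >> (24 - 8 * tx)) & 0xFFu;
                // ...and store them as row `row` of the tile.
                tile |= byte << (8 * row);
            }
            tiles[ty * 4 + tx] = tile;
        }
    }
    return tiles;
}
```

After the shuffle, one lane corresponds to one compact 8x4 tile, so a single depth value per lane describes a small screen region instead of a long thin strip.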
48
Depth Test
 Interpolate conservative depth (per 8x4 tile)
 Test against buffer
49
Update Tile
 Two code paths (selectable at compile time)
– Original update method [AHAM15]
– New update method tailored for SW [HAAM16]
 Why use a new update method?
– Faster – same culling power
– Less accurate than original, more dependent on render order
– Works best if you render front-to-back
50
Update Tile, New Method [HAAM16]
 z_max^0 is the reference layer
– Maximum value for the entire tile
 z_max^1 is the working layer
– Maximum value for a subset of the tile
– Updated as
– New depth = max(z_max^1, z_max^tri)
– New mask = TriangleMask OR LayerMask
 Whenever the working layer mask is full, overwrite the reference layer
51
Update Tile
52
Update Tile
53
Update Tile
54
Update Tile
55
Update Tile
 Discard heuristic: if z_max^1 – z_max^tri > z_max^0 – z_max^1, discard the working layer
56
Update Tile
Restart
57
Update Tile
58
Update Tile
59
Update Tile
Full overwrite:
Restart from new value
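The update steps walked through on the preceding slides (discard heuristic, merge into the working layer, overwrite the reference layer when the mask is full) can be condensed into a scalar sketch for a single 8x4 tile. This is an illustration under assumptions rather than the library's code: depth grows with distance, the far value is 1, the triangle has already passed the depth test, and `MaskedTile`, `UpdateTile` and `IsOccluded` are hypothetical names. As a small deviation from a plain overwrite, the sketch takes the minimum of the two layers, which gives the same result whenever the working layer is nearer than the reference layer.

```cpp
#include <algorithm>
#include <cstdint>

struct MaskedTile {
    float zMax0 = 1.0f;      // reference layer: maximum depth for the entire tile
    float zMax1 = 0.0f;      // working layer: maximum depth for the pixels in `mask`
    std::uint32_t mask = 0;  // which of the 32 pixels belong to the working layer
};

// Merge one triangle (conservative max depth + coverage mask) into the tile.
void UpdateTile(MaskedTile& t, float triZMax, std::uint32_t triMask) {
    if (triMask == 0) return;
    // Discard heuristic: the triangle is much nearer than the working layer,
    // so restart the working layer from the triangle.
    if (t.zMax1 - triZMax > t.zMax0 - t.zMax1) {
        t.zMax1 = 0.0f;
        t.mask = 0;
    }
    // Merge the triangle into the working layer.
    t.zMax1 = std::max(t.zMax1, triZMax);
    t.mask |= triMask;
    // Working layer covers the whole tile: it becomes the new reference layer.
    if (t.mask == 0xFFFFFFFFu) {
        t.zMax0 = std::min(t.zMax0, t.zMax1);
        t.zMax1 = 0.0f;
        t.mask = 0;
    }
}

// Occlusion test against the reference layer only.
bool IsOccluded(const MaskedTile& t, float zMin) {
    return zMin > t.zMax0;
}
```

The reference layer only moves closer once occluders jointly cover all 32 pixels, which is why the result depends on render order and benefits from front-to-back submission.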
60
Update Tile
 Update is quicker than original [AHAM15]
 Test is also quicker
– Only need to test against the reference layer (z_max^0)
62
Results
Intel Occlusion Culling Sample
 Clear: Clearing the depth buffer
 Geom: Transform & project geometry
 Rast: Triangle setup & occluder rasterization
 Gen: Compute HiZ buffer from full resolution z buffer
 Test: Perform occlusion queries
3.7x, 16x
(μs)
Old [CMK16]
New [HAAM16]
63
Performance comparison for camera animation
Results
First frame
Last frame
Old / New / Frustum only
Code is available as open-source
65
Masked Occlusion Culling API
Setup:
void SetResolution();
void SetNearClipPlane();
void ClearBuffer();
Render & query:
static void TransformVertices();
Result RenderTriangles();
Result TestTriangles();
Result TestRect();
Debug:
void ComputePixelDepthBuffer();
OcclusionCullingStatistics GetStatistics();
66
Masked Occlusion Culling API
 Render to the software HiZ buffer
Result RenderTriangles(
    float *inVtx,          // Clip space vertex positions
    uint *inTris,          // Index array (indices into the inVtx buffer)
    int nTris,             // Triangle count (the number of index triplets in inTris)
    ClipPlanes mask,       // Mask for potential frustum bound overlap
    ScissorRect *scissor,  // Scissor region
    VertexLayout &layout   // Vertex format of inVtx. There is a fast path for AoS with (x, y, z, w) coordinates
);
67
Masked Occlusion Culling API
Result RenderTriangles(
float *inVtx,
uint *inTris,
int nTris,
ClipPlanes mask,
ScissorRect *scissor,
VertexLayout &layout
);
Eye
View frustum
Near plane
mask = 0
mask = leftPlane | nearPlane
 Clipping is not free...
– If you’re already doing frustum culling, let the API know the outcome
68
Masked Occlusion Culling API
Result RenderTriangles(
float *inVtx,
uint *inTris,
int nTris,
ClipPlanes mask,
ScissorRect *scissor,
VertexLayout &layout
);
Eye
View frustum
Scissor region
(screen space AABB)
 Can be used for threading
– One scissor region per thread
69
Masked Occlusion Culling API
 Test triangles against the software HiZ buffer
– Does not update the buffer
// Returns the collective culling outcome of the triangles
Result TestTriangles(
    float *inVtx,          // Clip space vertex positions
    uint *inTris,          // Index array (indices into the inVtx buffer)
    int nTris,             // Triangle count (the number of index triplets in inTris)
    ClipPlanes mask,       // Mask for potential frustum bound overlap
    ScissorRect *scissor,  // Scissor region
    VertexLayout &layout   // Vertex format of inVtx. There is a fast path for AoS with (x, y, z, w) coordinates
);
70
Masked Occlusion Culling API
 Test rectangle against the software HiZ buffer
– Does not update the buffer
// Returns the culling outcome of the screen space rectangle
Result TestRect(
    float xmin,  // Screen space bounds:
    float ymin,  // [xmin, ymin] – [xmax, ymax]
    float xmax,
    float ymax,
    float wmin   // Conservative clip space w (typically the w-component of the nearest bbox vertex in clip space)
);
71
Example use case: Scene Bounding Volume Hierarchy (BVH) traversal and culling
ClearBuffer();
prioQueue.push(root);
while (!prioQueue.empty()) {
    Node node = prioQueue.pop();  // nearest node first (front-to-back)
    if (FrustumTest(node) == Culled)
        continue;
    bounds = compute_screen_space_bounds(node);
    if (TestRect(bounds) == Culled)
        continue;  // hidden behind what has been rendered so far
    if (node is InnerNode) {
        prioQueue.push(node.left, dist);   // dist: camera distance of the child
        prioQueue.push(node.right, dist);
    } else {  // node is Leaf
        xfVertices = TransformVertices(node.vertices);
        RenderTriangles(xfVertices);  // the leaf becomes an occluder
        send_leaf_to_GPU(node);
    }
}
RenderFrame
Culled!
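The compute_screen_space_bounds step in the traversal above is not spelled out on the slides. One possible sketch, assuming the eight bounding-box corners are already in clip space and that the rectangle is expressed in normalized device coordinates in [-1, 1] (an assumption; check the library documentation for the expected coordinate space). `Vec4`, `RectQuery` and `ComputeScreenSpaceBounds` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <limits>

struct Vec4 { float x, y, z, w; };

struct RectQuery {
    float xmin, ymin, xmax, ymax;  // projected bounds
    float wmin;                    // conservative (nearest) clip space w
    bool valid;                    // false: box reaches behind the eye, treat as visible
};

RectQuery ComputeScreenSpaceBounds(const std::array<Vec4, 8>& clipCorners) {
    const float inf = std::numeric_limits<float>::infinity();
    RectQuery q{inf, inf, -inf, -inf, inf, true};
    for (const Vec4& c : clipCorners) {
        if (c.w <= 0.0f) {  // projection is undefined for this corner
            q.valid = false;
            return q;
        }
        const float invW = 1.0f / c.w;
        q.xmin = std::min(q.xmin, c.x * invW);
        q.xmax = std::max(q.xmax, c.x * invW);
        q.ymin = std::min(q.ymin, c.y * invW);
        q.ymax = std::max(q.ymax, c.y * invW);
        q.wmin = std::min(q.wmin, c.w);  // nearest corner gives the conservative w
    }
    // Clamp to the visible range.
    q.xmin = std::max(q.xmin, -1.0f);
    q.ymin = std::max(q.ymin, -1.0f);
    q.xmax = std::min(q.xmax, 1.0f);
    q.ymax = std::min(q.ymax, 1.0f);
    return q;
}
```

When `valid` is true, the five values would be passed on to TestRect; when it is false, the node is simply treated as not culled, which keeps the test conservative.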
72
Essential Tools We Have Relied On
 Intel® VTune™
– https://software.intel.com/en-us/intel-vtune-amplifier-xe
 SSE/AVX intrinsics guide
– https://software.intel.com/sites/landingpage/IntrinsicsGuide/
73
References
[AHAM15] ANDERSSON M., HASSELGREN J., AKENINE-MÖLLER T.: Masked Depth Culling for Graphics Hardware. ACM Transactions on Graphics 34, 6 (2015), pp. 188:1–188:9
[CMK16] CHANDRASEKARAN C., MCNABB D., KUAH K., FAUCONNEAU M., GIESEN F.: Software Occlusion Culling. Published online at: https://software.intel.com/en-us/articles/software-occlusion-culling, (2013–2016)
[Col11] COLLIN D.: Culling the Battlefield. Game Developer’s Conference (presentation), (2011)
[Greene93] GREENE N., KASS M., MILLER G.: Hierarchical Z-Buffer Visibility. In Proceedings of SIGGRAPH, (1993), pp. 231–238
[HA15] HAAR U., AALTONEN S.: GPU-Driven Rendering Pipelines. SIGGRAPH Advances in Real-Time Rendering in Games course, (2015)
[HAAM16] HASSELGREN J., ANDERSSON M., AKENINE-MÖLLER T.: Masked Software Occlusion Culling. High Performance Graphics, (2016)
74
Check it out!
 GitHub: Lightweight library
– https://github.com/GameTechDev/MaskedOcclusionCulling
 GitHub: Example integrated in Intel’s Software Occlusion Culling demo
– https://github.com/GameTechDev/OcclusionCulling
 Project page: Masked Software Occlusion Culling
– https://software.intel.com/en-us/articles/masked-software-occlusion-culling
 Questions and feedback welcome
– magnus.andersson@intel.com
Legal Notices and Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system
configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other
sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit
http://www.intel.com/performance.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are
measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For
more complete information visit http://www.intel.com/performance.
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide
cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel
representative to obtain the latest forecast, schedule, specifications and roadmaps.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and
uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.
All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. The products described may contain
design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data
are accurate.
© 2016 Intel Corporation. Intel, the Intel logo, VTune and others are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Masked Software Occlusion Culling

Weitere ähnliche Inhalte

Was ist angesagt?

Rendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb RaiderRendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb RaiderEidos-Montréal
 
Terrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable SystemTerrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable SystemElectronic Arts / DICE
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologyTiago Sousa
 
Oit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsOit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsHolger Gruen
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesNarann29
 
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunFive Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunElectronic Arts / DICE
 
Hill Stephen Rendering Tools Splinter Cell Conviction
Hill Stephen Rendering Tools Splinter Cell ConvictionHill Stephen Rendering Tools Splinter Cell Conviction
Hill Stephen Rendering Tools Splinter Cell Convictionozlael ozlael
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasAMD Developer Central
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Ki Hyunwoo
 
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Johan Andersson
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Johan Andersson
 
Volumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the FallenVolumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the FallenBenjamin Glatzel
 
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)Philip Hammer
 
A Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemA Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemBo Li
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3guest11b095
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsElectronic Arts / DICE
 
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...Johan Andersson
 

Was ist angesagt? (20)

Rendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb RaiderRendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb Raider
 
Terrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable SystemTerrain in Battlefield 3: A Modern, Complete and Scalable System
Terrain in Battlefield 3: A Modern, Complete and Scalable System
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics Technology
 
Oit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsOit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked Lists
 
OpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering TechniquesOpenGL 4.4 - Scene Rendering Techniques
OpenGL 4.4 - Scene Rendering Techniques
 
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunFive Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
 
Hill Stephen Rendering Tools Splinter Cell Conviction
Hill Stephen Rendering Tools Splinter Cell ConvictionHill Stephen Rendering Tools Splinter Cell Conviction
Hill Stephen Rendering Tools Splinter Cell Conviction
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1
 
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
Terrain Rendering in Frostbite using Procedural Shader Splatting (Siggraph 2007)
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
 
Volumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the FallenVolumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the Fallen
 
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
 
A Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemA Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering System
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-Graphics
 
Light prepass
Light prepassLight prepass
Light prepass
 
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 

Andere mochten auch

Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution
Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-ResolutionUltra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution
Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-ResolutionIntel® Software
 
Real-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPAReal-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPAIntel® Software
 
Real-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPAReal-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPAIntel® Software
 
Make your unity game faster, faster
Make your unity game faster, fasterMake your unity game faster, faster
Make your unity game faster, fasterIntel® Software
 
Looking at Machine Learning in Games
Looking at Machine Learning in GamesLooking at Machine Learning in Games
Looking at Machine Learning in GamesIntel® Software
 
Unity Optimization Tips, Tricks and Tools
Unity Optimization Tips, Tricks and ToolsUnity Optimization Tips, Tricks and Tools
Unity Optimization Tips, Tricks and ToolsIntel® Software
 
Intel Graphics Performance Analyzers (Intel GPA)
Intel Graphics Performance Analyzers (Intel GPA)Intel Graphics Performance Analyzers (Intel GPA)
Intel Graphics Performance Analyzers (Intel GPA)Intel® Software
 
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning AccelerationclCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning AccelerationIntel® Software
 
[Gpg1권 박민근] 4.8 가려진 객체의 제외 기법 (오브젝트 오클루젼 컬링)
[Gpg1권 박민근] 4.8 가려진 객체의 제외 기법 (오브젝트 오클루젼 컬링)[Gpg1권 박민근] 4.8 가려진 객체의 제외 기법 (오브젝트 오클루젼 컬링)
[Gpg1권 박민근] 4.8 가려진 객체의 제외 기법 (오브젝트 오클루젼 컬링)MinGeun Park
 
Visibility Optimization for Games
Visibility Optimization for GamesVisibility Optimization for Games
Visibility Optimization for GamesUmbra
 
High-Dynamic Range (HDR) Demystified
High-Dynamic Range (HDR) DemystifiedHigh-Dynamic Range (HDR) Demystified
High-Dynamic Range (HDR) DemystifiedIntel® Software
 
Narrative Fiction Storytelling in 360 Stereoscopic Panoramic VR: Old Techniqu...
Narrative Fiction Storytelling in 360 Stereoscopic Panoramic VR: Old Techniqu...Narrative Fiction Storytelling in 360 Stereoscopic Panoramic VR: Old Techniqu...
Narrative Fiction Storytelling in 360 Stereoscopic Panoramic VR: Old Techniqu...Intel® Software
 
Optimization Deep Dive: Unreal Engine 4 on Intel
Optimization Deep Dive: Unreal Engine 4 on IntelOptimization Deep Dive: Unreal Engine 4 on Intel
Optimization Deep Dive: Unreal Engine 4 on IntelIntel® Software
 
FrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteFrameGraph: Extensible Rendering Architecture in Frostbite
FrameGraph: Extensible Rendering Architecture in FrostbiteElectronic Arts / DICE
 
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
Masked Software Occlusion Culling

  • 2. Occlusion Culling. Figure: Stanford Bunny in the Crytek Sponza Atrium (labels: eye, view frustum).
  • 3. Occlusion Culling. Figure: Stanford Bunny in the Crytek Sponza Atrium, fully occluded.
  • 4. Occlusion Culling. Figure: Stanford Bunny in the Crytek Sponza Atrium, partially occluded.
  • 5. Hardware Fixed-function Occlusion Culling: handled automatically under the hood; per-tile culling granularity (semi-occluded triangles can be partially culled); happens very late in the pipeline. Diagram labels: CPU side, GPU side; game logic, upload frame data, draw call, geometry processing, Z tile culling, pixel processing.
  • 6. Software Occlusion Culling: cull very early in the pipeline (culls both CPU and GPU work); short delay (can be integrated with scene traversal). Diagram labels: CPU side, GPU side; game logic + SW culling, upload frame data, draw call, geometry processing, Z tile culling, pixel processing.
  • 7. Potentially Visible Sets (PVS): Binary Space Partitioning (BSP) trees and portals; precomputed, so very efficient; scene (occluders) must be static; difficult to handle general scenes. Examples: Quake II (id Software, 1997), Half-Life 2 (Valve Corporation, 2004).
  • 8. Potentially Visible Sets (PVS). Figure labels: player, not part of PVS, leaf boundaries. Examples: Quake II (id Software, 1997), Half-Life 2 (Valve Corporation, 2004).
  • 9. Dynamic Occlusion Culling: increasingly popular; modern games have more complex and dynamic worlds; no complex pre-computation (simpler content pipeline). Examples: Assassin's Creed Unity (Ubisoft, 2014), Battlefield 4 (EA DICE, 2013). References: [HA15], [Col11].
  • 10. Z-buffer Based Culling: Hierarchical Z Buffer (HiZ) [Greene93]. Rasterize to a full-resolution z-buffer; create the HiZ buffer by finding the maximum depth in each NxN tile; perform occlusion queries against the HiZ buffer. The general algorithm works for both SW and HW occlusion culling. Figure labels: full-resolution depth buffer, HiZ buffer, complex object, bounding shape. Dragon model courtesy of Stanford University Computer Graphics Laboratory.
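
The HiZ steps on slide 10 can be sketched in portable C++ as below. This is only an illustration under stated assumptions: the names `HiZBuffer`, `buildHiZ` and `isOccluded` are hypothetical, larger depth values are taken to mean "farther away", and the query rectangle is assumed to lie fully on screen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical container: one conservative (farthest) depth per NxN tile.
// Assumption: larger depth values are farther from the camera.
struct HiZBuffer {
    int tilesX = 0;
    int tilesY = 0;
    int tileSize = 0;
    std::vector<float> zmax;
};

// Reduce a full-resolution depth buffer to the maximum depth of each tile.
HiZBuffer buildHiZ(const std::vector<float>& depth, int width, int height, int tileSize) {
    assert(tileSize > 0 && width % tileSize == 0 && height % tileSize == 0);
    assert(depth.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    HiZBuffer hiz;
    hiz.tilesX = width / tileSize;
    hiz.tilesY = height / tileSize;
    hiz.tileSize = tileSize;
    hiz.zmax.assign(static_cast<std::size_t>(hiz.tilesX) * hiz.tilesY,
                    std::numeric_limits<float>::lowest());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float& tileMax = hiz.zmax[(y / tileSize) * hiz.tilesX + (x / tileSize)];
            tileMax = std::max(tileMax, depth[y * width + x]);
        }
    }
    return hiz;
}

// Occlusion query for a screen-space bounding rectangle (inclusive pixel
// coordinates) whose nearest depth is zmin. The object is reported occluded
// only if it lies behind the stored depth of every tile it overlaps.
bool isOccluded(const HiZBuffer& hiz, int x0, int y0, int x1, int y1, float zmin) {
    assert(0 <= x0 && x0 <= x1 && x1 < hiz.tilesX * hiz.tileSize);
    assert(0 <= y0 && y0 <= y1 && y1 < hiz.tilesY * hiz.tileSize);
    for (int ty = y0 / hiz.tileSize; ty <= y1 / hiz.tileSize; ++ty) {
        for (int tx = x0 / hiz.tileSize; tx <= x1 / hiz.tileSize; ++tx) {
            if (zmin <= hiz.zmax[ty * hiz.tilesX + tx]) {
                return false; // potentially visible in this tile
            }
        }
    }
    return true;
}
```

Storing the maximum depth per tile keeps the test conservative: a tile can only report "occluded" when the query is behind the farthest occluder pixel in that tile.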
  • 11. Intel Software Occlusion Culling Framework [CMK16]. Algorithm phases: (1) rasterize a few designated occluder objects to a z-buffer (heavily SSE/AVX optimized, parallel triangle setup, parallel pixel depth computation); (2) compute a 1-level HiZ buffer and throw away the z-buffer; (3) perform queries and render surviving objects.
  • 12. Z-buffer Based Culling: rendering to the z-buffer is per pixel; updating a HiZ tile needs all pixels within the tile; occlusion queries are per tile. Wouldn't it be nice to compute HiZ directly? Being conservative is the only requirement. Idea: use an alternative HiZ representation. Figure labels: full-resolution depth buffer, HiZ buffer.
  • 13. Alternative HiZ buffer representation, from Masked Occlusion Culling for Graphics Hardware [AHAM15]: two depth values per tile (zmax 0 and zmax 1) plus a per-pixel layer selection mask. Figure: example tiles with their layer selection masks (one bit per pixel).
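
A minimal sketch of the tile representation described on slide 13, written as portable C++. The struct layout, the names `MaskedTile`, `pixelZMax` and `coveredPixelsOccluded`, the 32-pixel tile (matching the 8x4 tiles mentioned later in the deck) and the "larger depth is farther" convention are all assumptions made for illustration, not the authors' code. The update and merge heuristic is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout for one 8x4 pixel tile: two layer depths plus one
// selection bit per pixel. Assumption: larger depth values are farther away.
struct MaskedTile {
    float zmax0;        // farthest depth of layer 0
    float zmax1;        // farthest depth of layer 1
    std::uint32_t mask; // bit i set: pixel i belongs to layer 1, otherwise layer 0
};

// Conservative depth stored for one pixel of the tile.
float pixelZMax(const MaskedTile& tile, int pixel) {
    assert(pixel >= 0 && pixel < 32);
    return ((tile.mask >> pixel) & 1u) ? tile.zmax1 : tile.zmax0;
}

// True if every pixel selected by `coverage` is hidden for geometry whose
// nearest depth in this tile is zmin. Works on whole bitmasks, never per pixel.
bool coveredPixelsOccluded(const MaskedTile& tile, std::uint32_t coverage, float zmin) {
    const std::uint32_t inLayer1 = coverage & tile.mask;
    const std::uint32_t inLayer0 = coverage & ~tile.mask;
    if (inLayer1 != 0u && zmin <= tile.zmax1) return false;
    if (inLayer0 != 0u && zmin <= tile.zmax0) return false;
    return true;
}
```

The point of the representation is visible in the test function: the depth comparison costs two scalar compares and two bitwise ANDs per tile, independent of how many pixels are covered.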
  • 20. Masked Occlusion Culling [AHAM15]. Figure label: Merge?
  • 22. Masked Occlusion Culling [AHAM15]. Figure labels: culled, not culled.
  • 23. Masked Occlusion Culling [AHAM15]. Figure: triangle meshes.
  • 24. Masked Occlusion Culling [AHAM15]: originally designed for graphics hardware; directly updates the HiZ buffer without computing a full-resolution z-buffer; decouples coverage sampling (rasterization) from depth computation. Figure labels: approximate, conservative HiZ buffer; depth buffer.
  • 25. Masked Software Occlusion Culling: could Masked Occlusion Culling [AHAM15] be really fast for software occlusion culling? Much less memory to read and write than a full-resolution z-buffer; updates use bitmasks, so many pixels can be processed in parallel using SSE/AVX; no need to compute per-pixel depths, though a fast SW rasterizer would be needed to compute coverage. Turns out it can. Paper presented at High Performance Graphics this year [HAAM16]. Source code available!
  • 26. Single Instruction, Multiple Data (SIMD). x86 scalar add (32 bits): A = 3, B = 5, A + B = 8. AVX add (256 bits, eight 32-bit lanes): A = (3, 5, 6, 2, 4, 1, 4, 10), B = (5, 7, 3, 5, 5, 11, 4, 5), A + B = (8, 12, 9, 7, 9, 12, 8, 15).
  • 27. Single Instruction, Multiple Data (SIMD). x86 bitwise AND (32 bits): A = 0xAC1DBA5E, B = 0x51CAFE37, A & B = 0x0008BA16. AVX bitwise AND (256 bits): A = 0xAC1DBA5E repeated four times followed by 0x51CAFE37 repeated four times, B = 0x51CAFE37 repeated four times followed by 0xAC1DBA5E repeated four times, A & B = 0x0008BA16 in all eight 32-bit lanes.
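
The lane-wise behaviour shown on slides 26 and 27 can be modelled without any intrinsics. The sketch below is a portable stand-in, with `Lanes8`, `addLanes` and `andLanes` as hypothetical names: an array of eight 32-bit values plays the role of one 256-bit register. On AVX2 hardware the two loops would typically correspond to single instructions (the `_mm256_add_epi32` and `_mm256_and_si256` intrinsics).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight 32-bit lanes model one 256-bit register (portable stand-in, no intrinsics).
using Lanes8 = std::array<std::uint32_t, 8>;

// Lane-wise addition: each lane is added independently of its neighbours.
Lanes8 addLanes(const Lanes8& a, const Lanes8& b) {
    Lanes8 result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = a[i] + b[i];
    }
    return result;
}

// Lane-wise bitwise AND: 256 bits are combined in one logical operation.
Lanes8 andLanes(const Lanes8& a, const Lanes8& b) {
    Lanes8 result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = a[i] & b[i];
    }
    return result;
}
```

Feeding in the slide values, `addLanes` on (3, 5, 6, 2, 4, 1, 4, 10) and (5, 7, 3, 5, 5, 11, 4, 5) gives (8, 12, 9, 7, 9, 12, 8, 15), and `andLanes` on lanes of 0xAC1DBA5E and 0x51CAFE37 gives 0x0008BA16 in every lane.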
  • 28. An abridged history of Intel's SIMD instruction sets (timeline from 1998 to 2017): SSE, 1999 (128b wide); SSE2, 2001; SSE4, 2006 (Intel® microarchitecture code name Nehalem); AVX, 2011 (256b wide, 2nd Gen Intel® Core™ Processors); AVX2, 2013 (4th Gen Intel® Core™ Processors); AVX-512, 2016 (512b wide). Figure annotations: "New algorithm target architecture", "Supported in our library code", "Easily extended to AVX-512".
  • 30. Algorithm Overview: Pipeline stages: Transform, Clip, Compute bounds, Depth plane, Traversal setup, Compute coverage, Depth test, Update. Triangle setup is 8-wide. Tile traversal processes 8 scanlines at a time, i.e. 256 pixels (8 tiles of 8x4 pixels).
  • 31. Transform and Clip
  • 32. Compute Bounding Box: padded to 32x8 pixel supertiles.
  • 33. Compute Depth Plane: Depth = ax + by + c. Conservative tile depth: check the signs of a and b to decide which tile corner gives the extreme depth (figure: the four sign cases -,-; +,-; -,+; +,+), and clamp to the vertex depths. The depth can be incrementally updated while traversing (figure: + a, + b).
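The sign check can be written out as follows. This is a minimal sketch under stated assumptions: `DepthPlane` and `tileMaxDepth` are illustrative names, larger depth means farther away, and the tile is given by its lower corner and size in pixels.

```cpp
#include <algorithm>
#include <cassert>

struct DepthPlane { float a, b, c; };  // depth(x, y) = a * x + b * y + c

// Conservative (maximum) depth of the plane over the tile
// [x0, x0 + w] x [y0, y0 + h], clamped to the triangle's largest vertex depth.
float tileMaxDepth(const DepthPlane& p, float x0, float y0,
                   float w, float h, float triZMax) {
    assert(w >= 0.0f && h >= 0.0f);
    // The sign of each coefficient tells which tile corner maximizes the plane.
    const float x = (p.a > 0.0f) ? x0 + w : x0;
    const float y = (p.b > 0.0f) ? y0 + h : y0;
    const float corner = p.a * x + p.b * y + p.c;
    // The plane extends beyond the triangle, so never exceed the vertex depths.
    return std::min(corner, triZMax);
}
```

Stepping to the next tile in x changes the result by a times the tile width, which is one way to read the "incrementally updated" remark.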
  • 34. Supertile Traversal Order
  • 35. AVX Register Layout
  • 36. AVX Register Layout: one scanline per SIMD lane.
  • 37. Edge Slopes: compute the slopes (∆y/∆x) once, similar to regular scanline rasterizers.
  • 38. Compute Intersections: compute the edge intersection for each scanline, eight scanlines in parallel using AVX.
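A scalar sketch of the per-scanline intersection step is shown below. It assumes the edge is stepped by its x change per scanline (dx/dy), computed once per edge; the function name and the array-of-eight return type are illustrative, with each array element standing in for one SIMD lane.

```cpp
#include <array>
#include <cassert>

// x-coordinate where the edge through (x0, y0) and (x1, y1) crosses each of
// eight consecutive scanlines, starting at scanline yStart.
std::array<float, 8> edgeIntersections(float x0, float y0, float x1, float y1,
                                       float yStart) {
    assert(y0 != y1);  // a horizontal edge never crosses a scanline
    const float dxdy = (x1 - x0) / (y1 - y0);  // computed once per edge
    std::array<float, 8> xs{};
    for (int lane = 0; lane < 8; ++lane) {
        // One lane per scanline; AVX would evaluate all eight lanes at once.
        xs[lane] = x0 + dxdy * ((yStart + static_cast<float>(lane)) - y0);
    }
    return xs;
}
```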
  • 39. Compute Coverage Mask: start with a full coverage mask.
  • 40. Compute Coverage Mask: start with a full coverage mask, then shift each lane (scanline) to its intersection. AVX2 and later have a per-lane shift instruction.
  • 41. Compute Coverage Mask: repeat the same process for the next edge. (Figure labels: left edge, right edge.)
  • 42. Compute Coverage Mask: repeat the same process for the next edge. If the edge is facing right, invert the mask.
  • 43. Compute Coverage Mask: combine the masks of all overlapping edges.
  • 44. Compute Coverage Mask: combine the masks of all overlapping edges using bitwise AND.
  • 45. Compute Coverage Mask: combine the masks of all overlapping edges using bitwise AND.
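The shift, invert and AND steps can be sketched for a single 32-pixel scanline as follows. The bit ordering (bit i is the i-th pixel from the left) and the function names are assumptions made for this example. On AVX2 the shift would be done for all eight scanlines at once with a per-lane variable shift such as `_mm256_sllv_epi32`.

```cpp
#include <algorithm>
#include <cstdint>

// Full mask shifted to pixel x: bits x..31 stay set, bits below x are cleared.
// The shift is done in 64 bits so that x == 32 is well defined and yields 0.
uint32_t pixelsFrom(int x) {
    const int s = std::max(0, std::min(x, 32));
    return static_cast<uint32_t>(0xFFFFFFFFull << s);
}

// Coverage of one scanline between a left edge and a right edge, given the
// pixel positions where each edge intersects the scanline.
uint32_t scanlineCoverage(int leftX, int rightX) {
    const uint32_t insideLeft  = pixelsFrom(leftX);    // pixels right of the left edge
    const uint32_t insideRight = ~pixelsFrom(rightX);  // edge faces right: invert the mask
    return insideLeft & insideRight;                   // combine the edges with bitwise AND
}
```

A third overlapping edge would be handled the same way, with its mask ANDed into the result.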
  • 46. Shuffle Mask: shuffle the mask to form better shaped tiles. Before: each SIMD lane is a scanline.
  • 47. Shuffle Mask: shuffle the mask to form better shaped tiles. Before: each SIMD lane is a scanline. After: each SIMD lane is an 8x4 tile.
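The effect of the shuffle can be illustrated with a scalar byte transpose over one 32x8 supertile. The tile numbering and the bit order inside a tile are assumptions chosen for this sketch, and a SIMD implementation would presumably use byte shuffle instructions rather than loops.

```cpp
#include <array>
#include <cstdint>

// Input:  eight scanlines of a 32x8 supertile; bit i of lane s is pixel x = i
//         on scanline s.
// Output: eight 8x4 tiles; tile index = tileRow * 4 + tileCol, and bit
//         (y * 8 + x) of a tile is pixel (x, y) inside that tile.
std::array<uint32_t, 8> scanlinesToTiles(const std::array<uint32_t, 8>& scan) {
    std::array<uint32_t, 8> tiles{};
    for (int tileRow = 0; tileRow < 2; ++tileRow) {
        for (int tileCol = 0; tileCol < 4; ++tileCol) {
            uint32_t tile = 0;
            for (int y = 0; y < 4; ++y) {
                // Each byte of a scanline covers 8 pixels, i.e. one tile column.
                const uint32_t byte = (scan[tileRow * 4 + y] >> (tileCol * 8)) & 0xFFu;
                tile |= byte << (y * 8);
            }
            tiles[tileRow * 4 + tileCol] = tile;
        }
    }
    return tiles;
}
```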
  • 48. Depth Test: interpolate the conservative depth (per 8x4 tile) and test it against the buffer.
  • 49. Update Tile: Two code paths (can be switched at compile time): the original update method [AHAM15] and a new update method tailored for software [HAAM16]. Why use a new update method? It is faster, with the same culling power. It is less accurate than the original and more dependent on render order, so it works best if you render front-to-back.
  • 50. Update Tile, New Method [HAAM16]: zmax_0 is the reference layer: the maximum value for the entire tile. zmax_1 is the working layer: the maximum value for a subset of the tile. The working layer is updated as: new depth = max(zmax_1, zmax_tri); new mask = TriangleMask OR LayerMask. Whenever the working layer mask is full, overwrite the reference layer.
  • 56. Update Tile: Discard heuristic: if zmax_1 - zmax_tri > zmax_0 - zmax_1, discard the working layer and restart it.
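Putting the update rule and the discard heuristic together gives roughly the following. This is a sketch under stated assumptions, not the library's code: `Tile`, `updateTile` and `isOccluded` are illustrative names, larger depth means farther away, an empty working layer is stored as zMax1 = 0 with an empty mask, and the incoming triangle is assumed to have already passed the depth test against the reference layer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Tile {
    float    zMax0;  // reference layer: max depth for the entire tile
    float    zMax1;  // working layer: max depth for the pixels in mask
    uint32_t mask;   // pixels currently covered by the working layer
};

// Returns the tile after merging a triangle with conservative depth triZMax
// and coverage triMask.
Tile updateTile(Tile t, float triZMax, uint32_t triMask) {
    assert(triZMax <= t.zMax0);  // assumption: triangle passed the depth test
    // Discard heuristic: the triangle is much closer than the working layer.
    if (t.zMax1 - triZMax > t.zMax0 - t.zMax1) {
        t.zMax1 = 0.0f;
        t.mask = 0u;
    }
    t.zMax1 = std::max(t.zMax1, triZMax);  // new depth
    t.mask |= triMask;                     // new mask
    // Working layer covers the whole tile: it becomes the reference layer.
    if (t.mask == 0xFFFFFFFFu) {
        t.zMax0 = t.zMax1;
        t.zMax1 = 0.0f;
        t.mask = 0u;
    }
    return t;
}

// A query is occluded in this tile if its nearest depth lies behind the
// reference layer; only zMax0 has to be read.
bool isOccluded(const Tile& t, float queryZMin) {
    return queryZMin > t.zMax0;
}
```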
  • 60. Update Tile: The update is quicker than the original [AHAM15]. The test is also quicker: it only needs to test against the reference layer (zmax_0).
  • 62. Results, Intel Occlusion Culling Sample: timings (μs) for Old [CMK16] versus New [HAAM16], broken down as: Clear: clearing the depth buffer. Geom: transform and project geometry. Rast: triangle setup and occluder rasterization. Gen: compute the HiZ buffer from the full resolution z-buffer. Test: perform occlusion queries. (Figure annotations: 3.7x, 16x.)
  • 63. Results: performance comparison for a camera animation, from first frame to last frame. (Figure series: Frustum only, Old, New.)
  • 64. Code is available as open-source
  • 65. Masked Occlusion Culling API (figure groups: Setup, Render & query, Debug):
      void SetResolution();
      void SetNearClipPlane();
      void ClearBuffer();
      static void TransformVertices();
      Result RenderTriangles();
      Result TestTriangles();
      Result TestRect();
      void ComputePixelDepthBuffer();
      OcclusionCullingStatistics GetStatistics();
  • 66. Masked Occlusion Culling API: Render to the software HiZ buffer.
      Result RenderTriangles(
          float *inVtx,          // Clip space vertex positions
          uint *inTris,          // Index array (indices into the inVtx buffer)
          int nTris,             // Triangle count (the number of index triplets in inTris)
          ClipPlanes mask,       // Mask for potential frustum bound overlap
          ScissorRect *scissor,  // Scissor region
          VertexLayout &layout   // Vertex format of inVtx. There is a fast path for AoS with (x, y, z, w) coordinates
      );
  • 67. Masked Occlusion Culling API, RenderTriangles clip plane mask: Clipping is not free. If you're already doing frustum culling, let the API know the outcome. (Figure: eye, view frustum and near plane, with example objects labelled mask = 0 and mask = leftPlane | nearPlane.)
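One way to derive such a mask from a frustum test is sketched below. The flag names and values are hypothetical and are not the library's `ClipPlanes` definition; the clip-space conventions (x and y bounded by ±w, near plane expressed as a minimum w) are also assumptions of this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags, one per frustum plane.
enum ClipPlaneBits : uint32_t {
    kClipNone   = 0,
    kClipNear   = 1u << 0,
    kClipLeft   = 1u << 1,
    kClipRight  = 1u << 2,
    kClipBottom = 1u << 3,
    kClipTop    = 1u << 4
};

struct ClipVertex { float x, y, z, w; };

// Planes that a single clip-space vertex lies outside of.
uint32_t planesCrossedBy(const ClipVertex& v, float nearW) {
    assert(nearW > 0.0f);
    uint32_t mask = kClipNone;
    if (v.w < nearW) mask |= kClipNear;
    if (v.x < -v.w)  mask |= kClipLeft;
    if (v.x > v.w)   mask |= kClipRight;
    if (v.y < -v.w)  mask |= kClipBottom;
    if (v.y > v.w)   mask |= kClipTop;
    return mask;
}

// Mask for an object: any plane crossed by at least one bounding-box corner.
uint32_t clipMaskForBox(const ClipVertex* corners, int count, float nearW) {
    uint32_t mask = kClipNone;
    for (int i = 0; i < count; ++i) mask |= planesCrossedBy(corners[i], nearW);
    return mask;
}
```

An object whose corners all lie inside the frustum ends up with a mask of 0, so the rasterizer can skip clipping for it entirely.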
  • 68. Masked Occlusion Culling API, RenderTriangles scissor region (screen space AABB): can be used for threading, with one scissor region per thread. (Figure: eye, view frustum and scissor region.)
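A simple way to produce one scissor region per thread is to split the screen into horizontal bands. The sketch below uses a hypothetical `Rect` type rather than the library's `ScissorRect`, and it aligns band boundaries to 8-pixel rows on the assumption that matching the 32x8 supertile height is desirable.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Rect { int minX, minY, maxX, maxY; };  // hypothetical screen space AABB

// Splits a width x height screen into at most numThreads horizontal bands
// whose boundaries fall on multiples of 8 pixels.
std::vector<Rect> makeScissorBands(int width, int height, int numThreads) {
    assert(width > 0 && height > 0 && numThreads > 0);
    const int rows = (height + 7) / 8;  // number of 8-pixel rows
    const int rowsPerBand = (rows + numThreads - 1) / numThreads;
    std::vector<Rect> bands;
    for (int r = 0; r < rows; r += rowsPerBand) {
        // The last band is clipped to the screen height.
        bands.push_back({0, r * 8, width, std::min((r + rowsPerBand) * 8, height)});
    }
    return bands;
}
```

Each thread would then render the same occluders with its own band as the scissor region, so no two threads write to the same tiles. Because of the rounding, fewer bands than threads can be produced for very small buffers.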
  • 69. Masked Occlusion Culling API: Test triangles against the software HiZ buffer. Does not update the buffer.
      Result TestTriangles(      // Returns the collective culling outcome of the triangles
          float *inVtx,          // Clip space vertex positions
          uint *inTris,          // Index array (indices into the inVtx buffer)
          int nTris,             // Triangle count (the number of index triplets in inTris)
          ClipPlanes mask,       // Mask for potential frustum bound overlap
          ScissorRect *scissor,  // Scissor region
          VertexLayout &layout   // Vertex format of inVtx. There is a fast path for AoS with (x, y, z, w) coordinates
      );
  • 70. Masked Occlusion Culling API: Test a rectangle against the software HiZ buffer. Does not update the buffer.
      Result TestRect(           // Returns the culling outcome of the screen space rectangle
          float xmin, float ymin, float xmax, float ymax,  // Screen space bounds: [xmin, ymin] to [xmax, ymax]
          float wmin             // Conservative clip space w (typically the w-component of the nearest bbox vertex in clip space)
      );
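The arguments for a rectangle query could be derived from the clip-space corners of a bounding box roughly as follows. This is a sketch with made-up names (`RectQuery`, `rectQueryFromCorners`); it assumes the bounds are expressed as x/w and y/w after the perspective divide, and it gives up (marks the query invalid, to be treated as visible) when any corner is on or behind the camera plane.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct ClipVertex { float x, y, z, w; };

struct RectQuery {
    float xmin, ymin, xmax, ymax;  // projected bounds
    float wmin;                    // conservative (smallest) clip space w
    bool  valid;                   // false: do not cull based on this query
};

RectQuery rectQueryFromCorners(const std::vector<ClipVertex>& corners) {
    const float inf = std::numeric_limits<float>::infinity();
    RectQuery q{inf, inf, -inf, -inf, inf, !corners.empty()};
    for (const ClipVertex& v : corners) {
        if (v.w <= 0.0f) {  // the projection is not meaningful for this corner
            q.valid = false;
            return q;
        }
        q.xmin = std::min(q.xmin, v.x / v.w);
        q.xmax = std::max(q.xmax, v.x / v.w);
        q.ymin = std::min(q.ymin, v.y / v.w);
        q.ymax = std::max(q.ymax, v.y / v.w);
        q.wmin = std::min(q.wmin, v.w);  // nearest corner gives the conservative w
    }
    return q;
}
```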
  • 71. Example use case: scene Bounding Volume Hierarchy (BVH) traversal and culling (pseudocode; figure labels: RenderFrame, Culled!)
      ClearBuffer();
      prioQueue.push(root);
      while (!prioQueue.empty()) {
          Node node = prioQueue.pop();                 // nearest node first
          if (FrustumTest(node) == Culled) continue;
          bounds = compute_screen_space_bounds(node);
          if (TestRect(bounds) == Culled) continue;    // occluded: skip the whole subtree
          if (node is InnerNode) {
              prioQueue.push(node.left, dist);
              prioQueue.push(node.right, dist);
          } else {                                     // node is Leaf
              xfVertices = TransformVertices(node.vertices);
              RenderTriangles(xfVertices);             // the leaf becomes an occluder
              send_leaf_to_GPU(node);
          }
      }
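The traversal order of the use case above can be made concrete with `std::priority_queue`. The sketch below is self-contained and deliberately leaves the culling tests abstract: `isCulled` is a caller-supplied callback standing in for the frustum test and the rectangle query, and the `Node` layout is an assumption made for the example.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Node {
    float dist;   // distance from the camera, used as traversal priority
    int   left;   // child indices into the node array, -1 for a leaf
    int   right;
};

// Visits the BVH nearest-first and returns the leaves that survive culling,
// in the order they would be rendered as occluders and sent to the GPU.
std::vector<int> traverseFrontToBack(const std::vector<Node>& nodes,
                                     const std::function<bool(int)>& isCulled) {
    using Entry = std::pair<float, int>;  // (distance, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;
    std::vector<int> rendered;
    if (nodes.empty()) return rendered;
    queue.push({nodes[0].dist, 0});  // node 0 is the root
    while (!queue.empty()) {
        const int id = queue.top().second;
        queue.pop();
        if (isCulled(id)) continue;  // skips the node and its whole subtree
        const Node& n = nodes[id];
        if (n.left >= 0 && n.right >= 0) {
            queue.push({nodes[n.left].dist, n.left});
            queue.push({nodes[n.right].dist, n.right});
        } else {
            rendered.push_back(id);  // leaf: render occluder, submit draw call
        }
    }
    return rendered;
}
```

Processing near nodes first matters here because each rendered leaf can occlude nodes popped later, which fits the earlier remark that the new update method works best with front-to-back rendering.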
  • 72. Essential Tools We Have Relied On: Intel® VTune™ (https://software.intel.com/en-us/intel-vtune-amplifier-xe) and the SSE/AVX intrinsics guide (https://software.intel.com/sites/landingpage/IntrinsicsGuide/).
  • 73. References
      [AHAM15] ANDERSSON M., HASSELGREN J., AKENINE-MÖLLER T.: Masked Depth Culling for Graphics Hardware. ACM Transactions on Graphics 34, 6 (2015), pp. 188:1–188:9
      [CMK16] CHANDRASEKARAN C., MCNABB D., KUAH K., FAUCONNEAU M., GIESEN F.: Software Occlusion Culling. Published online at: https://software.intel.com/en-us/articles/software-occlusion-culling, (2013–2016)
      [Col11] COLLIN D.: Culling the Battlefield. Game Developer's Conference (presentation), (2011)
      [Greene93] GREENE N., KASS M., MILLER G.: Hierarchical Z-Buffer Visibility. In Proceedings of SIGGRAPH, (1993), pp. 231–238
      [HA15] HAAR U., AALTONEN S.: GPU-Driven Rendering Pipelines. SIGGRAPH Advances in Real-Time Rendering in Games course, (2015)
      [HAAM16] HASSELGREN J., ANDERSSON M., AKENINE-MÖLLER T.: Masked Software Occlusion Culling. High Performance Graphics, (2016)
  • 74. Check it out!
      GitHub, lightweight library: https://github.com/GameTechDev/MaskedOcclusionCulling
      GitHub, example integrated in Intel's Software Occlusion Culling demo: https://github.com/GameTechDev/OcclusionCulling
      Project page, Masked Software Occlusion Culling: https://software.intel.com/en-us/articles/masked-software-occlusion-culling
      Questions and feedback welcome: magnus.andersson@intel.com
  • 75. Legal Notices and Disclaimers Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. 
Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K. All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. © 2016 Intel Corporation. Intel, the Intel logo, VTune and others are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.