SlideShare a Scribd company logo
1 of 56
Download to read offline
v4




Johan Andersson, Rendering Architect, DICE
Overview

Feature:                                   Performance:
  ›   Deferred Shading                      ›   Instancing
  ›   Compute Shader Tile-Based Lighting    ›   Parallel dispatch
  ›   Terrain Displacement Mapping          ›   Multi-GPU
  ›   Direct Stereo 3D rendering            ›   Resource Streaming


Quality:
  ›   Antialiasing: MSAA                   Conclusions
  ›   Antialiasing: FXAA
  ›   Antialiasing: SRAA                   Q&A
  ›   Transparency Supersampling
› FPS
› Fall 2011
› DX11, Xbox 360 and PS3
› Frostbite 2 engine
› Developed for Battlefield 3 and future DICE/EA games
› Massive focus on creating simple to use & powerful workflows
› Major pushes in animation, destruction, streaming, rendering,
  lighting and landscapes
DX11 API only
 › Requires a DX10 or DX11 GPUs
 › Requires Vista SP1 or Windows 7
 › No Windows XP!
Why?
 › CPU performance win
 › GPU performance & quality win
 › Ease of development - no legacy
 › Future proof
BF3 is a big title - will drive OS & HW adoption
 › Which is good for your game as well! 
Switched to Deferred Shading in FB2
 › Rich mix of Outdoor + Indoor + Urban environments in BF3
 › Wanted lots more light sources
Why not Forward Rendering?
 › Light culling / shader permutations not efficient for us
 › Expensive & more difficult decaling / destruction masking
Why not Light Pre-pass?
 › 2x geometry pass too expensive on both CPU & GPU for us
 › Was able to generalize our BRDF enough to just a few variations
 › Saw major potential in full tile-based deferred shading
See also:
 › Nicolas Thibieroz’s talk ”Deferred Shading Optimizations”
Weaknesses with traditional deferred lighting/shading:

 › Massive overdraw & ROP cost when having lots of big light sources
 › Expensive to have multiple per-pixel materials in light shaders
 › MSAA lighting can be slow (non-coherent, extra BW)
1. Divide screen into tiles and determine
which lights affects which tiles

2. Only apply the visible light sources on
pixels
 › Custom shader with multiple lights
 › Reduced bandwidth & setup cost

How can we do this best in DX11?
Tile-based Deferred Shading using Compute Shaders


Primarily for analytical light sources
 › Point lights, cone lights, line lights
 › No shadows
 › Requires Compute Shader 5.0

Hybrid Graphics/Compute shading pipeline:
 › Graphics pipeline rasterizes gbuffers for opaque surfaces
 › Compute pipeline uses gbuffers, culls lights, computes lighting &
     combines with shading
 ›   Graphics pipeline renders transparent surfaces on top
1 thread per pixel, 16x16 thread groups (aka tile)

  Input: gbuffers, depth buffer & list of lights
  Output: fully composited & lit HDR texture
                                                         Normal         Roughness
Texture2D<float4>   gbufferTexture0 : register(t0);
Texture2D<float4>   gbufferTexture1 : register(t1);
Texture2D<float4>   gbufferTexture2 : register(t2);
Texture2D<float4>   depthTexture : register(t3);

RWTexture2D<float4> outputTexture : register(u0);

#define BLOCK_SIZE 16
[numthreads(BLOCK_SIZE,BLOCK_SIZE,1)]
void csMain(                                          Diffuse Albedo   Specular Albedo
    uint3 groupId : SV_GroupID,
    uint3 groupThreadId : SV_GroupThreadID,
    uint groupIndex: SV_GroupIndex,
    uint3 dispatchThreadId : SV_DispatchThreadID)
{
    ...
}
groupshared uint minDepthInt;
1. Load gbuffers & depth                              groupshared uint maxDepthInt;

                                                      // --- globals above, function below -------

2. Calculate min & max z in threadgroup / tile        float depth =
 › Using InterlockedMin/Max on groupshared variable       depthTexture.Load(uint3(texCoord, 0)).r;

 › Atomics only work on ints 
                                                      uint depthInt = asuint(depth);

 › Can cast float to int (z is always +)              minDepthInt = 0xFFFFFFFF;
                                                      maxDepthInt = 0;
                                                      GroupMemoryBarrierWithGroupSync();

                                                      InterlockedMin(minDepthInt, depthInt);
                                                      InterlockedMax(maxDepthInt, depthInt);

                                                      GroupMemoryBarrierWithGroupSync();
                                                      float minGroupDepth = asfloat(minDepthInt);
                                                      float maxGroupDepth = asfloat(maxDepthInt);
Determine visible light sources for each tile                 Per-tile visible light count
    › Cull all light sources against tile frustum            (black = 0 lights, white = 40)

 Input (global):
    › Light list, frustum & SW occlusion culled
 Output (per tile):
    › # of visible light sources
    › Index list of visible light sources
                       Lights         Indices
Global list            1000+          0 1 2 3 4 5 6 7 8 ..
Tile visible list      ~0-40+         0 2 5 6 8 ..
struct Light {
                                                           float3 pos; float sqrRadius;
                                                           float3 color; float invSqrRadius;
                                                       };
                                                       int lightCount;
                                                       StructuredBuffer<Light> lights;
3a. Each thread switches to process lights
instead of pixels                                      groupshared uint visibleLightCount = 0;

 › Wow, parallelism switcharoo!                        groupshared uint visibleLightIndices[1024];

 › 256 light sources in parallel                       // --- globals above, cont. function below ---
 › Multiple iterations for >256 lights                 uint threadCount = BLOCK_SIZE*BLOCK_SIZE;
                                                       uint passCount = (lightCount+threadCount-1) / threadCount;
3b. Intersect light and tile
 › Multiple variants – accuracy vs perf                for (uint passIt = 0; passIt < passCount; ++passIt)

 › Tile min & max z is used as a ”depth bounds” test   {
                                                           uint lightIndex = passIt*threadCount + groupIndex;

3c. Append visible light indices to list                   // prevent overrun by clamping to a last ”null” light
 › Atomic add to threadgroup shared memory                 lightIndex = min(lightIndex, lightCount);

 › ”inlined stream compaction”                             if (intersects(lights[lightIndex], tile))
                                                           {
3d. Switch back to processing pixels                           uint offset;

 › Synchronize the thread group                                InterlockedAdd(visibleLightCount, 1, offset);

 › We now know which light sources affect the tile
                                                               visibleLightIndices[offset] = lightIndex;
                                                           }
                                                       }

                                                       GroupMemoryBarrierWithGroupSync();
4. For each pixel, accumulate lighting from visible lights
    › Read from tile visible light index list in groupshared memory
                                                                         Computed lighting
5. Combine lighting & shading albedos
    › Output is non-MSAA HDR texture
    › Render transparent surfaces on top
float3 color = 0;

for (uint lightIt = 0; lightIt < visibleLightCount; ++lightIt)
{
    uint lightIndex = visibleLightIndices[lightIt];
    Light light = lights[lightIndex];

      color += diffuseAlbedo * evaluateLightDiffuse(light, gbuffer);
      color += specularAlbedo * evaluateLightSpecular(light, gbuffer);
}
Massive lighting test scene - 1000 large point lights
Only edge pixels need full per-sample lighting
 › But edges have bad screen-space coherency! Inefficient
Compute Shader can build efficient coherent pixel list
 › Evaluate lighting for each pixel (sample 0)
 › Determine if pixel requires per-sample lighting
 › If so, add to atomic list in shared memory
 › When all pixels are done, synchronize
 › Go through and light sample 1-3 for pixels in list
Major performance improvement!
 › Described in detail in [Lauritzen10]
Battlefield 3 terrains
 › Huge area & massive view distances
 › Dynamic destructible heightfields
 › Procedural virtual texturing
 › Streamed heightfields, colormaps, masks
 › Full details at at a later conference
We stream in source heightfield data at close to pixel ratio
 › Derive high-quality per-pixel normals in shader
How can we increase detail even further on DX11?
 › Create better silhouettes and improved depth
 › Keep small-scale detail (dynamic craters & slopes)
Normal mapped terrain
Displacement mapped terrain
Straight high-res heightfields, no procedural detail
 › Lots of data in the heightfields
 › Pragmatic & simple choice
 › No changes in physics/collisions
 › No content changes, artists see the true detail they created
Uses DX11 fixed edge tessellation factors
 › Stable, no swimming vertices
 › Though can be wasteful
 › Height morphing for streaming by fetching 2 heightfields in domain
   shader & blend based on patch CLOD factor

More work left to figure optimal tessellation scheme for our
use case
Nvidia’s 3D Vision drivers is a good and almost automatic
stereo 3D rendering method
 › But only for forward rendering, doesn’t work with deferred shading
 › Or on AMD or Intel GPUs
 › Transparent surfaces do not get proper 3D depth

We instead use explicit 3D stereo rendering
 › Render unique frame for each eye
 › Works with deferred shading & includes all surfaces
 › Higher performance requirements, 2x draw calls
Works with Nvidia’s 3D Vision and AMD’s HD3D
 › Similar to OpenGL quad buffer support
 › Ask your friendly IHV contact how
Draw calls can still be major performance bottleneck
 › Lots of materials / lots of variation
 › Complex shadowmaps
 › High detail / long view distances
 › Full 3D stereo rendering
Battlefield have lots of use cases for heavy instancing      Richard Huddy:
 › Props, foliage, debris, destruction, mesh particles     ”Batch batch batch!”




Batching submissions is still important, just as before!
DX9-style stream instancing is good, but restrictive
 › Extra vertex attributes, GPU overhead
 › Can’t be (efficiently) combined with skinning
 › Used primarily for tiny meshes (particles, foliage)
DX10/DX11 brings support for shader Buffer objects
 › Vertex shaders have access to SV_InstanceID
 › Can do completely arbitrary loads, not limited to fixed elements
 › Can support per-instance arrays and other data structures!

Let’s rethink how instancing can be implemented..
Multiple object types
 › Rigid / skinned / composite meshes
Multiple object lighting types
 › Small/dynamic: light probes
 › Large/static: light maps
Different types of instancing data we have
 › Transform                     float4x3
 › Skinning transforms           float4x3 array
 › SH light probe                float4 x 4
 › Lightmap UV scale/offset      float4

Let’s pack all instancing data into single big buffer!
Buffer<float4> instanceVectorBuffer : register(t0);
cbuffer a
{
    float g_startVector;
    float g_vectorsPerInstance;
}
VsOutput main(
    // ....
    uint instanceId : SV_InstanceId)
{
    uint worldMatrixVectorOffset = g_startVector + input.instanceId * g_vectorsPerInstance + 0;
    uint probeVectorOffset       = g_startVector + input.instanceId * g_vectorsPerInstance + 3;
    float4 r0 = instanceVectorBuffer.Load(worldMatrixVectorOffset + 0);
    float4 r1 = instanceVectorBuffer.Load(worldMatrixVectorOffset + 1);
    float4 r2 = instanceVectorBuffer.Load(worldMatrixVectorOffset + 2);
    float4   lightProbeShR   =   instanceVectorBuffer.Load(probeVectorOffset   +   0);
    float4   lightProbeShG   =   instanceVectorBuffer.Load(probeVectorOffset   +   1);
    float4   lightProbeShB   =   instanceVectorBuffer.Load(probeVectorOffset   +   2);
    float4   lightProbeShO   =   instanceVectorBuffer.Load(probeVectorOffset   +   3);
    // ....
}
half4 weights = input.boneWeights;
int4 indices = (int4)input.boneIndices;

float4 skinnedPos   =    mul(float4(pos,1),   getSkinningMatrix(indices[0])).xyz   *   weights[0];
       skinnedPos   +=   mul(float4(pos,1),   getSkinningMatrix(indices[1])).xyz   *   weights[1];
       skinnedPos   +=   mul(float4(pos,1),   getSkinningMatrix(indices[2])).xyz   *   weights[2];
       skinnedPos   +=   mul(float4(pos,1),   getSkinningMatrix(indices[3])).xyz   *   weights[3];

// ...

float4x3 getSkinningMatrix(uint boneIndex)
{
    uint vectorOffset = g_startVector + instanceId * g_vectorsPerInstance;
    vectorOffset += boneIndex*3;
    float4 r0 = instanceVectorBuffer.Load(vectorOffset + 0);
    float4 r1 = instanceVectorBuffer.Load(vectorOffset + 1);
    float4 r2 = instanceVectorBuffer.Load(vectorOffset + 2);
    return createMat4x3(r0, r1, r2);
}
Single draw call per object type instead of per instance
 › Minor GPU hit for big CPU gain
Instancing does not break when skinning parts
 › More deterministic & better overall performance
End result is typically 1500-2000 draw calls
 › Regardless of how many object instances the artists place!
 › Instead of 3000-7000 draw calls in some heavy cases
Great key DX11 feature!
 › Improve performance by scaling dispatching to D3D to more cores
 › Reduce frame latency
How we use it:
 › DX11 deferred context per HW thread
 › Renderer builds list of all draw calls we want to do for each
   rendering ”layer” of the frame
 › Split draw calls for each layer into chunks of ~256
 › Dispatch chunks in parallel to the deferred contexts to generate
   command lists
 › Render to immediate context & execute command lists
 › Profit! *

* but theory != practice
Still no performant drivers available for our use case 
 › Have waited for 2 years and still are
 › Big driver codebases takes time to refactor                 quagmire [ˈkwægˌmaɪə ˈkwɒg-]n

 › IHVs vs Microsoft quagmire
                                                               1. (Earth Sciences / Physical Geography) a soft
                                                               wet area of land that gives way under the feet; bog

 › Heavy driver threads collide with game threads              2. an awkward, complex, or embarrassing
                                                               situation


How it should work (an utopia?)
 › Driver does not create any processing threads of its own
 › Game submits workload in parallel to multiple deferred contexts
 › Driver make sure almost all processing required happens on the
     draw call on the deferred context
 ›   Game dispatches command list on immediate context, driver does
     absolute minimal work with it



Still good to design engine for + instancing is great!
Even with modern GPUs with lots of memory,
resource streaming is often required
 › Can’t require 1+ GB graphics cards
 › BF3 levels have much more than 1 GB of textures &
     meshes
 ›   Reduced load times

But creating & destroying DX resources in-frame
has never been a good thing
 › Can cause non-deterministic & large driver / OS stalls 
 › Has been a problem for a very long time in DX
 › About time to fix it
Have worked with Microsoft, Nvidia &            Resource creation flow:
AMD to make sure we can do stall                 ›   Streaming system determines resources to
                                                     load (texture mipmaps or mesh LODs)
free async resource streaming of GPU             ›   Add up DX resource creation on to queue on
resources in DX11                                    our own separate low-priority thread
                                                 ›   Thread creates resources using initial data,
                                                     signals streaming system
 ›   Want neither CPU nor GPU perf hit           ›   Resource created, game starts using it

 ›   Key foundation: DX11 concurrent creates
                                                Enables async stall-free DMA in drivers!
 D3D11_FEATURE_DATA_THREADING threadingCaps;

 FB_SAFE_DX(m_device->CheckFeatureSupport(      Resource destruction flow:
     D3D11_FEATURE_THREADING,                    ›   Streaming system deletes D3D resource
     &threadingCaps, sizeof(threadingCaps)));    ›   Driver keeps it internally alive until GPU frames
                                                     using it are done. NO STALL!
 if (threadingCaps.DriverConcurrentCreates)
Efficiently supporting Crossfire and SLI is important for us
 › High-end consumers expect it
 › IHVs expect it (and can help!)
 › Allows targeting higher-end HW then currently available during dev
AFR is easy: Do not reuse GPU resources from previous frame!
 › UpdateSubResource is easy & robust to use for dynamic resources, but not ideal
All of our playtests run with exe named AFR-FriendlyD3D.exe
 › Disables all driver AFR synchronization workarounds
 › Rather find corruption during dev then have bad perf
 › ForceSingleGPU.exe is also useful to track down issues
Aliasing 


Reducing aliasing is one of our key visual priorities
 › Creates a more smooth gameplay experience
 › Extra challenging goal due to deferred shading
We use multiple methods:
 › MSAA –Multisample Antialiasing
 › FXAA – Fast Approximate Antialiasing
 › SRAA – Sub-pixel Reconstruction Antialiasing
 › TSAA – Transparency Supersampling Antialiasing
Our solution:
 › Deferred geometry pass renders with MSAA (2x, 4x or 8x)
 › Light shaders evaluate per-sample (when needed), averages the
     samples and writes out per-pixel
 ›   Transparent surfaces rendered on top without MSAA

1080p gbuffer+z with 4x MSAA is 158 MB
 › Lots of memory and lots of bandwidth 
 › Could be tiled to reduce memory usage
 › Very nice quality though 
Our (overall) highest quality option
 › But not fast enough for more GPUs
 › Need additional solution(s)..
”Fast Approximate Antialiasing”
 ›   GPU-based MLAA implementation by
     Timothy Lottes (Nvidia)
 ›   Multiple quality options
 ›   ~1.3 ms/f for 1080p on Geforce 580

Pros & cons:
 ›   Superb antialiased long edges! 
 ›   Smooth overall picture 
 ›   Reasonably fast 
 ›   Moving pictures do not benefit as much 
 ›   ”Blurry aliasing” 

Will be released here at GDC’11
 ›   Part of Nvidia’s example SDK
”Sub-pixel Reconstruction Antialiasing”              Pros:
 ›   Presented at I3D’11 2 weeks ago [Chajdas11]      ›   Better at capturing small scale detail 
 ›   Use 4x MSAA buffers to improve reconstruction    ›   Less ”pixel snapping” than MLAA variants 

Multiple variants:                                   Cons:
 ›   MSAA depth buffer                                ›   Extra MSAA z/normal/id pass can be
 ›   MSAA depth buffer + normal buffer                    prohibitive 
 ›   MSAA Primitive ID / spatial hashing buffer       ›   Integration not as easy due to extra pass 
No antialiasing
4x MSAA
4x SRAA, depth-only
4x SRAA, depth+normal
FXAA
No antialiasing
None of the AA solutions can solve all aliasing
 › Foliage & other alpha-tested surfaces are extra difficult cases
 › Undersampled geometry, requires sub-samples

DX 10.1 added SV_COVERAGE as a pixel shader output
DX 11 added SV_COVERAGE as a pixel shader input


What does this mean?
 › We get full programmable control over the coverage mask
 › No need to waste the alpha channel output (great for deferred)
 › We can do partial supersampling on alpha-tested surfaces!
static const float2 msaaOffsets[4] =
Shade per-pixel but evaluate alpha            {
                                                  float2(-0.125, -0.375),    float2(0.375, -0.125),
test per-sample                                   float2(-0.375, 0.125),     float2(0.125, 0.375)

  ›
                                              };
      Write out coverage bitmask
  ›   MSAA offsets are defined in DX 10.1
                                              void psMain(
                                                  out float4 color : SV_Target,
  ›   Requires shader permutation for each    {
                                                  out uint coverage : SV_Coverage)

      MSAA level                                  float2 texCoord_ddx = ddx(texCoord);
                                                  float2 texCoord_ddy = ddy(texCoord);
                                                  coverage = 0;
Gradients still quite limited
  ›
                                                  [unroll]
      But much better than standard MSAA!        for (int i = 0; i < 4; ++i)
  ›   Can combine with screen-space dither
                                                  {
                                                      float2 texelOffset = msaaOffsets[i].x * texCoord_ddx;
                                                      texelOffset +=       msaaOffsets[i].y * texCoord_ddy;
                                                      float4 temp = tex.SampleLevel(sampler, texCoord + texelOffset);
See also:
  ›
                                                      if (temp.a >= 0.5)
      DirectX SDK ’TransparencyAA10.1                     coverage |= 1<<i;
  ›   GDC’09 STALKER talk [Lobanchikov09]     }
                                                  }
Alpha testing   4x MSAA + Transparency Supersampling
DX11 is here – in force
 › 2011 is a great year to focus on DX11
 › 2012 will be a great year for more to drop DX9
We’ve found lots & lots of quality & performance enhancing
features using DX11
 › And so will you for your game!
 › Still have only started, lots of potential
Take advantage of the PC strengths, don’t hold it back
 › Big end-user value
 › Good preparation for Gen4
›   Christina Coffin (@ChristinaCoffin)        ›   Battlefield team
›   Mattias Widmark                            ›   Frostbite team
›   Kenny Magnusson
›   Colin Barré-Brisebois (@ZigguratVertigo)   ›   Microsoft
                                               ›   Nvidia
›   Timothy Lottes (@TimothyLottes)            ›   AMD
›   Matthäus G. Chajdas (@NIV_Anteru)          ›   Intel
›   Miguel Sainz
›   Nicolas Thibieroz
Email: repi@dice.se
                                                           Blog: http://repi.se
                                                           Twitter: @repi




Battlefield 3 & Frostbite 2 talks at GDC’11:
Mon 1:45    DX11 Rendering in Battlefield 3                    Johan Andersson
Wed 10:30   SPU-based Deferred Shading in Battlefield 3        Christina Coffin
            for PlayStation 3
Wed 3:00    Culling the Battlefield: Data Oriented Design in   Daniel Collin
            Practice
Thu 1:30    Lighting You Up in Battlefield 3                   Kenny Magnusson
Fri 4:05    Approximating Translucency for a Fast, Cheap       Colin Barré-Brisebois
            & Convincing Subsurface Scattering Look



                          For more DICE talks: http://publications.dice.se
› [Lobanchikov09] Igor A. Lobanchikov, ”GSC Game World’s
    STALKER: Clear Sky – a showcase for Direct3D 10.0/1” GDC’09.
    http://developer.amd.com/gpu_assets/01gdc09ad3ddstalkercle
    arsky210309.ppt
›   [Lauritzen10] Andrew Lauritzen, ”Deferred Rendering for Current
    and Future Rendering Pipelines” SIGGRAPH’10
    http://bps10.idav.ucdavis.edu/
›   [Chajdas11] Matthäus G. Chajdas et al ”Subpixel Reconstruction
    Antialiasing for Deferred Shading.”. I3D’11

More Related Content

What's hot

Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorchJun Young Park
 
Calibrating Lighting and Materials in Far Cry 3
Calibrating Lighting and Materials in Far Cry 3Calibrating Lighting and Materials in Far Cry 3
Calibrating Lighting and Materials in Far Cry 3stevemcauley
 
Separable bilateral filtering for fast video preprocessing
Separable bilateral filtering for fast video preprocessingSeparable bilateral filtering for fast video preprocessing
Separable bilateral filtering for fast video preprocessingTuan Q. Pham
 
A Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep LearningA Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep LearningSuntae Kim
 
The Ring programming language version 1.6 book - Part 50 of 189
The Ring programming language version 1.6 book - Part 50 of 189The Ring programming language version 1.6 book - Part 50 of 189
The Ring programming language version 1.6 book - Part 50 of 189Mahmoud Samir Fayed
 
NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010Ben Scofield
 
GPU-Accelerated Parallel Computing
GPU-Accelerated Parallel ComputingGPU-Accelerated Parallel Computing
GPU-Accelerated Parallel ComputingJun Young Park
 
Neural Turing Machine
Neural Turing MachineNeural Turing Machine
Neural Turing MachineKiho Suh
 
yt: An Analysis and Visualization System for Astrophysical Simulation Data
yt: An Analysis and Visualization System for Astrophysical Simulation Datayt: An Analysis and Visualization System for Astrophysical Simulation Data
yt: An Analysis and Visualization System for Astrophysical Simulation DataJohn ZuHone
 
CUDA by Example : Thread Cooperation : Notes
CUDA by Example : Thread Cooperation : NotesCUDA by Example : Thread Cooperation : Notes
CUDA by Example : Thread Cooperation : NotesSubhajit Sahu
 
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image TransformDIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image Transformvijayanand Kandaswamy
 
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeksBeginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeksJinTaek Seo
 
Reducing the time of heuristic algorithms for the Symmetric TSP
Reducing the time of heuristic algorithms for the Symmetric TSPReducing the time of heuristic algorithms for the Symmetric TSP
Reducing the time of heuristic algorithms for the Symmetric TSPgpolo
 
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTaegyun Jeon
 
Around the World in 80 Shaders
Around the World in 80 ShadersAround the World in 80 Shaders
Around the World in 80 Shadersstevemcauley
 
Heroku Postgres Cloud Database Webinar
Heroku Postgres Cloud Database WebinarHeroku Postgres Cloud Database Webinar
Heroku Postgres Cloud Database WebinarSalesforce Developers
 
Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Mark Rees
 

What's hot (19)

Developing games for Series 40 full-touch UI
Developing games for Series 40 full-touch UIDeveloping games for Series 40 full-touch UI
Developing games for Series 40 full-touch UI
 
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
 
Calibrating Lighting and Materials in Far Cry 3
Calibrating Lighting and Materials in Far Cry 3Calibrating Lighting and Materials in Far Cry 3
Calibrating Lighting and Materials in Far Cry 3
 
Separable bilateral filtering for fast video preprocessing
Separable bilateral filtering for fast video preprocessingSeparable bilateral filtering for fast video preprocessing
Separable bilateral filtering for fast video preprocessing
 
A Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep LearningA Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep Learning
 
The Ring programming language version 1.6 book - Part 50 of 189
The Ring programming language version 1.6 book - Part 50 of 189The Ring programming language version 1.6 book - Part 50 of 189
The Ring programming language version 1.6 book - Part 50 of 189
 
NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010NoSQL @ CodeMash 2010
NoSQL @ CodeMash 2010
 
GPU-Accelerated Parallel Computing
GPU-Accelerated Parallel ComputingGPU-Accelerated Parallel Computing
GPU-Accelerated Parallel Computing
 
Neural Turing Machine
Neural Turing MachineNeural Turing Machine
Neural Turing Machine
 
yt: An Analysis and Visualization System for Astrophysical Simulation Data
yt: An Analysis and Visualization System for Astrophysical Simulation Datayt: An Analysis and Visualization System for Astrophysical Simulation Data
yt: An Analysis and Visualization System for Astrophysical Simulation Data
 
CUDA by Example : Thread Cooperation : Notes
CUDA by Example : Thread Cooperation : NotesCUDA by Example : Thread Cooperation : Notes
CUDA by Example : Thread Cooperation : Notes
 
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image TransformDIGITAL IMAGE PROCESSING - Day 4 Image Transform
DIGITAL IMAGE PROCESSING - Day 4 Image Transform
 
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeksBeginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
 
Reducing the time of heuristic algorithms for the Symmetric TSP
Reducing the time of heuristic algorithms for the Symmetric TSPReducing the time of heuristic algorithms for the Symmetric TSP
Reducing the time of heuristic algorithms for the Symmetric TSP
 
Motionblur
MotionblurMotionblur
Motionblur
 
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
 
Around the World in 80 Shaders
Around the World in 80 ShadersAround the World in 80 Shaders
Around the World in 80 Shaders
 
Heroku Postgres Cloud Database Webinar
Heroku Postgres Cloud Database WebinarHeroku Postgres Cloud Database Webinar
Heroku Postgres Cloud Database Webinar
 
Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014
 

Similar to Gdc2011 direct x 11 rendering in battlefield 3

Rendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb RaiderRendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb RaiderEidos-Montréal
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Johan Andersson
 
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3Electronic Arts / DICE
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility bufferWolfgang Engel
 
FlameWorks GTC 2014
FlameWorks GTC 2014FlameWorks GTC 2014
FlameWorks GTC 2014Simon Green
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Ki Hyunwoo
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
 
The Technology behind Shadow Warrior, ZTG 2014
The Technology behind Shadow Warrior, ZTG 2014The Technology behind Shadow Warrior, ZTG 2014
The Technology behind Shadow Warrior, ZTG 2014Jarosław Pleskot
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And EffectsThomas Goddard
 
Practical spherical harmonics based PRT methods.ppsx
Practical spherical harmonics based PRT methods.ppsxPractical spherical harmonics based PRT methods.ppsx
Practical spherical harmonics based PRT methods.ppsxMannyK4
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCanSecWest
 
A Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemA Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemBo Li
 
Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Johan Andersson
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3guest11b095
 
02 direct3 d_pipeline
02 direct3 d_pipeline02 direct3 d_pipeline
02 direct3 d_pipelineGirish Ghate
 
Efficient Image Processing - Nicolas Roard
Efficient Image Processing - Nicolas RoardEfficient Image Processing - Nicolas Roard
Efficient Image Processing - Nicolas RoardParis Android User Group
 
Rendering basics
Rendering basicsRendering basics
Rendering basicsicedmaster
 

Similar to Gdc2011 direct x 11 rendering in battlefield 3 (20)

DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 
Rendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb RaiderRendering Techniques in Rise of the Tomb Raider
Rendering Techniques in Rise of the Tomb Raider
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
 
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
SPU-Based Deferred Shading in BATTLEFIELD 3 for Playstation 3
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility buffer
 
FlameWorks GTC 2014
FlameWorks GTC 2014FlameWorks GTC 2014
FlameWorks GTC 2014
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016
 
The Technology behind Shadow Warrior, ZTG 2014
The Technology behind Shadow Warrior, ZTG 2014The Technology behind Shadow Warrior, ZTG 2014
The Technology behind Shadow Warrior, ZTG 2014
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And Effects
 
Practical spherical harmonics based PRT methods.ppsx
Practical spherical harmonics based PRT methods.ppsxPractical spherical harmonics based PRT methods.ppsx
Practical spherical harmonics based PRT methods.ppsx
 
Beyond porting
Beyond portingBeyond porting
Beyond porting
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
 
A Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering SystemA Scalable Real-Time Many-Shadowed-Light Rendering System
A Scalable Real-Time Many-Shadowed-Light Rendering System
 
Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!
 
A Bit More Deferred Cry Engine3
A Bit More Deferred   Cry Engine3A Bit More Deferred   Cry Engine3
A Bit More Deferred Cry Engine3
 
02 direct3 d_pipeline
02 direct3 d_pipeline02 direct3 d_pipeline
02 direct3 d_pipeline
 
Efficient Image Processing - Nicolas Roard
Efficient Image Processing - Nicolas RoardEfficient Image Processing - Nicolas Roard
Efficient Image Processing - Nicolas Roard
 
Programar para GPUs
Programar para GPUsProgramar para GPUs
Programar para GPUs
 
Rendering basics
Rendering basicsRendering basics
Rendering basics
 

More from drandom

노동진 Mega splatting
노동진 Mega splatting노동진 Mega splatting
노동진 Mega splattingdrandom
 
The Settler 7- 포스트모템
The Settler 7- 포스트모템The Settler 7- 포스트모템
The Settler 7- 포스트모템drandom
 
최우성 구별하여 사용하면 좋은 프로젝트 관련용어
최우성 구별하여 사용하면 좋은 프로젝트 관련용어최우성 구별하여 사용하면 좋은 프로젝트 관련용어
최우성 구별하여 사용하면 좋은 프로젝트 관련용어drandom
 
이은석 마비노기 영웅전 포스트모템 2부 (kgc 버전)
이은석   마비노기 영웅전 포스트모템 2부 (kgc 버전)이은석   마비노기 영웅전 포스트모템 2부 (kgc 버전)
이은석 마비노기 영웅전 포스트모템 2부 (kgc 버전)drandom
 
이은석 마비노기 영웅전 포스트모템 1부 (kgc 버전)
이은석   마비노기 영웅전 포스트모템 1부 (kgc 버전)이은석   마비노기 영웅전 포스트모템 1부 (kgc 버전)
이은석 마비노기 영웅전 포스트모템 1부 (kgc 버전)drandom
 
김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술
김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술
김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술drandom
 
Mmorpg 사례로 본 만족도와 재접속
Mmorpg 사례로 본 만족도와 재접속Mmorpg 사례로 본 만족도와 재접속
Mmorpg 사례로 본 만족도와 재접속drandom
 
그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기drandom
 
Angel cunado_The Terrain Of KUF2
Angel cunado_The Terrain Of KUF2Angel cunado_The Terrain Of KUF2
Angel cunado_The Terrain Of KUF2drandom
 
오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화
오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화
오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화drandom
 
오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례
오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례
오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례drandom
 
Landscape 구축, Unreal Engine 3 의 차세대 terrain system
Landscape 구축, Unreal Engine 3 의 차세대 terrain systemLandscape 구축, Unreal Engine 3 의 차세대 terrain system
Landscape 구축, Unreal Engine 3 의 차세대 terrain systemdrandom
 
MMORPG게임엔진의 현재와미래 by 장언일
MMORPG게임엔진의 현재와미래 by 장언일MMORPG게임엔진의 현재와미래 by 장언일
MMORPG게임엔진의 현재와미래 by 장언일drandom
 
The Technology Behind the DirectX 11 Unreal Engine"Samaritan" Demo
The Technology Behind the DirectX 11 Unreal Engine"Samaritan" DemoThe Technology Behind the DirectX 11 Unreal Engine"Samaritan" Demo
The Technology Behind the DirectX 11 Unreal Engine"Samaritan" Demodrandom
 
Lighting you up in Battlefield 3
Lighting you up in Battlefield 3Lighting you up in Battlefield 3
Lighting you up in Battlefield 3drandom
 
From Content for Next Generation Games by Chris Wells
From Content for Next Generation Games by Chris WellsFrom Content for Next Generation Games by Chris Wells
From Content for Next Generation Games by Chris Wellsdrandom
 

More from drandom (16)

노동진 Mega splatting
노동진 Mega splatting노동진 Mega splatting
노동진 Mega splatting
 
The Settler 7- 포스트모템
The Settler 7- 포스트모템The Settler 7- 포스트모템
The Settler 7- 포스트모템
 
최우성 구별하여 사용하면 좋은 프로젝트 관련용어
최우성 구별하여 사용하면 좋은 프로젝트 관련용어최우성 구별하여 사용하면 좋은 프로젝트 관련용어
최우성 구별하여 사용하면 좋은 프로젝트 관련용어
 
이은석 마비노기 영웅전 포스트모템 2부 (kgc 버전)
이은석   마비노기 영웅전 포스트모템 2부 (kgc 버전)이은석   마비노기 영웅전 포스트모템 2부 (kgc 버전)
이은석 마비노기 영웅전 포스트모템 2부 (kgc 버전)
 
이은석 마비노기 영웅전 포스트모템 1부 (kgc 버전)
이은석   마비노기 영웅전 포스트모템 1부 (kgc 버전)이은석   마비노기 영웅전 포스트모템 1부 (kgc 버전)
이은석 마비노기 영웅전 포스트모템 1부 (kgc 버전)
 
김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술
김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술
김항기 시나리오 기반 온라인 게임 서버 부하 테스트 기술
 
Mmorpg 사례로 본 만족도와 재접속
Mmorpg 사례로 본 만족도와 재접속Mmorpg 사례로 본 만족도와 재접속
Mmorpg 사례로 본 만족도와 재접속
 
그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기
 
Angel cunado_The Terrain Of KUF2
Angel cunado_The Terrain Of KUF2Angel cunado_The Terrain Of KUF2
Angel cunado_The Terrain Of KUF2
 
오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화
오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화
오토데스크 게임수퍼유저투어 part 2. 제작 파이프라인 현대화
 
오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례
오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례
오토데스크 게임 수퍼유저투어 part1.human ik 및 motionbuilder를 이용한 ea sports game 제작사례
 
Landscape 구축, Unreal Engine 3 의 차세대 terrain system
Landscape 구축, Unreal Engine 3 의 차세대 terrain systemLandscape 구축, Unreal Engine 3 의 차세대 terrain system
Landscape 구축, Unreal Engine 3 의 차세대 terrain system
 
MMORPG게임엔진의 현재와미래 by 장언일
MMORPG게임엔진의 현재와미래 by 장언일MMORPG게임엔진의 현재와미래 by 장언일
MMORPG게임엔진의 현재와미래 by 장언일
 
The Technology Behind the DirectX 11 Unreal Engine"Samaritan" Demo
The Technology Behind the DirectX 11 Unreal Engine"Samaritan" DemoThe Technology Behind the DirectX 11 Unreal Engine"Samaritan" Demo
The Technology Behind the DirectX 11 Unreal Engine"Samaritan" Demo
 
Lighting you up in Battlefield 3
Lighting you up in Battlefield 3Lighting you up in Battlefield 3
Lighting you up in Battlefield 3
 
From Content for Next Generation Games by Chris Wells
From Content for Next Generation Games by Chris WellsFrom Content for Next Generation Games by Chris Wells
From Content for Next Generation Games by Chris Wells
 

Recently uploaded

Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Rndexperts
 
Call Girls Aslali 7397865700 Ridhima Hire Me Full Night
Call Girls Aslali 7397865700 Ridhima Hire Me Full NightCall Girls Aslali 7397865700 Ridhima Hire Me Full Night
Call Girls Aslali 7397865700 Ridhima Hire Me Full Nightssuser7cb4ff
 
group_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdfgroup_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdfneelspinoy
 
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Rndexperts
 
ARt app | UX Case Study
ARt app | UX Case StudyARt app | UX Case Study
ARt app | UX Case StudySophia Viganò
 
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Iconic Global Solution - web design, Digital Marketing services
Iconic Global Solution - web design, Digital Marketing servicesIconic Global Solution - web design, Digital Marketing services
Iconic Global Solution - web design, Digital Marketing servicesIconic global solution
 
Call Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts Service
Call Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts ServiceCall Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts Service
Call Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts Servicejennyeacort
 
FiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdfFiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdfShivakumar Viswanathan
 
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...Yantram Animation Studio Corporation
 
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一Fi sss
 
专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degreeyuu sss
 
Untitled presedddddddddddddddddntation (1).pptx
Untitled presedddddddddddddddddntation (1).pptxUntitled presedddddddddddddddddntation (1).pptx
Untitled presedddddddddddddddddntation (1).pptxmapanig881
 
Call Girls Meghani Nagar 7397865700 Independent Call Girls
Call Girls Meghani Nagar 7397865700  Independent Call GirlsCall Girls Meghani Nagar 7397865700  Independent Call Girls
Call Girls Meghani Nagar 7397865700 Independent Call Girlsssuser7cb4ff
 
cda.pptx critical discourse analysis ppt
cda.pptx critical discourse analysis pptcda.pptx critical discourse analysis ppt
cda.pptx critical discourse analysis pptMaryamAfzal41
 
306MTAMount UCLA University Bachelor's Diploma in Social Media
306MTAMount UCLA University Bachelor's Diploma in Social Media306MTAMount UCLA University Bachelor's Diploma in Social Media
306MTAMount UCLA University Bachelor's Diploma in Social MediaD SSS
 
'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,
'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,
'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,Aginakm1
 
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
PORTAFOLIO   2024_  ANASTASIYA  KUDINOVAPORTAFOLIO   2024_  ANASTASIYA  KUDINOVA
PORTAFOLIO 2024_ ANASTASIYA KUDINOVAAnastasiya Kudinova
 

Recently uploaded (20)

Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025
 
Call Girls Aslali 7397865700 Ridhima Hire Me Full Night
Call Girls Aslali 7397865700 Ridhima Hire Me Full NightCall Girls Aslali 7397865700 Ridhima Hire Me Full Night
Call Girls Aslali 7397865700 Ridhima Hire Me Full Night
 
group_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdfgroup_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdf
 
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025
 
ARt app | UX Case Study
ARt app | UX Case StudyARt app | UX Case Study
ARt app | UX Case Study
 
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Call Girls in Pratap Nagar, 9953056974 Escort Service
Call Girls in Pratap Nagar,  9953056974 Escort ServiceCall Girls in Pratap Nagar,  9953056974 Escort Service
Call Girls in Pratap Nagar, 9953056974 Escort Service
 
Iconic Global Solution - web design, Digital Marketing services
Iconic Global Solution - web design, Digital Marketing servicesIconic Global Solution - web design, Digital Marketing services
Iconic Global Solution - web design, Digital Marketing services
 
Call Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts Service
Call Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts ServiceCall Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts Service
Call Girls in Ashok Nagar Delhi ✡️9711147426✡️ Escorts Service
 
FiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdfFiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdf
 
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
 
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
 
专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
专业一比一美国亚利桑那大学毕业证成绩单pdf电子版制作修改#真实工艺展示#真实防伪#diploma#degree
 
Untitled presedddddddddddddddddntation (1).pptx
Untitled presedddddddddddddddddntation (1).pptxUntitled presedddddddddddddddddntation (1).pptx
Untitled presedddddddddddddddddntation (1).pptx
 
Call Girls Meghani Nagar 7397865700 Independent Call Girls
Call Girls Meghani Nagar 7397865700  Independent Call GirlsCall Girls Meghani Nagar 7397865700  Independent Call Girls
Call Girls Meghani Nagar 7397865700 Independent Call Girls
 
cda.pptx critical discourse analysis ppt
cda.pptx critical discourse analysis pptcda.pptx critical discourse analysis ppt
cda.pptx critical discourse analysis ppt
 
306MTAMount UCLA University Bachelor's Diploma in Social Media
306MTAMount UCLA University Bachelor's Diploma in Social Media306MTAMount UCLA University Bachelor's Diploma in Social Media
306MTAMount UCLA University Bachelor's Diploma in Social Media
 
'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,
'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,
'CASE STUDY OF INDIRA PARYAVARAN BHAVAN DELHI ,
 
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
PORTAFOLIO   2024_  ANASTASIYA  KUDINOVAPORTAFOLIO   2024_  ANASTASIYA  KUDINOVA
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
 

Gdc2011 direct x 11 rendering in battlefield 3

  • 2. Overview Feature: Performance: › Deferred Shading › Instancing › Compute Shader Tile-Based Lighting › Parallel dispatch › Terrain Displacement Mapping › Multi-GPU › Direct Stereo 3D rendering › Resource Streaming Quality: › Antialiasing: MSAA Conclusions › Antialiasing: FXAA › Antialiasing: SRAA Q&A › Transparency Supersampling
  • 3.
  • 4. › FPS › Fall 2011 › DX11, Xbox 360 and PS3 › Frostbite 2 engine
  • 5. › Developed for Battlefield 3 and future DICE/EA games › Massive focus on creating simple to use & powerful workflows › Major pushes in animation, destruction, streaming, rendering, lighting and landscapes
  • 6. DX11 API only › Requires a DX10 or DX11 GPUs › Requires Vista SP1 or Windows 7 › No Windows XP! Why? › CPU performance win › GPU performance & quality win › Ease of development - no legacy › Future proof BF3 is a big title - will drive OS & HW adoption › Which is good for your game as well! 
  • 7. Switched to Deferred Shading in FB2 › Rich mix of Outdoor + Indoor + Urban environments in BF3 › Wanted lots more light sources Why not Forward Rendering? › Light culling / shader permutations not efficient for us › Expensive & more difficult decaling / destruction masking Why not Light Pre-pass? › 2x geometry pass too expensive on both CPU & GPU for us › Was able to generalize our BRDF enough to just a few variations › Saw major potential in full tile-based deferred shading See also: › Nicolas Thibieroz’s talk ”Deferred Shading Optimizations”
  • 8. Weaknesses with traditional deferred lighting/shading: › Massive overdraw & ROP cost when having lots of big light sources › Expensive to have multiple per-pixel materials in light shaders › MSAA lighting can be slow (non-coherent, extra BW)
  • 9.
  • 10. 1. Divide screen into tiles and determine which lights affects which tiles 2. Only apply the visible light sources on pixels › Custom shader with multiple lights › Reduced bandwidth & setup cost How can we do this best in DX11?
  • 11. Tile-based Deferred Shading using Compute Shaders Primarily for analytical light sources › Point lights, cone lights, line lights › No shadows › Requires Compute Shader 5.0 Hybrid Graphics/Compute shading pipeline: › Graphics pipeline rasterizes gbuffers for opaque surfaces › Compute pipeline uses gbuffers, culls lights, computes lighting & combines with shading › Graphics pipeline renders transparent surfaces on top
  • 12. 1 thread per pixel, 16x16 thread groups (aka tile) Input: gbuffers, depth buffer & list of lights Output: fully composited & lit HDR texture Normal Roughness Texture2D<float4> gbufferTexture0 : register(t0); Texture2D<float4> gbufferTexture1 : register(t1); Texture2D<float4> gbufferTexture2 : register(t2); Texture2D<float4> depthTexture : register(t3); RWTexture2D<float4> outputTexture : register(u0); #define BLOCK_SIZE 16 [numthreads(BLOCK_SIZE,BLOCK_SIZE,1)] void csMain( Diffuse Albedo Specular Albedo uint3 groupId : SV_GroupID, uint3 groupThreadId : SV_GroupThreadID, uint groupIndex: SV_GroupIndex, uint3 dispatchThreadId : SV_DispatchThreadID) { ... }
  • 13. groupshared uint minDepthInt; 1. Load gbuffers & depth groupshared uint maxDepthInt; // --- globals above, function below ------- 2. Calculate min & max z in threadgroup / tile float depth = › Using InterlockedMin/Max on groupshared variable depthTexture.Load(uint3(texCoord, 0)).r; › Atomics only work on ints  uint depthInt = asuint(depth); › Can cast float to int (z is always +) minDepthInt = 0xFFFFFFFF; maxDepthInt = 0; GroupMemoryBarrierWithGroupSync(); InterlockedMin(minDepthInt, depthInt); InterlockedMax(maxDepthInt, depthInt); GroupMemoryBarrierWithGroupSync(); float minGroupDepth = asfloat(minDepthInt); float maxGroupDepth = asfloat(maxDepthInt);
  • 14. Determine visible light sources for each tile Per-tile visible light count › Cull all light sources against tile frustum (black = 0 lights, white = 40) Input (global): › Light list, frustum & SW occlusion culled Output (per tile): › # of visible light sources › Index list of visible light sources Lights Indices Global list 1000+ 0 1 2 3 4 5 6 7 8 .. Tile visible list ~0-40+ 0 2 5 6 8 ..
  • 15. struct Light { float3 pos; float sqrRadius; float3 color; float invSqrRadius; }; int lightCount; StructuredBuffer<Light> lights; 3a. Each thread switches to process lights instead of pixels groupshared uint visibleLightCount = 0; › Wow, parallelism switcharoo! groupshared uint visibleLightIndices[1024]; › 256 light sources in parallel // --- globals above, cont. function below --- › Multiple iterations for >256 lights uint threadCount = BLOCK_SIZE*BLOCK_SIZE; uint passCount = (lightCount+threadCount-1) / threadCount; 3b. Intersect light and tile › Multiple variants – accuracy vs perf for (uint passIt = 0; passIt < passCount; ++passIt) › Tile min & max z is used as a ”depth bounds” test { uint lightIndex = passIt*threadCount + groupIndex; 3c. Append visible light indices to list // prevent overrun by clamping to a last ”null” light › Atomic add to threadgroup shared memory lightIndex = min(lightIndex, lightCount); › ”inlined stream compaction” if (intersects(lights[lightIndex], tile)) { 3d. Switch back to processing pixels uint offset; › Synchronize the thread group InterlockedAdd(visibleLightCount, 1, offset); › We now know which light sources affect the tile visibleLightIndices[offset] = lightIndex; } } GroupMemoryBarrierWithGroupSync();
  • 16. 4. For each pixel, accumulate lighting from visible lights › Read from tile visible light index list in groupshared memory Computed lighting 5. Combine lighting & shading albedos › Output is non-MSAA HDR texture › Render transparent surfaces on top float3 color = 0; for (uint lightIt = 0; lightIt < visibleLightCount; ++lightIt) { uint lightIndex = visibleLightIndices[lightIt]; Light light = lights[lightIndex]; color += diffuseAlbedo * evaluateLightDiffuse(light, gbuffer); color += specularAlbedo * evaluateLightSpecular(light, gbuffer); }
  • 17. Massive lighting test scene - 1000 large point lights
  • 18. Only edge pixels need full per-sample lighting › But edges have bad screen-space coherency! Inefficient Compute Shader can build efficient coherent pixel list › Evaluate lighting for each pixel (sample 0) › Determine if pixel requires per-sample lighting › If so, add to atomic list in shared memory › When all pixels are done, synchronize › Go through and light sample 1-3 for pixels in list Major performance improvement! › Described in detail in [Lauritzen10]
  • 19.
  • 20. Battlefield 3 terrains › Huge area & massive view distances › Dynamic destructible heightfields › Procedural virtual texturing › Streamed heightfields, colormaps, masks › Full details at at a later conference We stream in source heightfield data at close to pixel ratio › Derive high-quality per-pixel normals in shader How can we increase detail even further on DX11? › Create better silhouettes and improved depth › Keep small-scale detail (dynamic craters & slopes)
  • 21.
  • 24. Straight high-res heightfields, no procedural detail › Lots of data in the heightfields › Pragmatic & simple choice › No changes in physics/collisions › No content changes, artists see the true detail they created Uses DX11 fixed edge tessellation factors › Stable, no swimming vertices › Though can be wasteful › Height morphing for streaming by fetching 2 heightfields in domain shader & blend based on patch CLOD factor More work left to figure optimal tessellation scheme for our use case
  • 25. Nvidia’s 3D Vision drivers is a good and almost automatic stereo 3D rendering method › But only for forward rendering, doesn’t work with deferred shading › Or on AMD or Intel GPUs › Transparent surfaces do not get proper 3D depth We instead use explicit 3D stereo rendering › Render unique frame for each eye › Works with deferred shading & includes all surfaces › Higher performance requirements, 2x draw calls Works with Nvidia’s 3D Vision and AMD’s HD3D › Similar to OpenGL quad buffer support › Ask your friendly IHV contact how
  • 26.
  • 27. Draw calls can still be major performance bottleneck › Lots of materials / lots of variation › Complex shadowmaps › High detail / long view distances › Full 3D stereo rendering Battlefield have lots of use cases for heavy instancing Richard Huddy: › Props, foliage, debris, destruction, mesh particles ”Batch batch batch!” Batching submissions is still important, just as before!
  • 28. DX9-style stream instancing is good, but restrictive › Extra vertex attributes, GPU overhead › Can’t be (efficiently) combined with skinning › Used primarily for tiny meshes (particles, foliage) DX10/DX11 brings support for shader Buffer objects › Vertex shaders have access to SV_InstanceID › Can do completely arbitrary loads, not limited to fixed elements › Can support per-instance arrays and other data structures! Let’s rethink how instancing can be implemented..
  • 29. Multiple object types › Rigid / skinned / composite meshes Multiple object lighting types › Small/dynamic: light probes › Large/static: light maps Different types of instancing data we have › Transform float4x3 › Skinning transforms float4x3 array › SH light probe float4 x 4 › Lightmap UV scale/offset float4 Let’s pack all instancing data into single big buffer!
  • 30. Buffer<float4> instanceVectorBuffer : register(t0); cbuffer a { float g_startVector; float g_vectorsPerInstance; } VsOutput main( // .... uint instanceId : SV_InstanceId) { uint worldMatrixVectorOffset = g_startVector + input.instanceId * g_vectorsPerInstance + 0; uint probeVectorOffset = g_startVector + input.instanceId * g_vectorsPerInstance + 3; float4 r0 = instanceVectorBuffer.Load(worldMatrixVectorOffset + 0); float4 r1 = instanceVectorBuffer.Load(worldMatrixVectorOffset + 1); float4 r2 = instanceVectorBuffer.Load(worldMatrixVectorOffset + 2); float4 lightProbeShR = instanceVectorBuffer.Load(probeVectorOffset + 0); float4 lightProbeShG = instanceVectorBuffer.Load(probeVectorOffset + 1); float4 lightProbeShB = instanceVectorBuffer.Load(probeVectorOffset + 2); float4 lightProbeShO = instanceVectorBuffer.Load(probeVectorOffset + 3); // .... }
  • 31. half4 weights = input.boneWeights; int4 indices = (int4)input.boneIndices; float4 skinnedPos = mul(float4(pos,1), getSkinningMatrix(indices[0])).xyz * weights[0]; skinnedPos += mul(float4(pos,1), getSkinningMatrix(indices[1])).xyz * weights[1]; skinnedPos += mul(float4(pos,1), getSkinningMatrix(indices[2])).xyz * weights[2]; skinnedPos += mul(float4(pos,1), getSkinningMatrix(indices[3])).xyz * weights[3]; // ... float4x3 getSkinningMatrix(uint boneIndex) { uint vectorOffset = g_startVector + instanceId * g_vectorsPerInstance; vectorOffset += boneIndex*3; float4 r0 = instanceVectorBuffer.Load(vectorOffset + 0); float4 r1 = instanceVectorBuffer.Load(vectorOffset + 1); float4 r2 = instanceVectorBuffer.Load(vectorOffset + 2); return createMat4x3(r0, r1, r2); }
  • 32. Single draw call per object type instead of per instance › Minor GPU hit for big CPU gain Instancing does not break when skinning parts › More deterministic & better overall performance End result is typically 1500-2000 draw calls › Regardless of how many object instances the artists place! › Instead of 3000-7000 draw calls in some heavy cases
  • 33. Great key DX11 feature! › Improve performance by scaling dispatching to D3D to more cores › Reduce frame latency How we use it: › DX11 deferred context per HW thread › Renderer builds list of all draw calls we want to do for each rendering ”layer” of the frame › Split draw calls for each layer into chunks of ~256 › Dispatch chunks in parallel to the deferred contexts to generate command lists › Render to immediate context & execute command lists › Profit! * * but theory != practice
  • 34. Still no performant drivers available for our use case  › Have waited for 2 years and still are › Big driver codebases takes time to refactor quagmire [ˈkwægˌmaɪə ˈkwɒg-]n › IHVs vs Microsoft quagmire 1. (Earth Sciences / Physical Geography) a soft wet area of land that gives way under the feet; bog › Heavy driver threads collide with game threads 2. an awkward, complex, or embarrassing situation How it should work (an utopia?) › Driver does not create any processing threads of its own › Game submits workload in parallel to multiple deferred contexts › Driver make sure almost all processing required happens on the draw call on the deferred context › Game dispatches command list on immediate context, driver does absolute minimal work with it Still good to design engine for + instancing is great!
  • 35. Even with modern GPUs with lots of memory, resource streaming is often required › Can’t require 1+ GB graphics cards › BF3 levels have much more than 1 GB of textures & meshes › Reduced load times But creating & destroying DX resources in-frame has never been a good thing › Can cause non-deterministic & large driver / OS stalls  › Has been a problem for a very long time in DX › About time to fix it
  • 36. Have worked with Microsoft, Nvidia & Resource creation flow: AMD to make sure we can do stall › Streaming system determines resources to load (texture mipmaps or mesh LODs) free async resource streaming of GPU › Add up DX resource creation on to queue on resources in DX11 our own separate low-priority thread › Thread creates resources using initial data, signals streaming system › Want neither CPU nor GPU perf hit › Resource created, game starts using it › Key foundation: DX11 concurrent creates Enables async stall-free DMA in drivers! D3D11_FEATURE_DATA_THREADING threadingCaps; FB_SAFE_DX(m_device->CheckFeatureSupport( Resource destruction flow: D3D11_FEATURE_THREADING, › Streaming system deletes D3D resource &threadingCaps, sizeof(threadingCaps))); › Driver keeps it internally alive until GPU frames using it are done. NO STALL! if (threadingCaps.DriverConcurrentCreates)
  • 37. Efficiently supporting Crossfire and SLI is important for us › High-end consumers expect it › IHVs expect it (and can help!) › Allows targeting higher-end HW then currently available during dev AFR is easy: Do not reuse GPU resources from previous frame! › UpdateSubResource is easy & robust to use for dynamic resources, but not ideal All of our playtests run with exe named AFR-FriendlyD3D.exe › Disables all driver AFR synchronization workarounds › Rather find corruption during dev then have bad perf › ForceSingleGPU.exe is also useful to track down issues
  • 38.
  • 39. Aliasing  Reducing aliasing is one of our key visual priorities › Creates a more smooth gameplay experience › Extra challenging goal due to deferred shading We use multiple methods: › MSAA –Multisample Antialiasing › FXAA – Fast Approximate Antialiasing › SRAA – Sub-pixel Reconstruction Antialiasing › TSAA – Transparency Supersampling Antialiasing
  • 40. Our solution: › Deferred geometry pass renders with MSAA (2x, 4x or 8x) › Light shaders evaluate per-sample (when needed), averages the samples and writes out per-pixel › Transparent surfaces rendered on top without MSAA 1080p gbuffer+z with 4x MSAA is 158 MB › Lots of memory and lots of bandwidth  › Could be tiled to reduce memory usage › Very nice quality though  Our (overall) highest quality option › But not fast enough for more GPUs › Need additional solution(s)..
  • 41. ”Fast Approximate Antialiasing” › GPU-based MLAA implementation by Timothy Lottes (Nvidia) › Multiple quality options › ~1.3 ms/f for 1080p on Geforce 580 Pros & cons: › Superb antialiased long edges!  › Smooth overall picture  › Reasonably fast  › Moving pictures do not benefit as much  › ”Blurry aliasing”  Will be released here at GDC’11 › Part of Nvidia’s example SDK
  • 42. ”Sub-pixel Reconstruction Antialiasing” Pros: › Presented at I3D’11 2 weeks ago [Chajdas11] › Better at capturing small scale detail  › Use 4x MSAA buffers to improve reconstruction › Less ”pixel snapping” than MLAA variants  Multiple variants: Cons: › MSAA depth buffer › Extra MSAA z/normal/id pass can be › MSAA depth buffer + normal buffer prohibitive  › MSAA Primitive ID / spatial hashing buffer › Integration not as easy due to extra pass 
  • 43.
  • 48. FXAA
  • 50. None of the AA solutions can solve all aliasing › Foliage & other alpha-tested surfaces are extra difficult cases › Undersampled geometry, requires sub-samples DX 10.1 added SV_COVERAGE as a pixel shader output DX 11 added SV_COVERAGE as a pixel shader input What does this mean? › We get full programmable control over the coverage mask › No need to waste the alpha channel output (great for deferred) › We can do partial supersampling on alpha-tested surfaces!
  • 51. static const float2 msaaOffsets[4] = Shade per-pixel but evaluate alpha { float2(-0.125, -0.375), float2(0.375, -0.125), test per-sample float2(-0.375, 0.125), float2(0.125, 0.375) › }; Write out coverage bitmask › MSAA offsets are defined in DX 10.1 void psMain( out float4 color : SV_Target, › Requires shader permutation for each { out uint coverage : SV_Coverage) MSAA level float2 texCoord_ddx = ddx(texCoord); float2 texCoord_ddy = ddy(texCoord); coverage = 0; Gradients still quite limited › [unroll] But much better than standard MSAA!  for (int i = 0; i < 4; ++i) › Can combine with screen-space dither { float2 texelOffset = msaaOffsets[i].x * texCoord_ddx; texelOffset += msaaOffsets[i].y * texCoord_ddy; float4 temp = tex.SampleLevel(sampler, texCoord + texelOffset); See also: › if (temp.a >= 0.5) DirectX SDK ’TransparencyAA10.1 coverage |= 1<<i; › GDC’09 STALKER talk [Lobanchikov09] } }
  • 52. Alpha testing 4x MSAA + Transparency Supersampling
  • 53. DX11 is here – in force › 2011 is a great year to focus on DX11 › 2012 will be a great year for more to drop DX9 We’ve found lots & lots of quality & performance enhancing features using DX11 › And so will you for your game! › Still have only started, lots of potential Take advantage of the PC strengths, don’t hold it back › Big end-user value › Good preparation for Gen4
  • 54. Christina Coffin (@ChristinaCoffin) › Battlefield team › Mattias Widmark › Frostbite team › Kenny Magnusson › Colin Barré-Brisebois (@ZigguratVertigo) › Microsoft › Nvidia › Timothy Lottes (@TimothyLottes) › AMD › Matthäus G. Chajdas (@NIV_Anteru) › Intel › Miguel Sainz › Nicolas Thibieroz
  • 55. Email: repi@dice.se Blog: http://repi.se Twitter: @repi Battlefield 3 & Frostbite 2 talks at GDC’11: Mon 1:45 DX11 Rendering in Battlefield 3 Johan Andersson Wed 10:30 SPU-based Deferred Shading in Battlefield 3 Christina Coffin for PlayStation 3 Wed 3:00 Culling the Battlefield: Data Oriented Design in Daniel Collin Practice Thu 1:30 Lighting You Up in Battlefield 3 Kenny Magnusson Fri 4:05 Approximating Translucency for a Fast, Cheap Colin Barré-Brisebois & Convincing Subsurface Scattering Look For more DICE talks: http://publications.dice.se
  • 56. › [Lobanchikov09] Igor A. Lobanchikov, ”GSC Game World’s STALKER: Clear Sky – a showcase for Direct3D 10.0/1” GDC’09. http://developer.amd.com/gpu_assets/01gdc09ad3ddstalkercle arsky210309.ppt › [Lauritzen10] Andrew Lauritzen, ”Deferred Rendering for Current and Future Rendering Pipelines” SIGGRAPH’10 http://bps10.idav.ucdavis.edu/ › [Chajdas11] Matthäus G. Chajdas et al ”Subpixel Reconstruction Antialiasing for Deferred Shading.”. I3D’11