BRINGING UNREAL ENGINE 4 TO OPENGL
Nick Penwarden – Epic Games
Mathias Schott, Evan Hart – NVIDIA | March 20, 2014
About Us
Nick Penwarden
Lead Graphics Programmer – Epic Games
@nickpwd on Twitter
Mathias Schott
Developer Technology Engineer – NVIDIA
Evan Hart
Principal Engineer - NVIDIA
Unreal Engine 4
Latest version of the highly successful Unreal Engine
Long, storied history
Dozens of Licensees
Hundreds of titles
Built to scale
Mobile phones and tablets
Laptop computers
Next-gen Console
High-end PC
Why OpenGL?
The only choice on some platforms:
Mac OS X
iOS
Android
Linux
HTML5/WebGL
Access to new hardware features not tied to OS version.
Microsoft has been tying new versions of Direct3D to new versions of
Windows.
OpenGL frees us of these restrictions: D3D 11.x features on Windows XP!
Porting UE4 to OpenGL
UE4 is a cross platform game engine.
We run on many platforms today (8 by my count!) with more to
come.
Extremely important that we can develop features once and have
them work across all platforms!
(Very) High Level Look at UE4
Game Engine (cross platform)
World
PrimitiveComponents …
LightComponents …
Etc.
Renderer (cross platform)
Scene
PrimitiveSceneInfo …
LightSceneInfo …
Etc.
RHI (platform dependent)
Shaders
Textures
Vertex Buffers
Etc.
Render Hardware Interface
Largely based on the D3D11 API
Resource management
Shaders, textures, vertex buffers, etc.
Commands
DrawIndexedPrimitive, Clear, SetTexture, etc.
OpenGL RHI
Much of the API maps in a very straightforward manner.
RHICreateVertexBuffer -> glGenBuffers + glBufferData
RHIDrawIndexedPrimitive -> glDrawRangeElements
What about versions? 4.x vs 3.x vs ES2 vs ES3
What about extensions?
A lot of code can be shared between versions, even ES2 and GL4.
OpenGL RHI
Use a class hierarchy with static functions.
Each platform ultimately defines its own, typedefs it as FOpenGL.
Optional functions defined here, e.g. FOpenGL::TexStorage2D()
Depending on platform could be unimplemented, glTexStorage2DEXT,
glTexStorage2D, …
Optional features have a corresponding SupportsXXX() function.
E.g. FOpenGL::SupportsTexStorage().
If support depends on an extension, we parse the GL_EXTENSIONS string at boot
time and return that.
If support is always (or never) available for the platform we return a static true
or false. This allows the compiler to remove unneeded branches and dead code.
Allows us to share basically all of our OpenGL code across ES2, GL3, and
GL4!
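The static class hierarchy described above might look roughly like the sketch below. It is an illustration only, not UE4 source: FOpenGLBase, FOpenGL4, FOpenGLES2, ProcessExtensions, AllocateTexture2D and ETexAllocPath are hypothetical names, and the real GL calls are replaced by comments so that the example compiles without an OpenGL context.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical base class: every optional feature defaults to "unsupported".
struct FOpenGLBase
{
    static constexpr bool SupportsTexStorage() { return false; }

    // Reaching this default means a caller skipped the SupportsTexStorage() check.
    static void TexStorage2D(int /*Levels*/, int /*Width*/, int /*Height*/)
    {
        assert(false && "TexStorage2D is not implemented on this platform");
    }
};

// Hypothetical desktop GL4 platform: the feature is always present, so the
// query is a compile-time constant and the fallback branch can be removed.
struct FOpenGL4 : FOpenGLBase
{
    static constexpr bool SupportsTexStorage() { return true; }
    static void TexStorage2D(int, int, int) { /* would call glTexStorage2D */ }
};

// Hypothetical ES2 platform: support depends on an extension found at boot.
struct FOpenGLES2 : FOpenGLBase
{
    static bool& TexStorageFlag() { static bool bSupported = false; return bSupported; }

    // Extensions is the space-separated string from glGetString(GL_EXTENSIONS).
    static void ProcessExtensions(const std::string& Extensions)
    {
        std::istringstream Stream(Extensions);
        std::string Token;
        bool bFound = false;
        while (Stream >> Token) { bFound = bFound || (Token == "GL_EXT_texture_storage"); }
        TexStorageFlag() = bFound;
    }

    static bool SupportsTexStorage() { return TexStorageFlag(); }
    static void TexStorage2D(int, int, int) { /* would call glTexStorage2DEXT */ }
};

enum class ETexAllocPath { Storage, PerMipTexImage };

// Shared code, written once. In the engine the platform type would be the
// FOpenGL typedef; it is a template parameter here only so that several
// platforms can be exercised in one program.
template <typename FOpenGL>
ETexAllocPath AllocateTexture2D(int Levels, int Width, int Height)
{
    if (FOpenGL::SupportsTexStorage())
    {
        FOpenGL::TexStorage2D(Levels, Width, Height);
        return ETexAllocPath::Storage;
    }
    return ETexAllocPath::PerMipTexImage; // would loop over mips with glTexImage2D
}
```

Because the calls are static and resolved at compile time, no virtual dispatch is involved, and a platform that returns a constant from SupportsTexStorage() lets the compiler drop the unused branch.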
What about shaders?
That other stuff was easy.
Well, not exactly; Evan and Mathias will explain why later.
We do not want to write our shaders twice!
We have a large, existing HLSL shader code base.
Most of our platforms support an HLSL-like shader language.
We want to be able to compile and validate our shader code offline.
Need reflection to create metadata used by the renderer at runtime.
Which textures are bound to which indices?
Which uniforms are used and thus need to be computed by the CPU?
What are our options?
Can we just compile our HLSL shaders as GLSL with some magical
macros?
No. Maybe for really simple shaders but the languages really are different in
subtle ways that will cause your shaders to break in OpenGL.
So use FXC to compile the HLSL and translate the bytecode!
Can work really well!
But not for UE4: we support Mac OS X as a full development platform.
Any good cross compilers out there?
None that support shader model 5 syntax, compute, etc.
At least none that we could find two years ago :)
HLSLCC
Developed our own internal HLSL cross compiler.
Inspired by glsl-optimizer: https://github.com/aras-p/glsl-optimizer
Started with the Mesa GLSL parser and IR: http://www.mesa3d.org/
Modified the parser to parse shader model 5 HLSL instead of GLSL.
Wrote a Mesa IR to GLSL converter, similar to that used in glsl-
optimizer.
Mesa includes IR optimization: this is an optimizing compiler!
Benefits of Cross Compilation
Write shaders once, they work everywhere.
Confine platform-specific workarounds to the compiler itself.
Offline validation and reflection of shaders for OpenGL.
Treat generated GLSL that doesn’t work with a GL driver as a bug in the cross
compiler.
Simplifies the runtime, especially resource binding and uniforms.
Makes mobile preview on PC easy!
GLSL Shader Pipeline
Shader Compiler Input: HLSL source files, #defines, compiler flags
MCPP* (preprocessor)
HLSLCC
Parse HLSL -> Abstract Syntax Tree
Convert AST -> Mesa IR
Convert IR for HLSL entry point to GLSL-compatible main()
Optimize IR
Pack uniforms
Optimize IR
Convert IR -> GLSL + parameter map
Shader Compiler Output: GLSL source, parameter map
* http://mcpp.sourceforge.net/
Resource Binding
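When the OpenGL RHI creates the GLSL programs, setting up bind
points is easy:
// NumSamplers comes from the parameter map; Program must be current (glUseProgram).
for (int32 i = 0; i < NumSamplers; ++i)
{
    // Sampler uniforms are named by index: _ps0, _ps1, ...
    char Name[16];
    snprintf(Name, sizeof(Name), "_ps%d", i);
    GLint Location = glGetUniformLocation(Program, Name);
    glUniform1i(Location, i);
}
Renderer binds textures to indices.
RHISetShaderTexture(PixelShader, LightmapParameter, LightmapTexture);
Same as every other platform.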
Uniform Packing
All global uniforms are packed into arrays.
Simplifies binding uniforms when creating programs:
We don't need to carry the uniforms' names around as strings: they are fixed!
glGetUniformLocation(Program, "_pu0")
Simplifies shadowing of uniforms when rendering:
RHISetShaderParameter(Shader,Parameter,Value)
Copy value to shadow buffer, just like the D3D11 global constant buffer.
Simplifies setting of uniforms in the RHI:
Track dirty ranges. For each dirty range: glUniform4fv(…)
Easy to trade off between the quantity of glUniform calls and the amount of
bytes uploaded per draw call.
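The shadow-and-dirty-range idea above can be sketched as follows. This is a minimal illustration, not the engine's code: FPackedUniformShadow and FUploadFunc are hypothetical names, a single merged dirty range is tracked (the "fewest calls, most bytes" end of the trade-off), and the upload callback stands in for a glUniform4fv call so the example runs without a GL context.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// CPU-side shadow of one packed uniform array, measured in vec4 slots.
class FPackedUniformShadow
{
public:
    // Stand-in for glUniform4fv: receives the first dirty slot, the slot
    // count, and a pointer to the first dirty float.
    using FUploadFunc =
        std::function<void(std::size_t FirstVec4, std::size_t NumVec4, const float* Values)>;

    explicit FPackedUniformShadow(std::size_t InNumVec4)
        : Data(InNumVec4 * 4, 0.0f), DirtyBegin(InNumVec4), DirtyEnd(0) {}

    // Analogous to RHISetShaderParameter: copy into the shadow buffer and
    // widen the dirty range to cover the written slots.
    void Set(std::size_t Vec4Offset, const float* Values, std::size_t NumVec4)
    {
        if (NumVec4 == 0) { return; }
        assert(Vec4Offset + NumVec4 <= Data.size() / 4);
        std::memcpy(&Data[Vec4Offset * 4], Values, NumVec4 * 4 * sizeof(float));
        DirtyBegin = std::min(DirtyBegin, Vec4Offset);
        DirtyEnd = std::max(DirtyEnd, Vec4Offset + NumVec4);
    }

    // Called before a draw: upload the dirty range once, then mark clean.
    // Returns false when nothing changed and no upload was issued.
    bool Commit(const FUploadFunc& Upload)
    {
        if (DirtyBegin >= DirtyEnd) { return false; }
        Upload(DirtyBegin, DirtyEnd - DirtyBegin, &Data[DirtyBegin * 4]);
        DirtyBegin = Data.size() / 4;
        DirtyEnd = 0;
        return true;
    }

private:
    std::vector<float> Data;
    std::size_t DirtyBegin;
    std::size_t DirtyEnd;
};
```

Tracking several disjoint ranges instead of one merged range would upload fewer bytes at the cost of more upload calls, which is the trade-off the slide refers to.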
Uniform Buffer Emulation (ES2)
Straightforward extension of uniform packing.
Rendering code gets to treat ES2 like D3D11: it packages uniforms
into buffers based on the frequency at which uniforms change.
The cross compiler packs used uniforms into the uniform arrays.
Also provides a copy list: where to get uniforms from and where they
go in the uniform array.
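A copy list of this kind could be applied as in the sketch below. The entry layout (FUniformCopyInfo and its fields) and the ApplyCopyList function are assumptions made for illustration; the slides only say that the cross compiler records where each uniform comes from and where it goes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical copy-list entry emitted by the cross compiler for one run of
// uniforms that the shader actually uses.
struct FUniformCopyInfo
{
    std::size_t SourceBufferIndex;  // which emulated uniform buffer to read
    std::size_t SourceOffsetFloats; // offset inside that buffer
    std::size_t DestOffsetFloats;   // offset inside the packed uniform array
    std::size_t NumFloats;          // number of floats to copy
};

// Gather the used uniforms from the emulated buffers into the packed array
// that is later uploaded with glUniform4fv.
inline void ApplyCopyList(const std::vector<FUniformCopyInfo>& CopyList,
                          const std::vector<std::vector<float>>& SourceBuffers,
                          std::vector<float>& PackedArray)
{
    for (const FUniformCopyInfo& Info : CopyList)
    {
        const std::vector<float>& Source = SourceBuffers.at(Info.SourceBufferIndex);
        assert(Info.SourceOffsetFloats + Info.NumFloats <= Source.size());
        assert(Info.DestOffsetFloats + Info.NumFloats <= PackedArray.size());
        std::memcpy(PackedArray.data() + Info.DestOffsetFloats,
                    Source.data() + Info.SourceOffsetFloats,
                    Info.NumFloats * sizeof(float));
    }
}
```

Only the ranges named in the copy list are touched, so uniforms that the shader never reads cost nothing at draw time.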
UV Space Differences
[Figure: Normalized Device Coordinates, from (-1,-1) to (+1,+1), compared with Direct3D UV Space and OpenGL UV Space, each from (0,0) to (1,1).]
From the Shader’s Point of View
[Figure: Normalized Device Coordinates, from (-1,-1) to (+1,+1), compared with Direct3D UV Space and OpenGL UV Space, each from (0,0) to (1,1).]
Rendering Upside Down
Solutions?
Could sample with (U, 1 - V)
Could render and flip in NDC (X,-Y,Z)
We choose to render upside down.
Fewer choke points: lots of texture samples in shader code!
You have to transform your post-projection vertex positions anyway. D3D
expects Z in [0,1], OpenGL expects Z in [-1,1].
Can wrap all of this into a single patched projection matrix for no runtime
cost.
You also have to reverse your polygon winding when setting rasterizer state!
BUT: Remember not to flip when rendering to the backbuffer!
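One way to build such a patched projection matrix is sketched below. The conventions are assumptions chosen for the example and may differ from the engine's: matrices are indexed M[row][col], positions are column vectors (clip = M * v), and MakeD3DPerspective is a hypothetical helper that produces a D3D-style projection with clip-space Z in [0, w]. The patch remaps Z with z_gl = 2 * z_d3d - w and optionally negates Y.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using FMatrix4 = std::array<std::array<float, 4>, 4>; // M[row][col]
using FVector4 = std::array<float, 4>;

// Column-vector convention: result = M * V.
inline FVector4 Transform(const FMatrix4& M, const FVector4& V)
{
    FVector4 Result{};
    for (int Row = 0; Row < 4; ++Row)
    {
        for (int Col = 0; Col < 4; ++Col) { Result[Row] += M[Row][Col] * V[Col]; }
    }
    return Result;
}

// Hypothetical D3D-style perspective matrix: w_clip = z_view and
// z_clip / w_clip runs from 0 at the near plane to 1 at the far plane.
inline FMatrix4 MakeD3DPerspective(float XScale, float YScale, float Near, float Far)
{
    FMatrix4 P{};
    P[0][0] = XScale;
    P[1][1] = YScale;
    P[2][2] = Far / (Far - Near);
    P[2][3] = -Near * Far / (Far - Near);
    P[3][2] = 1.0f;
    return P;
}

// Patch a D3D-style projection for OpenGL. Pass bFlipY = false when the
// target is the backbuffer, which must not be rendered upside down.
inline FMatrix4 PatchProjectionForOpenGL(FMatrix4 P, bool bFlipY)
{
    for (int Col = 0; Col < 4; ++Col)
    {
        // z_gl = 2 * z_d3d - w maps [0, w] onto [-w, w].
        P[2][Col] = 2.0f * P[2][Col] - P[3][Col];
        if (bFlipY) { P[1][Col] = -P[1][Col]; }
    }
    return P;
}
```

Since the patch is folded into the matrix on the CPU, shaders need no extra instructions; the winding-order reversal mentioned above still has to be handled when setting rasterizer state.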
After Patching the Projection Matrix
[Figure: Normalized Device Coordinates, from (-1,-1) to (+1,+1), compared with Direct3D UV Space and OpenGL UV Space, each from (0,0) to (1,1).]
Next…
Evan and Mathias will talk about extending the GL RHI to a full
D3D11 feature set.
Also, how to take all of this stuff and make it fast!
Achieving DX11 Parity
Enable advanced effects
Not a generic DX11 layer
Only handle engine needs
60% shader / 40% engine
Evolves over time with the engine
OpenGL API
From OpenGL 3.x to 4.x
Compute shaders
General shader upgrades
Unordered Access Views
Tessellation
Lots of extensions and API enhancements
But maintain core API compatibility: ES2, GL3, GL4, etc.
General Shader Upgrades
Too many to discuss
Most are straightforward
Some ripple through HLSLCC
Type system
IR
Optimization
Different UBO packing rules for arrays
Shared Memory
RWTexture*
RWBuffer*
Memory Barriers
Shared Memory Atomics
UAV Atomic
Conversion and packing functions
Texture arrays
Integer Functions
Texture Queries
New Texture Functions
Early Depth Test
Compute Shader API
Simple analogs for Dispatch
Need to manage resources
No separate binding points
i.e. no CSSetXXX
Memory behavior is different
DirectX 11 hazard free
OpenGL exposes hazards
Enforce full coherence
Guarantees correctness
Optimize later
// State setup
glUseProgram(ComputeShader);
SetupTexturesCompute();
SetupImagesCompute();
SetupUniforms();
// Ensure reads are coherent
glMemoryBarrier(GL_ALL_BARRIER_BITS);
glDispatchCompute(X, Y, Z);
// Ensure writes are coherent
glMemoryBarrier(GL_ALL_BARRIER_BITS);
Unordered Access Views
Direct3D 11 world view
Multiple shader object types
RWBuffer, RWTexture2D, RWStructuredBuffer, etc
Unified API for setup
CSSetUnorderedAccessViews
OpenGL world view
GL_ARB_shader_image_load_store (closer to Direct3D 11)
GL_ARB_shader_storage_buffer_object
Solution
Use GL_ARB_shader_image_load_store as much as possible
Indirection table for other types
Tessellation (WIP)
Both have two shader stages
Direct3D11: Hull Shader & Domain Shader
OpenGL: Tessellation Control Shader & Tessellation Evaluation Shader
Notable divergence: HLSL Hull vs. GLSL Tessellation Control Shader
HLSL has two Hull Shader functions
Hull shader (per control point)
PatchConstantFunction (per patch)
GLSL has a single TESSELLATION_CONTROL_SHADER
Explicitly handle patch constant computations
Tessellation Solution
Execute TCS in phases
First hull shader
Next patch constants
GLSL provides barrier
Ensures values are ready
Patch constant function is scalar
Just execute on one thread
Possibly vectorize later
layout(vertices = 3) out;
void main()
{
    // Execute hull shader (per control point)
    barrier();
    if (gl_InvocationID == 0)
    {
        // Execute patch constant function (per patch)
    }
}
Optimization
Feature complete is only half the battle
No one wants pretty OpenGL 4.3 at half the performance
Initial test scene ran at < ¼ of Direct3D performance
Direct3D ~90 FPS versus OpenGL ~13 FPS (IIRC)
HLSLCC Optimizations
Initially heavily GPU bound
Internal engine tools were fantastic
Just needed to implement proper OpenGL query support
Shadows were the biggest offender
Filtering massively slower
Projection somewhat slower
Filter offsets and projection matrices turned into temp arrays
GPU compilers couldn’t fold through the temp array
Needed to aggressively fold / propagate
Reduced performance deficit to < 2x
Software Bottlenecks
Remaining bottlenecks were mostly CPU
Both application and driver issues
Driver is very solid, but
Unreal Engine 4 is a big test
Heaviest use of new features
This effort made the driver better
Application issues were often unintuitive
Most problems were synchronization points
Overall, lots of onion peeling
Application & Driver Syncs
Driver likes to perform most of its work on a separate thread
Often improves software performance by 50%
Control panel setting to turn it off for testing
Can be a hindrance with excessive synchronization
Most optimization effort went into resolving these issues
General process
Run code with sampling profiler
Look at inclusive timing results
Find large hotspot on function calling into the driver
Time reflects wait for other thread
Top Offenders
glMapBuffer (and similar)
Needs to resolve any queued glBufferData calls
Replace with malloc + glBufferSubData
glGetTexImage
Used when reallocating mipmap chains
Replace with glCopyImageSubData
glGenXXX
Every time a new object is created
Replace with a cache of names queried in a large block
glGetError
Just say no, except in debug
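The "cache of names queried in a large block" suggestion for glGenXXX could be sketched like this. FGLNameCache and its members are hypothetical; the generator callback has the same shape as glGenBuffers(GLsizei, GLuint*) and is injected so that the example runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hands out object names one at a time, but only calls into the driver
// once per block of names.
class FGLNameCache
{
public:
    // Same shape as glGenBuffers / glGenTextures: (count, output array).
    using FGenFunc = std::function<void(int Count, std::uint32_t* OutNames)>;

    FGLNameCache(FGenFunc InGen, int InBlockSize)
        : Gen(std::move(InGen)), BlockSize(InBlockSize)
    {
        assert(BlockSize > 0);
    }

    std::uint32_t Allocate()
    {
        if (Free.empty())
        {
            // One driver call (and one potential thread sync) per block.
            Free.resize(static_cast<std::size_t>(BlockSize));
            Gen(BlockSize, Free.data());
        }
        const std::uint32_t Name = Free.back();
        Free.pop_back();
        return Name;
    }

private:
    FGenFunc Gen;
    int BlockSize;
    std::vector<std::uint32_t> Free;
};
```

With a real context, the callback would simply forward to the matching glGen function; a larger block size means fewer driver round trips at the cost of holding more unused names.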
Where are we?
Infiltrator performance is at parity
(<5% difference)
Primitive throughput test is faster
Roughly 1 ms
Rendering test is slower
Roughly 1.5-2.0 ms
Where do we go?
More GPU optimization
Driver differences
Loop unrolling improvements
More software optimizations
Reduce object references
Already implemented for UBOs
Bindless Textures
More textures & Less overhead
BufferStorage
Allows persistent mappings
Reduces memory copying
Supporting Android
Tegra K1 opened an interesting door
Mobile chip with full OpenGL 4.4 capability
Common driver code means all the same features
Can we run “full” Unreal Engine 4 content?
Yes
Need to make some optimizations / compromises
No different than a low-end PC
Work done in NVIDIA’s branch
Ongoing work with Epic to feed changes back
What were the challenges?
High resolution, modest GPU
Rely on scaling
Dial some features back (motion blur, bloom, etc.)
Aggressively optimize where feasible
Do reflections always need to be at full resolution?
Modest CPU
Often disable CPU-hungry rendering
Shadows
Optimize API at every turn
Modest Memory
Remove any duplication

Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 

Gdc 14 bringing unreal engine 4 to open_gl

  • 1. BRINGING UNREAL ENGINE 4 TO OPENGL Nick Penwarden – Epic Games Mathias Schott, Evan Hart – NVIDIA | March 20, 2014
  • 2. About Us Nick Penwarden Lead Graphics Programmer – Epic Games @nickpwd on Twitter Mathias Schott Developer Technology Engineer – NVIDIA Evan Hart Principal Engineer - NVIDIA
  • 3. Unreal Engine 4 Latest version of the highly successful Unreal Engine Long, storied history Dozens of Licensees Hundreds of titles Built to scale Mobile phones and tablets Laptop computers Next-gen Console High-end PC
  • 4. Why OpenGL? The only choice on some platforms: Mac OS X iOS Android Linux HTML5/WebGL Access to new hardware features not tied to OS version. Microsoft has been tying new versions of Direct3D to new versions of Windows. OpenGL frees us of these restrictions: D3D 11.x features on Windows XP!
  • 5. Porting UE4 to OpenGL UE4 is a cross platform game engine. We run on many platforms today (8 by my count!) with more to come. Extremely important that we can develop features once and have them work across all platforms!
  • 6. (Very) High Level Look at UE4 Game Engine World PrimitiveComponents … LightComponents … Etc. Renderer Scene PrimitiveSceneInfo … LightSceneInfo … Etc. RHI Shaders Textures Vertex Buffers Etc. CROSS PLATFORM PLATFORM DEPENDENT
  • 7. Render Hardware Interface Largely based on the D3D11 API Resource management Shaders, textures, vertex buffers, etc. Commands DrawIndexedPrimitive, Clear, SetTexture, etc.
  • 8. OpenGL RHI Much of the API maps in a very straight-forward manner. RHICreateVertexBuffer -> glGenBuffers + glBufferData RHIDrawIndexedPrimitive -> glDrawRangeElements What about versions? 4.x vs 3.x vs ES2 vs ES3 What about extensions? A lot of code can be shared between versions, even ES2 and GL4.
  • 9. OpenGL RHI Use a class hierarchy with static functions. Each platform ultimately defines its own, typedefs it as FOpenGL. Optional functions defined here, e.g. FOpenGL::TexStorage2D() Depending on platform could be unimplemented, glTexStorage2DEXT, glTexStorage2D, … Optional features have a corresponding SupportsXXX() function. E.g. FOpenGL::SupportsTexStorage(). If support depends on an extension, we parse the GL_EXTENSIONS string at boot time and return that. If support is always (or never) available for the platform we return a static true or false. This allows the compiler to remove unneeded branches and dead code. Allows us to share basically all of our OpenGL code across ES2, GL3, and GL4!
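The static-function hierarchy described on slide 9 might look roughly like the sketch below. This is an illustration under assumptions, not Epic's code: the struct names `FOpenGLBaseSketch`, `FOpenGL4Sketch`, `FOpenGLES2Sketch`, the helper `HasExtension`, and the alias `FOpenGLSketch` are hypothetical; only the `SupportsTexStorage()` name and the idea of a per-platform typedef come from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: whole-token search in a space-separated GL_EXTENSIONS string.
inline bool HasExtension(const std::string& Extensions, const std::string& Name) {
    assert(!Name.empty());
    std::size_t Pos = 0;
    while ((Pos = Extensions.find(Name, Pos)) != std::string::npos) {
        const std::size_t End = Pos + Name.size();
        const bool bStartOk = (Pos == 0) || Extensions[Pos - 1] == ' ';
        const bool bEndOk = (End == Extensions.size()) || Extensions[End] == ' ';
        if (bStartOk && bEndOk) return true;
        Pos = End;
    }
    return false;
}

// Hypothetical base: every optional feature defaults to "unsupported".
struct FOpenGLBaseSketch {
    static constexpr bool SupportsTexStorage() { return false; }
};

// Hypothetical desktop platform where the feature is always available.
// The answer is a compile-time constant, so branches on it can be folded away.
struct FOpenGL4Sketch : FOpenGLBaseSketch {
    static constexpr bool SupportsTexStorage() { return true; }
};

// Hypothetical ES2 platform where support depends on an extension,
// decided once at boot from the extension string.
struct FOpenGLES2Sketch : FOpenGLBaseSketch {
    static bool bSupportsTexStorage;
    static void ProcessExtensions(const std::string& Extensions) {
        bSupportsTexStorage = HasExtension(Extensions, "GL_EXT_texture_storage");
    }
    static bool SupportsTexStorage() { return bSupportsTexStorage; }
};
bool FOpenGLES2Sketch::bSupportsTexStorage = false;

// Each platform build would pick exactly one of these.
typedef FOpenGLES2Sketch FOpenGLSketch;
```

Shared RHI code would then branch on `FOpenGLSketch::SupportsTexStorage()` without knowing which platform it was compiled for.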
  • 10. What about shaders? That other stuff was easy. Well, not exactly, Evan and Mathias will explain why later. We do not want to write our shaders twice! We have a large, existing HLSL shader code base. Most of our platforms support an HLSL-like shader language. We want to be able to compile and validate our shader code offline. Need reflection to create metadata used by the renderer at runtime. Which textures are bound to which indices? Which uniforms are used and thus need to be computed by the CPU?
  • 11. What are our options? Can we just compile our HLSL shaders as GLSL with some magical macros? No. Maybe for really simple shaders but the languages really are different in subtle ways that will cause your shaders to break in OpenGL. So use FXC to compile the HLSL and translate the bytecode! Can work really well! But not for UE4: we support Mac OS X as a full development platform. Any good cross compilers out there? None that support shader model 5 syntax, compute, etc. At least none that we could find two years ago :)
  • 12. HLSLCC Developed our own internal HLSL cross compiler. Inspired by glsl-optimizer: https://github.com/aras-p/glsl-optimizer Started with the Mesa GLSL parser and IR: http://www.mesa3d.org/ Modified the parser to parse shader model 5 HLSL instead of GLSL. Wrote a Mesa IR to GLSL converter, similar to that used in glsl-optimizer. Mesa includes IR optimization: this is an optimizing compiler!
  • 13. Benefits of Cross Compilation Write shaders once, they work everywhere. Contain platform specific workarounds to the compiler itself. Offline validation and reflection of shaders for OpenGL. Treat generated GLSL that doesn’t work with a GL driver as a bug in the cross compiler. Simplifies the runtime, especially resource binding and uniforms. Makes mobile preview on PC easy!
  • 14. GLSL Shader Pipeline Shader Compiler Input HLSL Source HLSL Source HLSL Source HLSL Source #defines Compiler Flags MCPP* Shader Compiler Output GLSL Source Parameter Map Parse HLSL -> Abstract Syntax Tree Convert AST -> Mesa IR Convert IR for HLSL entry point to GLSL compatible main() Optimize IR Pack uniforms Optimize IR Convert IR -> GLSL + parameter map HLSLCC * http://mcpp.sourceforge.net/
  • 15. Resource Binding When the OpenGL RHI creates the GLSL programs, setting up bind points is easy: Renderer binds textures to indices. RHISetShaderTexture(PixelShader, LightmapParameter, LightmapTexture); Same as every other platform. for (int32 i = 0; i < NumSamplers; ++i) { Location = glGetUniformLocation(Program, "_psi"); glUniform1i(Location, i); } (from parameter map)
  • 16. Uniform Packing All global uniforms are packed into arrays. Simplifies binding uniforms when creating programs: We don’t need the strings of the uniforms’ names, they are fixed! glGetUniformLocation("_pu0") Simplifies shadowing of uniforms when rendering: RHISetShaderParameter(Shader,Parameter,Value) Copy value to shadow buffer, just like the D3D11 global constant buffer. Simplifies setting of uniforms in the RHI: Track dirty ranges. For each dirty range: glUniform4fv(…) Easy to trade off between the number of glUniform calls and the number of bytes uploaded per draw call.
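The shadow-buffer and dirty-range idea from slide 16 can be sketched as follows. This is a simplified illustration, assuming a single merged dirty range per packed array; the class `FPackedUniformShadow` and the `Upload` callback (which stands in for the actual glUniform4fv call) are hypothetical names, not part of the engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical CPU-side shadow of one packed uniform array (e.g. "_pu0"),
// stored as consecutive float4 registers.
class FPackedUniformShadow {
public:
    // Upload(FirstVec4, NumVec4, Data) stands in for the real GL upload call.
    using FUpload = std::function<void(std::size_t, std::size_t, const float*)>;

    explicit FPackedUniformShadow(std::size_t NumVec4)
        : Data(NumVec4 * 4, 0.0f), DirtyBegin(NumVec4), DirtyEnd(0) {}

    // Copy values into the shadow buffer and widen the dirty range (in float4 units).
    void Set(std::size_t FloatOffset, const float* Values, std::size_t NumFloats) {
        assert(FloatOffset + NumFloats <= Data.size());
        if (NumFloats == 0) return;
        std::memcpy(&Data[FloatOffset], Values, NumFloats * sizeof(float));
        DirtyBegin = std::min(DirtyBegin, FloatOffset / 4);
        DirtyEnd = std::max(DirtyEnd, (FloatOffset + NumFloats + 3) / 4);
    }

    // Upload the dirty range, if any, and reset it. Returns float4s uploaded.
    std::size_t Flush(const FUpload& Upload) {
        if (DirtyBegin >= DirtyEnd) return 0;
        const std::size_t Count = DirtyEnd - DirtyBegin;
        Upload(DirtyBegin, Count, &Data[DirtyBegin * 4]);
        DirtyBegin = Data.size() / 4;
        DirtyEnd = 0;
        return Count;
    }

private:
    std::vector<float> Data;
    std::size_t DirtyBegin;  // first dirty float4 (inclusive)
    std::size_t DirtyEnd;    // last dirty float4 (exclusive)
};
```

Merging everything into one range minimizes upload calls at the cost of re-sending untouched registers in between; tracking several smaller ranges would shift that trade-off the other way.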
  • 17. Uniform Buffer Emulation (ES2) Straightforward extension of uniform packing. Rendering code gets to treat ES2 like D3D11: it packages uniforms into buffers based on the frequency at which uniforms change. The cross compiler packs used uniforms into the uniform arrays. Also provides a copy list: where to get uniforms from and where they go in the uniform array.
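A copy list of the kind slide 17 describes could be applied along these lines. The record layout `FUniformCopyInfo` and the function `ApplyCopyList` are assumptions made for illustration; the slides only state that the cross compiler emits source and destination locations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical copy-list entry: where a run of floats lives in an emulated
// uniform buffer and where it belongs in the packed uniform array.
struct FUniformCopyInfo {
    std::size_t SourceBuffer;  // index of the emulated uniform buffer
    std::size_t SourceOffset;  // float offset inside that buffer
    std::size_t DestOffset;    // float offset inside the packed array
    std::size_t NumFloats;     // number of floats to copy
};

// Gather only the uniforms the shader actually uses into the packed array.
inline void ApplyCopyList(const std::vector<FUniformCopyInfo>& CopyList,
                          const std::vector<std::vector<float>>& Buffers,
                          std::vector<float>& Packed) {
    for (const FUniformCopyInfo& Copy : CopyList) {
        assert(Copy.SourceBuffer < Buffers.size());
        const std::vector<float>& Source = Buffers[Copy.SourceBuffer];
        assert(Copy.SourceOffset + Copy.NumFloats <= Source.size());
        assert(Copy.DestOffset + Copy.NumFloats <= Packed.size());
        for (std::size_t i = 0; i < Copy.NumFloats; ++i) {
            Packed[Copy.DestOffset + i] = Source[Copy.SourceOffset + i];
        }
    }
}
```

The packed array can then be uploaded with the same path used for global uniforms.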
  • 18. UV Space Differences -1,-1 +1,+1 0,0 1,1 0,0 1,1 Normalized Device Coordinates Direct3D UV Space OpenGL UV Space
  • 19. From the Shader’s Point of View -1,-1 +1,+1 0,0 1,1 0,0 1,1 Normalized Device Coordinates Direct3D UV Space OpenGL UV Space
  • 20. Rendering Upside Down Solutions? Could sample with (U, 1 - V) Could render and flip in NDC (X,-Y,Z) We choose to render upside down. Fewer choke points: lots of texture samples in shader code! You have to transform your post-projection vertex positions anyway. D3D expects Z in [0,1], OpenGL expects Z in [-1,1]. Can wrap all of this into a single patched projection matrix for no runtime cost. You also have to reverse your polygon winding when setting rasterizer state! BUT: Remember not to flip when rendering to the backbuffer!
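The projection patch from slide 20 can be sketched as a pure matrix operation. The sketch assumes a column-vector convention (clip = M * v, stored as `M[row][col]`), which may differ from the engine's own; `PatchProjectionForGL`, `Transform`, and `NdcZ` are hypothetical names. Remapping depth uses z' = 2z - w, which sends D3D's 0 <= z <= w to OpenGL's -w <= z <= w.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using FVector4 = std::array<float, 4>;
using FMatrix4 = std::array<std::array<float, 4>, 4>;  // M[row][col], clip = M * v

// Plain matrix-vector product under the assumed column-vector convention.
inline FVector4 Transform(const FMatrix4& M, const FVector4& V) {
    FVector4 Out = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (std::size_t r = 0; r < 4; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            Out[r] += M[r][c] * V[c];
        }
    }
    return Out;
}

// Patch a D3D-style projection for OpenGL clip space:
//   z' = 2z - w  (row 2 becomes 2 * row 2 - row 3)
//   y' = -y      (only when rendering to an offscreen target, not the backbuffer)
inline FMatrix4 PatchProjectionForGL(FMatrix4 M, bool bFlipY) {
    for (std::size_t c = 0; c < 4; ++c) {
        M[2][c] = 2.0f * M[2][c] - M[3][c];
        if (bFlipY) {
            M[1][c] = -M[1][c];
        }
    }
    return M;
}

// Depth after the perspective divide.
inline float NdcZ(const FVector4& Clip) {
    assert(Clip[3] != 0.0f);
    return Clip[2] / Clip[3];
}
```

Because the patch is folded into the matrix once on the CPU, shaders need no extra instructions; the caller still has to reverse the polygon winding whenever `bFlipY` is true.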
  • 21. After Patching the Projection Matrix -1,-1 +1,+1 0,0 1,1 0,0 1,1 Normalized Device Coordinates Direct3D UV Space OpenGL UV Space
  • 22. Next… Evan and Mathias will talk about extending the GL RHI to a full D3D11 feature set. Also, how to take all of this stuff and make it fast!
  • 23. Achieving DX11 Parity Enable advanced effects Not a generic DX11 layer Only handle engine needs 60% shader / 40% engine Evolves over time with engine OpenGL API
  • 24. From OpenGL 3.x to 4.x Compute shaders General Shader upgrades Unordered Access Views Tessellation Lots of extensions And API enhancements but Maintain Core API Compatibility ES2, GL3, GL4 etc.
  • 25. General Shader Upgrades Too many to discuss Most straight forward Some ripple in HLSLCC Type system IR Optimization Different UBO packing rules for Arrays Shared Memory RWTexture* RWBuffer* Memory Barriers Shared Memory Atomics UAV Atomic Conversion and packing functions Texture arrays Integer Functions Texture Queries New Texture Functions Early Depth Test
  • 26. Compute Shader API Simple analogs for Dispatch Need to manage resources No separate binding points i.e. no CSSetXXX Memory behavior is different DirectX 11 hazard free OpenGL exposes hazards Enforce full coherence Guarantees correctness Optimize later //State Setup glUseProgram( ComputeShader); SetupTexturesCompute(); SetupImagesCompute(); SetupUniforms(); //Ensure reads are coherent glMemoryBarrier(GL_ALL_BARRIER_BITS); glDispatchCompute( X, Y, Z); //Ensure writes are coherent glMemoryBarrier(GL_ALL_BARRIER_BITS);
  • 27. Unordered Access Views Direct3D 11 world view Multiple shader object types RWBuffer, RWTexture2D, RWStructuredBuffer, etc. Unified API for setup CSSetUnorderedAccessViews OpenGL world view GL_ARB_shader_image_load_store (Closer to Direct3D 11) GL_ARB_shader_storage_buffer_object Solution Use GL_ARB_shader_image_load_store as much as possible Indirection table for other types
  • 28. Tessellation (WIP) Both have two shader stages Direct3D11: Hull Shader & Domain Shader OpenGL: Tessellation Control Shader & Tessellation Evaluation Shader Notable divergence: HLSL Hull vs. GLSL Tessellation Control Shader HLSL has two Hull Shader functions Hull shader (per control point) PatchConstantFunction (per patch) GLSL has a single TESSELLATION_CONTROL_SHADER Explicitly handle patch constant computations
  • 29. Tessellation Solution Execute TCS in phases First hull shader Next patch constants GLSL provides barrier Ensures values are ready Patch constant function is scalar Just execute on one thread Possibly vectorize later layout( vertices = 3) out; void main() { //Execute Hull shader barrier(); if (gl_InvocationID == 0) { //Execute Patch constant } }
  • 30. Optimization Feature complete is only half the battle No one wants pretty OpenGL 4.3 at half the performance Initial test scene < ¼ of Direct3D 90 FPS versus ~13 FPS (IIRC)
  • 31. HLSLCC Optimizations Initially heavily GPU bound Internal engine tools were fantastic Just needed to implement proper OpenGL query support Shadows were the biggest offender Filtering massively slower Projection somewhat slower Filter offsets and projection matrices turned into temp arrays GPU compilers couldn’t fold through the temp array Needed to aggressively fold / propagate Reduced performance deficit to < 2x
  • 32. Software Bottlenecks Remaining bottlenecks were mostly CPU Both application and driver issues Driver is very solid, but Unreal Engine 4 is a big test Heaviest use of new features This effort made the driver better Application issues were often unintuitive Most problems were synchronizations Overall, lots of onion peeling
  • 33. Application & Driver Syncs Driver likes to perform most of its work on a separate thread Often improves software performance by 50% Control panel setting to turn it off for testing Can be a hindrance with excessive synchronization Most optimization effort went into resolving these issues General process Run code with sampling profiler Look at inclusive timing results Find large hotspot on function calling into the driver Time reflects wait for other thread
  • 34. Top Offenders glMapBuffer (and similar) Needs to resolve any queued glBufferData calls Replace with malloc + glBufferSubData glGetTexImage Used when reallocating mipmap chains Replace with glCopyImageSubData glGenXXX Every time a new object is created Replace with a cache of names queried in a large block glGetError Just say no, except in debug
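The glGenXXX mitigation from slide 34, a cache of names fetched in large blocks, might be sketched like this. The class `FNameCacheSketch` is hypothetical, and the generator callback stands in for a call such as glGenBuffers so the example runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical object-name cache: asks the driver for names in large blocks
// so that most object creations never touch the driver thread.
class FNameCacheSketch {
public:
    // Gen(Count, OutNames) stands in for e.g. glGenBuffers(Count, OutNames).
    using FGenNames = std::function<void(std::size_t, unsigned int*)>;

    FNameCacheSketch(FGenNames InGen, std::size_t InBlockSize)
        : Gen(std::move(InGen)), BlockSize(InBlockSize) {
        assert(BlockSize > 0);
    }

    // Hand out one cached name, refilling the cache only when it runs dry.
    unsigned int Acquire() {
        if (Free.empty()) {
            Free.resize(BlockSize);
            Gen(BlockSize, Free.data());
        }
        const unsigned int Name = Free.back();
        Free.pop_back();
        return Name;
    }

private:
    FGenNames Gen;
    std::size_t BlockSize;
    std::vector<unsigned int> Free;
};
```

With a block size of a few hundred names, the synchronizing driver call happens once per block rather than once per created object.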
  • 35. Where are we? Infiltrator performance is at parity (<5% difference) Primitive throughput test is faster Roughly 1 ms Rendering test is slower Roughly 1.5-2.0 ms
  • 36. Where do we go? More GPU optimization Driver differences Loop unrolling improvements More software optimizations Reduce object references Already implemented for UBOs Bindless Textures More textures & Less overhead BufferStorage Allows persistent mappings Reduces memory copying
  • 37. Supporting Android Tegra K1 opened an interesting door Mobile chip with full OpenGL 4.4 capability Common driver code means all the same features Can we run “full” Unreal Engine 4 content? Yes Need to make some optimizations / compromises No different than a low-end PC Work done in NVIDIA’s branch Ongoing work with Epic to feed changes back
  • 38. What were the challenges? High resolution, modest GPU Rely on scaling Turn some features back (motion blur, bloom, etc.) Aggressively optimize where feasible Do reflections always need to be at full resolution? Modest CPU Often disable CPU-hungry rendering Shadows Optimize API at every turn Modest Memory Remove any duplication