How we optimized our Game – Jake & Tess’ Finding Monsters Adventure
Phil Lira
Sr. Staff Engineer (Graphics)
@phi_lira
RELEASE TRAILER
https://www.youtube.com/watch?v=STzdj04n7dc
TECHNICAL CHALLENGES
Technical challenges
Many custom shaders and effects
Technical challenges
Multiple characters with complex skinning
Our budget is the limit
• Push as much content as possible with smooth gameplay and no overheating
– Can we get the same quality with a similar approach?
– Are we doing something we don’t need to?
What if we hit our budget?
• What happens when we fail?
– Either gameplay or visual quality will be impacted
• When it comes to removing effects, trust is important
OPTIMIZATION PROCESS
Optimization Process – Profile
• Do not make any assumptions.
• A profiler will tell you where the bottleneck is.
Optimization Process – Optimize
• Rewrite code to use resources more efficiently.
• Often we can fake or simplify effects.
• Experience comes into play here.
Optimization Process – Test
• Guarantee your tests run under the same conditions.
• Did your work reduce overall GPU ms?
[Figure: cycle of Profile → Optimize → Test]
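The Test step can be sketched as a small comparison of two captures. This is only an illustration: the capture layout, the condition fields, and the sample timings below are hypothetical and not taken from the talk.

```python
from statistics import mean


def compare_runs(before, after):
    """Return the change in mean GPU time in ms (negative means faster)."""
    # Refuse to compare captures that were not taken under the same conditions.
    if before["conditions"] != after["conditions"]:
        raise ValueError("captures were taken under different conditions")
    return mean(after["gpu_ms"]) - mean(before["gpu_ms"])


# Hypothetical captures: same scene, camera path and device for both runs.
conditions = {"scene": "test_scene", "camera": "fixed", "device": "same phone"}
before = {"conditions": conditions, "gpu_ms": [30.0, 31.0, 29.0]}
after = {"conditions": conditions, "gpu_ms": [24.0, 23.0, 25.0]}

delta_ms = compare_runs(before, after)
print(f"GPU time changed by {delta_ms:+.1f} ms")
```

Refusing to compare mismatched captures keeps a change of scene or device from being mistaken for an optimization win.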
How to find our bottleneck?
• Unity comes with a built-in profiler that does most of the work
• We wanted to have more detailed GPU info
– Adreno Profiler – Snapdragon GPUs
– Mali Graphics Debugger (MGD) and DS-5 Streamline – Mali GPUs
Adreno GPU Profiler
How to find our bottleneck?
Disable GL → Frame rate increased?
– No → CPU Bound
– Yes → GPU Bound → Vertex / Fragment / Memory
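The decision on this slide can be written down as a tiny function. This is a sketch under assumptions: the function name and the 5% tolerance used to decide whether the frame rate "increased" are hypothetical, and the frame rates would come from your own measurements.

```python
def classify_bottleneck(fps_with_gl, fps_without_gl, tolerance=0.05):
    """Apply the slide's test: does frame rate rise when GL work is disabled?"""
    # The tolerance (an assumption) filters out run-to-run measurement noise.
    increased = fps_without_gl > fps_with_gl * (1.0 + tolerance)
    if not increased:
        return "CPU bound"
    return "GPU bound (check vertex, fragment, memory)"


print(classify_bottleneck(30.0, 30.5))
print(classify_bottleneck(30.0, 55.0))
```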
How to find our bottleneck?
• Vertex
– #triangles
– Vertex shader
– Per-vertex lighting
• Fragment
– Fragment shader (instructions / texture samples)
– Blend ops
– Per-pixel lighting (forward rendering)
• Bandwidth
– Large textures
– Dependent Texture Reads
– Block Resolve (ReadPixels)
CASE STUDY – ROYAL MOON
Case Study – Royale Moon
• Triangles 106k
• Drawcalls 87
• Overdraw 2.51x
• Shader Stats:
– Up to 160 ALU/Frag
– Up to 7 texture samples
• Adreno %Time Shading Fragment - max
– Fragment bound
Overdraw Debug
Case Study – Royale Moon
• Early Z-test discards occluded fragments
• Render order matters
• Optimized render order
– Opaques – Front to Back
– Skybox
– Transparent – Back to Front
– Overlay (UI / HUD)
We need to improve this
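The render order listed above can be sketched as a sort key. The `Draw` record, the group names, and the depth values are hypothetical; a real engine performs this ordering internally, so the code only illustrates the ordering rule.

```python
from dataclasses import dataclass

# Order of the four groups as listed on the slide.
GROUP_ORDER = {"opaque": 0, "skybox": 1, "transparent": 2, "overlay": 3}


@dataclass
class Draw:
    name: str
    group: str    # one of the keys in GROUP_ORDER
    depth: float  # distance from the camera (hypothetical units)


def sort_key(draw):
    if draw.group == "opaque":
        within = draw.depth    # front to back, so early Z can reject more
    elif draw.group == "transparent":
        within = -draw.depth   # back to front, so blending stays correct
    else:
        within = 0.0           # skybox and overlay keep their given order
    return (GROUP_ORDER[draw.group], within)


def render_order(draws):
    return [d.name for d in sorted(draws, key=sort_key)]


draws = [
    Draw("hud", "overlay", 0.0),
    Draw("far_glass", "transparent", 40.0),
    Draw("sky", "skybox", 1000.0),
    Draw("far_island", "opaque", 80.0),
    Draw("near_glass", "transparent", 5.0),
    Draw("character", "opaque", 3.0),
]
print(render_order(draws))
```

Drawing opaque geometry nearest first is what lets the early Z-test discard the fragments hidden behind it.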
How to assign objects to sorting layers?
• Per Shader
– Have to duplicate shader files. Hard to maintain because we have to make changes individually to each duplicate.
• Per Mesh
– Not scalable, requires a lot of work.
– Risky! May break batches by mistake.
• Per Material
– YES!
– In that case, do not use the same material in different scenes
• While you fix the sort order for one, you might break it for the other.
Custom Material Inspector
• Created an editor script, BRSMaterialEditor, to set Material.renderQueue
• Add CustomEditor “BRSMaterialEditor” to the end of the shader file.
Character and Props
Camera Island Top
Outer Islands
Skydome
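A minimal sketch of the per-material idea follows. It assumes the four layers named on the slide are listed nearest first, and it uses Unity's built-in Geometry queue base of 2000, where lower values draw earlier. The one-step offsets are hypothetical; the talk does not state the values the team used.

```python
# Unity's built-in base value for the opaque Geometry queue.
GEOMETRY_QUEUE = 2000

# Layer names from the slide, assumed here to run from nearest to farthest.
LAYERS_NEAR_TO_FAR = [
    "Character and Props",
    "Camera Island Top",
    "Outer Islands",
    "Skydome",
]


def assign_render_queues(layers, base=GEOMETRY_QUEUE):
    """Give each layer its own queue value so nearer layers draw first."""
    # Hypothetical scheme: one queue step per layer.
    return {layer: base + offset for offset, layer in enumerate(layers)}


for layer, queue in assign_render_queues(LAYERS_NEAR_TO_FAR).items():
    print(f"{queue}  {layer}")
```

In the workflow described above, each resulting number would be stored on the material through Material.renderQueue.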
Before and After Improving Sort
Reduced from 2.51 to 1.91
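As a quick check of what this change means, the sketch below computes the relative reduction. It assumes overdraw is reported as fragments shaded divided by screen pixels; the helper names are illustrative.

```python
def overdraw(fragments_shaded, screen_pixels):
    """Average number of times each screen pixel is shaded (assumed definition)."""
    return fragments_shaded / screen_pixels


def relative_reduction(before, after):
    return (before - after) / before


# Values from the slide: overdraw before and after fixing the sort order.
print(f"{relative_reduction(2.51, 1.91):.1%}")  # 23.9%
```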
Z-Reject
FRAGMENT SHADER
Shader hotzone (% time shading)
Shader hotzone (ALU per frag)
How to optimize fragment shader
• Improving Shader Instructions
– Model: ops that can be done once per draw call
• Use scripts to compute and pass values to the shader
• Input vector normalization (e.g. rim light)
• Scroll offset
– Vertex: ops that can be done per vertex
• Uniform texture tile & offset
– Fragment: ops that need to be done per pixel
• Equation simplification
• Half & fixed precision for better thermals
• saturate vs max(0.0, dot)
[Figure: COMPLEXITY across Fragment, Vertex and Model]
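The "saturate vs max(0.0, dot)" bullet relies on the two giving the same result when both vectors are normalized, because the dot product then cannot exceed 1. The sketch below checks that equivalence in plain Python. The helper names are illustrative, and whether saturate is actually cheaper than max is an assumption that depends on the GPU and should be confirmed in a profiler.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def saturate(x):
    """Clamp to [0, 1], mirroring the shader intrinsic of the same name."""
    return min(max(x, 0.0), 1.0)


normal = normalize((0.0, 1.0, 0.0))
lights = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, -1.0, 0.0)]

# Diffuse term computed both ways for each light direction.
results = []
for light in lights:
    n_dot_l = dot(normal, normalize(light))
    results.append((saturate(n_dot_l), max(0.0, n_dot_l)))

for clamped, floored in results:
    print(f"saturate={clamped:.4f}  max={floored:.4f}")
```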
Optimizing Shaders
• Many custom shaders done in ShaderForge
– ShaderForge does heavy work on fragment
• Many variants and not exactly the same code structure
• How to optimize them all?
– 1st pass optimizing in ShaderForge
– 2nd pass optimizing in Code
1st Pass: ShaderForge
• Identify core changes to lighting model
– BlinnPhongWrapped
– BlinnPhongRamp
• Created custom code node
– An artist helped with the process of switching shaders over to this code
– This made shader code common and more organized
1st Pass: ShaderForge
Custom Lightmap in ShaderForge
• One major art complaint was the lack of support for lightmaps in custom lighting
• Created a Lightmap node for them
• Problem 1: Need to enable lightmap in config shader header.
• Problem 2: ShaderForge does not expose interpolated data.
2nd Pass: Shader Code
• Created a cginc file with macros for optimized code
• ShaderForge follows a naming convention for input data
The results - Ground Shader
[Figures: profiler captures before and after optimization]
• Avg ALU/Frag – ~21% reduction
• Fragments Shaded – ~45% reduction
• Fragment Instructions – ~64% reduction
Overall Improvement: ~7 ms
Further Improvements
• Fallback Shader
– We came across some problems with shaders not being supported on some configurations
– Vertex animation with a noise texture (tex2Dlod) is not supported on OpenGL ES 2.0 profiles
– Use a fallback shader that stands out in those cases
– Makes it easy to differentiate from other errors
ASTC TEXTURE COMPRESSION
ASTC
• Optimal performance with high quality
• Improves bandwidth and power consumption
• Galaxy Note 4, Galaxy S6 and above support it
• Supported with the OpenGL ES 3 profile in Unity
ASTC
ASTC 4x4 ASTC 6x6 ETC 2
ASTC
Recommended Settings:
Format                 RGB        RGBA       Normal Map
Codec                  ASTC 6x6   ASTC 4x4   ASTC 4x4
BPP                    3.56       8          8
Size vs Uncompressed   14.8%      50%        50%
Size vs ETC2           89%        100%       100%
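The BPP row follows from the ASTC block layout: every block occupies 128 bits, whatever its footprint. The sketch below recomputes some of the table's figures, assuming 24 bpp for uncompressed RGB, 4 bpp for ETC2 RGB, and 8 bpp for ETC2 RGBA. The 50% entries are not recomputed because the slide does not state the uncompressed baseline used for those columns.

```python
ASTC_BLOCK_BITS = 128  # every ASTC block is 128 bits, whatever its footprint


def astc_bpp(block_w, block_h):
    """Bits per pixel for an ASTC block footprint of block_w x block_h texels."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


rgb_bpp = astc_bpp(6, 6)   # codec recommended for RGB
rgba_bpp = astc_bpp(4, 4)  # codec recommended for RGBA and normal maps

print(round(rgb_bpp, 2), rgba_bpp)  # 3.56 8.0
print(f"{rgb_bpp / 24:.1%}")        # 14.8% of uncompressed 24 bpp RGB
print(f"{rgb_bpp / 4:.0%}")         # 89% of ETC2 RGB at 4 bpp
print(f"{rgba_bpp / 8:.0%}")        # 100% of ETC2 RGBA at 8 bpp
```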
Review
• Do not make assumptions, use a profiler.
• GPU profilers will give you in-depth data per draw call.
• You can assign objects to sorting layers at the material level for the best workflow.
• Reduce the amount of work needed to optimize shaders by creating ways to reuse optimized code.
• ASTC texture compression is the best option available for quality, but it is supported on only a few devices.
Phil Lira
f.lira@samsung.com
@phi_lira
Q&A
CONTACTS
www.blackriverstudios.net
@BlackRvrStudios
/blackrivergames
THANKS!

How we optimized our Game - Jake & Tess' Finding Monsters Adventure

  • 1. How we optimized our Game – Jake & Tess’ Finding Monsters Adventure Phil Lira Sr. Staff Engineer (Graphics) @phi_lira
  • 4.
  • 5. Technical challenges Many custom shaders and effects
  • 6. Technical challenges Many custom shaders and effects
  • 8. Our budget is the limit • Push as much content as possible with smooth gameplay and no overheat – Can we get the same quality with a similar approach? – Are we doing something we don’t need to?
  • 9. What if we hit our budge • What happens when we fail? – Either gameplay or visual quality will be impacted • When it comes to remove effects, trust is important
  • 11. Optimization Process • Do not make any assumptions. • A profiler will tell you where the bottleneck is. Profile Optimize Test
  • 12. Optimization Process • Rewrite code to use resources more efficiently • Often we can fake or simplify effects • Experience comes into play here. OptimizeProfile Test
  • 13. Optimization Process • Guarantee your tests have same conditions • Did you work reduced overall gpu ms? TestProfile Optimize
  • 14. How to find our bottleneck? • Unity comes with a built-in profiler that does most of the work • We wanted to have more detailed GPU info – Adreno Profiler – Snapdragon GPUs – Mali Graphics Debugger (MGD) and DS-5 Streamline – Mali GPUs
  • 16. How to find our bottleneck? Disable GL Frame rate increased? No Yes CPU Bound GPU Bound Vertex Frag Memory
  • 17. How to find our bottleneck? • Vertex – #triangles – Vertex shader – Per-vertex lighting • Fragment – Fragment Shader (instruc. / sample) – Blend Ops – Per-Pixel light (forward rendering) • Bandwidth – Large textures – Dependent Texture Reads – Block Resolve (ReadPixels)
  • 18. CASE STUDY – ROYAL MOON
  • 19. Case Study – Royale Moon • Triangles 106k • Drawcalls 87 • Overdraw 2.51x • Shader Stats: – Up to 160 ALU/Frag – Up to 7 texture samples • Adreno %Time Shading Fragment - max – Fragment bound
  • 21. Case Study – Royale Moon • Early Z-Test Discards occluded fragments • Render Order Matters • Optimized Render Order – Opaques – Front to Back – Skybox – Transparent – Back to Front – Overlay (UI / HUD) We need to improve this
  • 22. How to assign objects to sorting layers? • Per Shader – Have to duplicate shader files. Hard to maintain because we have to make changes individually to each duplicate. • Per Mesh – Not scalable, requires a lot of work. – Risky! May break batches by mistake. • Per Material – YES! – In that case do not use the same material for different scenes • Fixing the sort for one scene might break it for the other.
  • 23. Custom Material Inspector • Created an editor script BRSMaterialEditor to set Material.renderQueue • Add CustomEditor "BRSMaterialEditor" to the end of the shader file.
  • 29. Before and After Improving Sort Reduced from 2.51 to 1.91
  • 32. Shader hotzone (% time shading)
  • 33. Shader hotzone (ALU per frag)
  • 34. How to optimize fragment shader • Improving Shader Instructions – Model: ops that can be done once per drawcall • Use scripts to compute and pass values to shader • Input Vector Normalization (ex. Rim Light) • Scroll Offset – Vertex: ops that can be done per vertex • Uniform texture tile & offset – Fragment: ops that need to be done per pixel • Equation simplification • Half & Fixed precision for better thermal • Saturate vs max(0.0, dot) [Diagram: Fragment / Vertex / Model, ordered by complexity]
  • 35. Optimizing Shaders • Many custom shaders done in ShaderForge – ShaderForge does heavy work on fragment • Many variants and not exactly the same code structure • How to optimize them all? – 1st pass optimizing in ShaderForge – 2nd pass optimizing in Code
  • 36. 1st Pass: ShaderForge • Identify core changes to lighting model – BlinnPhongWrapped – BlinnPhongRamp • Created custom code node – Artist helped with the process to replace for this code – This made shader code common and more organized
  • 38. Custom Lightmap in ShaderForge • One major art complaint was the lack of support for lightmaps in custom lighting • Created a Lightmap node for them • Problem 1: Need to enable lightmap in the shader config header. • Problem 2: ShaderForge does not expose interpolated data.
  • 39. 2nd Pass: Shader Code • Created a cginc file with macros for optimized code • ShaderForge follows a naming convention for input data
  • 40. The results - Ground Shader (before vs. after optimization) • Avg ALU/Frag – ~21% reduction • Fragments Shaded – ~45% reduction • Fragment Instructions – ~64% reduction • Overall Improvement: ~7ms
  • 41. Further Improvements • Fallback Shader – We came across some problems with shaders not being supported for some configurations – Vertex Animation with a noise texture (tex2Dlod) is not supported on OpenGL ES 2.0 profiles – Fallback shader made to stand out in those cases – Makes it easy to differentiate from other errors
  • 43. ASTC • Optimal performance with high quality • Improves bandwidth and power consumption • Galaxy Note 4, Galaxy S6 and above support it • Supported with OpenGL 3 Unity profile
  • 44. ASTC (image comparison: ASTC 4x4, ASTC 6x6, ETC2)
  • 45. ASTC, recommended settings (Format: Codec, BPP, Size vs Uncompressed, Size vs ETC2) • RGB: ASTC 6x6, 3.56, 14.8%, 89% • RGBA: ASTC 4x4, 8, 50%, 100% • Normal Map: ASTC 4x4, 8, 50%, 100%
  • 46. Review • Do not make assumptions, use a profiler. • GPU profilers will give you in-depth data per drawcall • One can assign objects to sorting layers at material level for best workflow • Reduce the amount of work needed to optimize shaders by creating means to reuse optimized code. • ASTC texture compression is the best option available for quality but is only supported on a few devices.
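
The flowchart on slide 16 can be read as a small decision procedure. The Python sketch below is an illustration only, not the studio's tooling: the function name, the `min_gain` threshold for "frame rate increased", and the dictionary of stage percentages are assumptions made for the example. In practice the numbers would come from the profiler's GL override and its per-stage counters.

```python
def classify_bottleneck(fps_normal, fps_gl_disabled, stage_time_pct, min_gain=1.2):
    """Return "cpu", or the busiest GPU stage ("vertex", "fragment", "memory").

    fps_normal      -- frame rate with rendering enabled
    fps_gl_disabled -- frame rate with all GL calls disabled via the profiler
    stage_time_pct  -- hypothetical counters, e.g. {"vertex": 15.0, ...}
    min_gain        -- assumed threshold for "frame rate greatly increased"
    """
    if fps_gl_disabled < fps_normal * min_gain:
        # Removing all GPU work barely helped, so the CPU is the limit.
        return "cpu"
    # GPU bound: report the stage that takes the largest share of time.
    return max(stage_time_pct, key=stage_time_pct.get)


if __name__ == "__main__":
    counters = {"vertex": 15.0, "fragment": 70.0, "memory": 15.0}
    print(classify_bottleneck(30.0, 31.0, counters))
    print(classify_bottleneck(30.0, 60.0, counters))
```

A single threshold is a simplification; a frame can be limited by more than one stage, which is why the talk re-profiles after every change.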
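
Some of the figures in the slide 45 table can be rederived from block sizes. The sketch below assumes the standard properties of the formats rather than anything stated in the talk: every ASTC block occupies 128 bits whatever its footprint, uncompressed RGB is 24 bits per pixel, ETC2 RGB is 4 bits per pixel, and ETC2 RGBA is 8 bits per pixel. The helper names are made up for the example.

```python
ASTC_BLOCK_BITS = 128  # an ASTC block is always 128 bits, for any footprint


def astc_bpp(block_w, block_h):
    """Bits per pixel of an ASTC texture with a block_w x block_h footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


def size_ratio(bpp, reference_bpp):
    """Size of a texture relative to a reference encoding of the same image."""
    return bpp / reference_bpp


if __name__ == "__main__":
    rgb = astc_bpp(6, 6)    # RGB row of the table
    rgba = astc_bpp(4, 4)   # RGBA and Normal Map rows
    print(f"ASTC 6x6: {rgb:.2f} bpp")
    print(f"  vs uncompressed RGB (24 bpp): {size_ratio(rgb, 24):.1%}")
    print(f"  vs ETC2 RGB (4 bpp): {size_ratio(rgb, 4):.0%}")
    print(f"ASTC 4x4: {rgba:.0f} bpp")
    print(f"  vs ETC2 RGBA (8 bpp): {size_ratio(rgba, 8):.0%}")
```

Larger footprints spread the same 128 bits over more pixels, which is the quality versus size knob described in the notes.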

Editor's notes

  1. We will play the Finding Monsters release trailer here.
  2. Optimization allows us to push more content at higher frame rates. We want to push as much content as possible without impacting gameplay. On mobile we are also concerned with overheating and battery life. Optimizing for thermals gives players more gameplay time. While optimizing, we frequently ask ourselves the following: * Can we get the same quality with a similar effect? For instance, if you want to take a screenshot, a RenderTexture is faster in most cases than doing a ReadPixels. Or sometimes we can make some simplifications in the shader to achieve a similar effect. However, there’s no free lunch. I often go to the technical artists and say: “Hey, we can achieve a very similar effect but we’ll have to change some material properties and/or maps.” * Are we doing something we don’t need to? For instance, creating and destroying game objects when you could be pre-allocating and caching them.
  3. If we fail at further optimizing our game and still consume more resources than the GPU can offer, either gameplay (lower frame rates) or visual quality will be impacted. Usually we favor smooth gameplay over visual quality and end up removing effects. When it comes to that, trust plays an important role. At Blackriver Studios we built a team upon trust. We look out for each other. I know the effort, dedication and passion the art team puts into our games and I do my best to optimize them. When it comes to the point where I say we must make some adjustments that will impact visuals, they know that we really must.
  4. We need to find what is responsible for consuming those precious ms of your game. Engineers often tend to make assumptions about what might be slowing down the game, and of course those assumptions get better with experience. However, one golden rule of optimization is to never assume anything. Sometimes the culprit is something that looks fairly simple, like a blob shadow for instance. Use a profiler to tell where your bottleneck is. If you’re optimizing something that’s not your bottleneck then you’re wasting time.
  5. Once you find your bottleneck, it’s time to get hands-on and optimize the hot zone. Experience plays an important role here and will give you a hint of what to do.
  6. Finally we want to test whether we actually made an improvement. It is very important that the test scenario has exactly the same conditions as the scenario we profiled, or you might get wrong results. That can be a little tricky sometimes. In the end, we see how many ms we saved and repeat it all over again.
  7. How to find the bottleneck? Profilers will timestamp your game to tell what the hot zones are. Unity comes with a built-in profiler that can do most of the work. However, we want to have more detailed info on what’s going on in the GPU. We used GPU profilers for that. They come with specific counters that can easily tell you the graphics pipeline hot zones and even allow you to replace a few resources while running to speed up your tests. * Adreno Profiler is an all-in-one solution to profile Qualcomm’s Snapdragon GPUs. * Mali Graphics Debugger and DS-5 Streamline are tools provided by ARM to debug and profile Mali GPUs.
  8. Throughout this talk we will show how we profiled our game using Adreno GPU Profiler.
  9. Our optimization workflow goes like this: We fire up Adreno Profiler. There’s an override to disable all OpenGL calls submitted to the GPU. Disable OpenGL calls -> Does it greatly improve fps? * No -> We’re CPU bound -> Go to the Unity profiler. (You might see render data there; in that case you have too much driver overhead.) * Yes -> We’re GPU bound -> Adreno also has many counters to tell which stage of the pipeline is stalled: % time vertex (draw calls and triangles, index vs triangle ratio, vertex shader), % time fragment (frag shader instructions, blend, overdraw, texture sampling & filtering), and memory stalls (blocking resolves, texture bandwidth).
  10. One can break down the graphics pipeline into 3 macro stages: Vertex, Fragment, and Bandwidth. * Vertex bound: Improve index locality for better cache use. (Unity does this for you if you toggle Optimize Mesh in the import settings.) Use as few vertex attributes as possible (normals, color, tangent, etc.); each additional attribute might split your vertices. Decrease the number of triangles sent to the GPU by performing frustum & occlusion culling and by using mesh LOD and impostors to render distant meshes. Simplify vertex shaders: per-vertex lights, GPU skinning, vertex offset. * Fragment bound: Simplify the fragment shader: number of instructions and texture samples, dependent texture reads, blending. Decrease the number of per-pixel lights (forward rendering). * Bandwidth: Use compression and mipmaps. Avoid operations that stall the GPU (blocking resolves), ReadPixels for instance.
  11. Royal Moon is one of the stages in our game. We’ll show a few techniques we used to optimize it.
  12. This is the breakdown of our scene. We’re clearly Fragment Bound.
  13. Here’s an overdraw debug view captured with Adreno Profiler. Brighter pixels are hot zones and show how many times they have been written. For opaque meshes, every time we redraw a pixel we’re wasting time. We need to sort our scene for optimal performance. From this image we can see that we could do fewer fragment operations by reducing the overdraw ratio.
  14. By the time a fragment reaches the frag shader, the GPU already knows its depth or z value. The GPU can then test whether the current fragment is already occluded by a previously computed one and discard it, as it will not have any effect on the final image. That is called the depth test. Thus, the order in which you render your objects matters for performance, as you can try to maximize the number of fragments that get discarded. The best way to render your scene is: * Render opaque objects from front to back. * Render the skybox. * Render transparent objects from back to front. The reason is that, in order to blend correctly, alpha objects must have the values of the pixels behind them already computed. * Render overlays (HUD/UI). In order to improve overdraw we need to improve the render order of our opaque objects. Unity already does this for you based on the game object pivot. However, there are some special cases where that doesn’t work (as we can see in our image). What we can do is group objects into different sorting layers to improve those specific cases.
  15. We have a few options when it comes to assigning objects to different sorting layers. * Per-Shader: In Unity you can set a RenderQueue in the ShaderLab file. The problem is that you’ll have to duplicate a shader file just to assign it to a different sorting layer. That increases shader compilation and warm-up time and is not easy to manage. Plus, when a change is required in the shader we’ll have to propagate it to all variants manually. * Per-Mesh: This is not scalable and requires a lot of work tweaking per-mesh settings. It is also risky, as assigning objects to different layers will break batches and one might do it by mistake. * Per-Material: Seems a balanced approach. It’s easy to group and create materials. One can create per-scene materials to make sure the work done in one scene doesn’t affect another.
  16. Unity allows you to extend the material inspector by creating a custom MaterialEditor script. We created one, called BRSMaterialEditor, that exposes the render order and layer so they are easy to tweak. To use it, one just needs to add the following line to the end of the ShaderLab file: CustomEditor "BRSMaterialEditor"
  17. We ended up having five opaque render layers for this scene: 1) Characters and props 2) The island top that the camera is on 3) Outer islands 4) Planets 5) Skydome
  18. This is a comparison of before and after improving the sort. You can see that characters now have much better overdraw and the bottom islands don’t appear anymore. Note: the ground appears darker in the first image because I captured the frame without rendering shadows by mistake (shadows add an additional render pass for the ground).
  19. This is a highlight of depth test discards. You can think of this image as a negative of the previous one, where the more red the better. You can see that characters and planets now have many more discards too.
  20. Adreno Profiler provides a nice and fast way to see your fragment shader hot zone. You can query per-draw-call stats like Fragment Instructions, Textures / Fragment and Math Ops / Frag, and Adreno will colorize each one of them. This picture sorts draw calls by the percentage of time spent shading fragments, which is our bottleneck. This counter takes into account the number of fragments shaded times the complexity of shading each fragment. The picture shows that the ground is the draw call that spends the most time shading fragments, so it is a good candidate for improvement.
  21. Here’s another interesting shader we can look at: the characters. They have the most ALU/frag and textures/frag in the scene. So why isn’t this the draw call that spends the most time shading fragments? Simply because its fragments shaded are about 1/3 of the ground’s. Remember that the ground was rendered before the characters before we optimized for overdraw.
  22. One thing to keep in mind when optimizing fragment shaders is to do as few operations as possible in them. If there’s something we can do at the vertex or even at the model level, that is best. For instance, one of our monsters has a point light inside of him. This light flickers by adjusting its intensity using a Fourier sum of sines. We don’t need to compute this light intensity per fragment nor per vertex. We do it at the script level and then pass the light intensity as a uniform to the shader. Another example: if we know that all of our textures that use uv0 have the same tile & offset, we can apply it in the vertex shader instead of the fragment shader. This not only saves a few instructions in the fragment shader but also makes it easier for the GPU to sample the textures. At the fragment level we can also do some micro-optimizations.
  23. One of the challenges we faced when optimizing the shaders of this game was the fact that most shaders were authored by tech artists using a visual node tool called ShaderForge. Although ShaderForge is a nice tool to create and prototype shaders, it does heavy work in the fragment shader, which is far from ideal in our case. Also, because the shaders were written with a visual node tool, there are frequently tons of mini variations that don’t produce the same code. At this point we came to the question of how to optimize all these shaders. We did it in 2 passes: a first pass in ShaderForge, and a second pass in the shader code to optimize for things ShaderForge doesn’t account for.
  24. In the first pass we identified all the core lighting model functions. Most of the shaders were using variations of BlinnPhong with diffuse wrap and ramps. There were some other variations outside the core of the lighting, like rim lights and custom fog. ShaderForge allows one to create code nodes. So we created a few code nodes to implement these core lighting functions uniformly and, with the help of the artists, replicated them in the shaders we had. We also came up with a solution to save/load these code nodes. If we ever needed a change to these core code nodes, we could just change them and replicate the change to other shaders, as opposed to making changes individually to each one. This made the shader code more uniform and easier to work on later.
  25. This is an example of a code node we made. As you can see, ambient color is not applied in it because not all of our shaders use it.
  26. While optimizing the shaders, one major complaint from art was the lack of support for lightmaps in custom lighting shaders in ShaderForge. I sat down with our artists and we discussed how they wanted it to be implemented. We came up with a solution using a code node that worked with minimal changes required. We found out the following: Although ShaderForge doesn’t support lightmaps in custom lighting, one can open the shader file and change the variable lmpd:False to lmpd:True. ShaderForge does not rewrite the shader header when the shader gets compiled, so we only needed to do this once per new shader. Another problem we found was that we had no means to get the fragment shader’s interpolated input data. We had to pass it as input with a Node property and had to reapply tile/offset to the lightmap uv in the fragment shader. Later, when we optimized in shader code, we moved this to the vertex shader.
  27. In a second pass we optimized the shader in code. First we created a cginc file to hold all of our functions and macros. Because the code is uniform, i.e., all vertex and fragment shaders follow the same naming conventions and the core functions all have the same parameter names, we can easily replace ShaderForge-generated code with our optimized code by substituting macros. In the picture you can see that the macros and functions we made keep the shader code really clean and lean. Plus, if we come up with an improvement it will be replicated to all shaders. You can also see the custom material editor and our error fallback shader, which I will discuss later in this presentation.
  28. Here’s a rough comparison of the results we got for the ground shader. We went from 90.50 ALU/Frag down to 71. The fragments shaded were reduced by 45%, accounting for a total reduction of 64% in fragment instructions. Considering all shaders optimized for this scene, the total improvement was roughly 7ms.
  29. As a further improvement to the shaders we came up with a fallback error shader. We ran into a few errors in the shaders. Some of them were related to features not supported in OpenGL ES 2.0 profiles, like doing a vertex offset by sampling a texture in the vertex shader (with tex2Dlod). In those cases, the shader was falling back to plain diffuse, which was hard to spot right away. We then created a fallback error shader that clearly stands out when a shader is not supported in our current configuration. That also makes it easy to tell apart from other shader problems.
  30. Texture compression is important to improve bandwidth and power consumption. Block-based texture compression formats like ASTC, ETCn, DXTn, ATC and PVRTC are straightforward for GPUs, which don’t need to decompress the whole texture in order to read it. However, these algorithms lose information when compressing the texture (this is called lossy compression). ASTC is a texture compression format developed by ARM that offers the speed of block texture compression without losing much of the texture quality. At the moment, Samsung’s Galaxy Note 4, Galaxy S6 and above support it. It is supported in the Unity OpenGL 3 profile and on Android Lollipop devices.
  31. Unlike other texture compression formats, ASTC supports different block compression configurations, allowing one to tweak the tradeoff between performance and quality. Here’s a comparison of ASTC 4x4, ASTC 6x6 and ETC2. It is important to note that even though ASTC 6x6 has a larger block size than ETC2, the quality of the compression is still much better.
  32. This table shows the texture configuration we use for our most common assets. It’s also interesting to note that ASTC 4x4 has the same size as ETC2 but with greater quality.
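
Notes 13 and 14 argue that submission order changes how many fragments get shaded. The Python sketch below is a toy model of that argument, not a description of how Unity or the GPU implements it: objects are reduced to a kind and a camera distance, the depth buffer is a dictionary, and the depth test is assumed to run before shading so that a rejected fragment costs nothing. All names are made up for the example.

```python
PHASE = {"opaque": 0, "skybox": 1, "transparent": 2, "overlay": 3}


def render_order(objects):
    """Sort (name, kind, distance) tuples into the order listed in note 14."""
    def key(obj):
        _, kind, distance = obj
        # Transparent objects are drawn back to front: farthest first.
        within_phase = -distance if kind == "transparent" else distance
        return (PHASE[kind], within_phase)
    return sorted(objects, key=key)


def shaded_fragments(opaque_draws):
    """Count fragments that pass a less-than depth test, in submission order.

    opaque_draws -- list of (depth, pixels); pixels is an iterable of pixel ids.
    """
    depth_buffer = {}
    shaded = 0
    for depth, pixels in opaque_draws:
        for pixel in pixels:
            if depth < depth_buffer.get(pixel, float("inf")):
                depth_buffer[pixel] = depth
                shaded += 1
    return shaded


if __name__ == "__main__":
    screen = range(4)  # a 4-pixel "screen" fully covered by three quads
    back_to_front = [(9.0, screen), (5.0, screen), (1.0, screen)]
    front_to_back = list(reversed(back_to_front))
    for label, draws in (("back to front", back_to_front),
                         ("front to back", front_to_back)):
        count = shaded_fragments(draws)
        print(f"{label}: {count} fragments, overdraw {count / len(screen):.2f}x")
```

In this model the worst order shades every covered pixel three times, while front to back shades each pixel once, which is the effect the overdraw ratio on the slides measures.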
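
Notes 15 to 17 describe assigning a render queue per material so that the five opaque layers draw in a fixed order. The actual BRSMaterialEditor is a Unity editor script and is not reproduced in the talk, so the Python sketch below only models the bookkeeping. It relies on two facts about Unity (opaque geometry defaults to queue 2000, and lower queue values render first); the layer identifiers, the spacing of one queue step per layer, and the material names are assumptions for illustration.

```python
GEOMETRY_QUEUE = 2000  # Unity's default render queue for opaque geometry

# Layer order from note 17, nearest to farthest; identifiers are hypothetical.
OPAQUE_LAYERS = [
    "characters_and_props",
    "island_top",
    "outer_islands",
    "planets",
    "skydome",
]


def render_queue_for(layer):
    """Queue value a material in this layer would get (assumed +1 per layer)."""
    return GEOMETRY_QUEUE + OPAQUE_LAYERS.index(layer)


def draw_order(material_layers):
    """Material names sorted by the queue of the layer they are assigned to."""
    return sorted(material_layers,
                  key=lambda name: render_queue_for(material_layers[name]))


if __name__ == "__main__":
    materials = {  # hypothetical per-scene materials
        "sky_mat": "skydome",
        "hero_mat": "characters_and_props",
        "planet_mat": "planets",
    }
    for name in draw_order(materials):
        print(name, render_queue_for(materials[name]))
```

Keeping the mapping per material, and per scene, matches the reasoning in note 15: tuning one scene's order does not disturb another scene that would otherwise share the material.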
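
Note 22 mentions a light whose intensity flickers as a sum of sines, computed once in a script and passed to the shader as a uniform. The talk does not give the actual terms, so the amplitudes, frequencies, and phases below are invented for the example, as are the function and parameter names. The point of the sketch is only that the value depends on time alone, so one evaluation per frame is enough.

```python
import math

# (amplitude, frequency in Hz, phase in radians); values are illustrative only.
DEFAULT_HARMONICS = ((0.30, 2.0, 0.0), (0.15, 5.0, 1.3), (0.05, 11.0, 0.7))


def flicker_intensity(t, base=1.0, harmonics=DEFAULT_HARMONICS):
    """Light intensity at time t (seconds) as a base level plus a sum of sines."""
    wobble = sum(a * math.sin(2.0 * math.pi * f * t + p) for a, f, p in harmonics)
    return max(0.0, base + wobble)  # intensity is clamped at zero


if __name__ == "__main__":
    frame_time = 1.0 / 30.0
    for frame in range(5):
        # One evaluation per frame; the result would be uploaded as a uniform
        # instead of being recomputed for every vertex or fragment.
        print(frame, round(flicker_intensity(frame * frame_time), 4))
```

Since every fragment of the monster would compute the same number in a given frame, moving the sum to the CPU removes those instructions from the fragment shader without changing the result.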