Presentation I gave at Unite Boston 2015. I'll cover a few techniques we used to optimize our Unity mobile game - Jake & Tess' Finding Monsters Adventure
8. Our budget is the limit
• Push as much content as possible with smooth gameplay and no overheating
– Can we get the same quality with a similar approach?
– Are we doing something we don’t need to?
9. What if we hit our budget?
• What happens when we fail?
– Either gameplay or visual quality will be impacted
• When it comes to removing effects, trust is important
11. Optimization Process
• Do not make any assumptions.
• A profiler will tell you where the bottleneck is.
Profile -> Optimize -> Test
12. Optimization Process
• Rewrite code to use resources more efficiently
• Often we can fake or simplify effects
• Experience comes into play here.
Profile -> Optimize -> Test
14. How to find our bottleneck?
• Unity comes with a built-in profiler that does most of the work
• We wanted to have more detailed GPU info
– Adreno Profiler – Snapdragon GPUs
– Mali Graphics Debugger (MGD) and DS-5 Streamline – Mali GPUs
19. Case Study – Royale Moon
• Triangles 106k
• Drawcalls 87
• Overdraw 2.51x
• Shader Stats:
– Up to 160 ALU/Frag
– Up to 7 texture samples
• Adreno %Time Shading Fragment - max
– Fragment bound
21. Case Study – Royale Moon
• Early Z-Test Discards occluded fragments
• Render Order Matters
• Optimized Render Order
– Opaques – Front to Back
– Skybox
– Transparent – Back to Front
– Overlay (UI / HUD)
We need to improve this
22. How to assign objects to sorting layers?
• Per Shader
– Have to duplicate shader files. Hard to maintain because we have to make changes individually to each duplicate.
• Per Mesh
– Not scalable, requires a lot of work.
– Risky! May break batches by mistake.
• Per Material
– YES!
– In that case, do not use the same material across different scenes
• Fixing the sort for one scene might break it for another.
23. Custom Material Inspector
• Created an editor script, BRSMaterialEditor, to set Material.renderQueue
• Add CustomEditor “BRSMaterialEditor” to the end of the shader file.
34. Improving Shader Instructions
– Model: ops that can be done once per drawcall
• Use scripts to compute and pass values to the shader
• Input vector normalization (e.g. rim light)
• Scroll offset
– Vertex: ops that can be done per vertex
• Uniform texture tile & offset
– Fragment: ops that need to be done per pixel
• Equation simplification
• Half & fixed precision for better thermal behavior
• saturate vs. max(0.0, dot)
[Diagram: how to optimize fragment shaders – complexity pyramid, Model -> Vertex -> Fragment, with complexity increasing toward Fragment]
35. Optimizing Shaders
• Many custom shaders done in ShaderForge
– ShaderForge does heavy work in the fragment shader
• Many variants and not exactly the same code structure
• How to optimize them all?
– 1st pass optimizing in ShaderForge
– 2nd pass optimizing in Code
36. 1st Pass: ShaderForge
• Identify core changes to lighting model
– BlinnPhongWrapped
– BlinnPhongRamp
• Created a custom code node
– Artists helped with the process of replacing the old nodes with this code
– This made shader code common and more organized
38. Custom Lightmap in ShaderForge
• One major art complaint was the lack of lightmap support in custom lighting
• Created a Lightmap node for them
• Problem 1: Need to enable lightmaps in the shader’s config header.
• Problem 2: ShaderForge does not expose interpolated data.
39. 2nd Pass: Shader Code
Created a cginc file with macros for optimized code
• ShaderForge follows a naming convention for input data
40. The results - Ground Shader
Before optimization / after optimization (comparison screenshots)
• Avg ALU/Frag – ~21% reduction
• Fragments Shaded – ~45% reduction
• Fragment Instructions – ~64% reduction
Overall improvement: ~7 ms
41. Further Improvements
• Fallback Shader
– We came across some problems with shaders not being supported on some configurations
– Vertex animation with a noise texture (tex2Dlod) is not supported on OpenGL ES 2.0 profiles
– A fallback shader makes those cases stand out
– Makes it easy to differentiate from other errors
43. ASTC
• Optimal performance with high quality
• Improves bandwidth and power consumption
• Galaxy Note 4, Galaxy S6 and above support it
• Supported with Unity’s OpenGL ES 3.0 profile
45. ASTC
Recommended settings:

                       RGB        RGBA       Normal Map
Codec                  ASTC 6x6   ASTC 4x4   ASTC 4x4
BPP                    3.56       8          8
Size vs Uncompressed   14.8%      50%        50%
Size vs ETC2           89%        100%       100%
46. Review
• Do not make assumptions; use a profiler.
• GPU profilers will give you in-depth data per drawcall.
• Assign objects to sorting layers at the material level for the best workflow.
• Reduce the amount of work needed to optimize shaders by creating means to reuse optimized code.
• ASTC texture compression is the best option available for quality, but it is only supported on a few devices.
We will play the Finding Monsters release trailer here.
Optimization allows us to push more content at higher framerates. We want to push as much content as possible without impacting gameplay.
On mobile we are also concerned with overheating and battery life. Optimizing for thermals gives players more gameplay time.
While optimizing, we frequently ask ourselves the following: * Can we get the same quality with a similar effect? For instance, if you want to take a screenshot, a RenderTexture is faster in most cases than doing a ReadPixels. Or sometimes we can make some simplifications in the shader to achieve a similar effect. However, there’s no free lunch. I often go to the technical artists and say: “Hey, we can achieve a very similar effect, but we’ll have to change some material properties and/or maps.”
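As a concrete sketch of the screenshot example above (class and field names are hypothetical, not from our game), the camera can be rendered into a RenderTexture that stays on the GPU, avoiding the blocking readback of Texture2D.ReadPixels:

```csharp
using UnityEngine;

// Hypothetical sketch: capture a frame by rendering the camera into a
// RenderTexture instead of stalling the GPU with Texture2D.ReadPixels.
// The result stays on the GPU unless you actually need the pixels on CPU.
public class FastScreenshot : MonoBehaviour
{
    public Camera sourceCamera;   // hypothetical field names
    public RenderTexture capture;

    void Awake()
    {
        capture = new RenderTexture(Screen.width, Screen.height, 24);
    }

    public void Capture()
    {
        sourceCamera.targetTexture = capture;
        sourceCamera.Render();               // render one frame into the RT
        sourceCamera.targetTexture = null;   // restore rendering to screen
        // 'capture' can now be used directly as a material texture.
    }
}
```
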
* Are we doing something we don’t need to? For instance, creating and destroying game objects when you could be pre-allocating and caching them.
If we fail at further optimizing our game and still consume more resources than the GPU can offer, either gameplay (lower framerates) or visual quality will be impacted. Usually we favor smooth gameplay over visual quality and end up removing effects.
When it comes to that, trust plays an important role. At Blackriver Studios we built a team upon trust. We look out for each other. I know the effort, dedication and passion the art team puts into our games, and I do my best to optimize them. When it comes to the point where I say we must make adjustments that will impact visuals, they know that we really must.
We need to find what is responsible for consuming those precious ms of your game. Engineers often tend to make assumptions about what might be slowing the game down, and of course those assumptions get better with experience. However, one golden rule of optimization is to never assume anything. Sometimes the culprit is something that looks fairly simple, like a blob shadow.
Use a profiler to tell where your bottleneck is. If you’re optimizing something that’s not your bottleneck then you’re wasting time.
Once you find your bottleneck, it’s time to actually get hands-on optimizing the hot zone. Experience plays an important role here and will give you a hint of what to do.
Finally, we want to test whether we actually made an improvement. It is very important to note that the test scenario has to have exactly the same conditions as the scenario we profiled, or you might sometimes get wrong results. That can be a little tricky.
At the end, we see how many ms we saved and repeat it all over again.
How to find the bottleneck? Profilers will timestamp your game to tell you what the hot zones are. Unity comes with a built-in profiler that can do most of the work. However, we wanted more detailed info on what’s going on in the GPU. We used GPU profilers for that.
They come with specific counters that can easily tell you the graphics pipeline hot zones, and they even allow you to replace a few resources at runtime to speed up your tests.
* Adreno Profiler is an all-in-one solution to profile Qualcomm’s Snapdragon GPUs.
* Mali Graphics Debugger and DS-5 Streamline are tools provided by ARM to debug and profile Mali GPUs.
Throughout this talk we will show how we profiled our game using the Adreno Profiler.
Our optimization workflow goes like this:
We fire up Adreno Profiler. There’s an override to disable all OpenGL calls submitted to the GPU.
Disable OpenGL calls -> Does it greatly improve fps?
No -> We’re CPU bound -> Go to the Unity profiler. (You might see render time there; in that case you have too much driver overhead.)
Yes -> We’re GPU bound -> Adreno also has many counters to tell which stage of the pipeline is stalled:
% time vertex (draw calls and triangles, index vs. triangle ratio, vertex shader)
% time fragment (fragment shader instructions, blending, overdraw, texture sampling & filtering)
memory stalls (blocking resolves, texture bandwidth)
One can breakdown the graphics pipeline into 3 macro stages: Vertex, Fragment, and Bandwidth.
Vertex Bound:
Improve index locality for better caching. (Unity does this for you if you toggle Optimize Mesh in the import settings.)
Use as few vertex attributes as possible (normals, color, tangent, etc.). Each additional attribute might split your vertices.
Decrease the number of triangles sent to the GPU by performing frustum & occlusion culling, and by using mesh LODs and impostors to render distant meshes.
Simplify Vertex Shaders:
Per-vertex lights.
GPU Skinning
Vertex Offset
Fragment Bound:
Simplify Fragment Shader
Amount of instructions and texture samples
Dependent Texture Reads
Blending
Decrease amount of per-pixel lights. (Forward Rendering)
Bandwidth
Use compression and mipmaps.
Avoid operations that stall the GPU (blocking resolves), ReadPixels for instance.
Royale Moon is one of the stages in our game. We’ll show a few techniques we used to optimize it.
This is the breakdown of our scene.
We’re clearly Fragment Bound.
Here’s an overdraw view captured with Adreno Profiler. Brighter pixels are hot zones and show how many times they have been written to. For opaque meshes, every time we redraw a pixel we’re wasting time. We need to sort our scene for optimal performance.
From this image we can see that we can do fewer fragment operations by reducing the overdraw ratio.
Whenever the GPU processes a fragment in the fragment shader, it already knows its depth, or z value. The GPU can then test whether the current fragment is occluded by a previously computed one and discard it, as it will not have any effect on the final image. That is called the depth test.
Thus, the order in which you render your objects matters for performance, as you can try to maximize the number of fragments that get discarded.
The best way to render your scene is:
Render Opaque objects from Front to Back.
Render Skybox
Render transparent objects from back to front. The reason is that, in order to blend correctly, alpha objects must have the value of the pixels behind them already computed.
Overlays (HUD/UI)
In order to reduce overdraw, we need to improve the render order of our opaque objects. Unity already does this for you based on the game object pivot. However, there are some special cases where that doesn’t work (as we can see in our image). What we can do is group objects into different sorting layers to handle those specific cases.
We have a few options when it comes to assigning objects to different sorting layers.
Per-Shader:
In Unity you can set a RenderQueue in the ShaderLab file. The problem is that you have to duplicate a shader file just to assign it to a different sorting layer. That increases shader compilation and warm-up time and is not easily managed. Plus, when a change is required in the shader, we have to propagate it to all the duplicates manually.
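For reference, the per-shader queue assignment looks roughly like this in ShaderLab (a generic sketch, not one of our shaders):

```shaderlab
Shader "Custom/OpaqueEarly" {
    SubShader {
        // The queue is baked into the shader file itself, which is why a
        // second sorting layer means duplicating the whole file.
        // Geometry = 2000; "Geometry-10" renders earlier among opaques.
        Tags { "Queue" = "Geometry-10" "RenderType" = "Opaque" }
        // ... passes ...
    }
}
```
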
Per-Mesh:
This is not scalable and requires a lot of work tweaking per-mesh settings. It is also risky, as assigning objects to different layers can break batches, and one might do it by mistake.
Per-Material
This seems a balanced approach. It’s easy to group and create materials. One can use per-scene materials to make sure the work done in one scene doesn’t affect others.
Unity allows you to extend the material inspector by creating a custom MaterialEditor script. We created one, BRSMaterialEditor, that exposes the render order and layer so they are easy to tweak.
To use it, one just needs to add the following line to the end of the ShaderLab file:
CustomEditor “BRSMaterialEditor”
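The talk does not show the script itself, but a minimal sketch of what such a material editor could look like (Material.renderQueue is the real Unity API; the rest is illustrative):

```csharp
using UnityEngine;
using UnityEditor;

// Sketch of an editor like BRSMaterialEditor (the real script is not shown
// in the talk). It extends the default material inspector with a
// renderQueue field, so sorting is tweaked per material instead of per
// shader or per mesh.
public class BRSMaterialEditor : MaterialEditor
{
    public override void OnInspectorGUI()
    {
        base.OnInspectorGUI();          // draw the regular material GUI
        if (!isVisible) return;

        Material mat = (Material)target;
        int queue = EditorGUILayout.IntField("Render Queue", mat.renderQueue);
        if (queue != mat.renderQueue)
        {
            mat.renderQueue = queue;    // e.g. 2000 = Geometry, 3000 = Transparent
            EditorUtility.SetDirty(mat);
        }
    }
}
```
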
We ended up having five opaque render layers for this scene:
1) Character and Props
2) The island top the camera is on.
3) Outer Islands
4) Planets
5) Skydome
This is a comparison of before and after improving the sorting.
You can see that the characters now have much less overdraw and the bottom islands no longer appear.
Note: the ground appears darker in the first image because I captured the frame without rendering shadows by mistake (shadows add an additional render pass over the ground).
This is a highlight of depth-test discards. You can think of this image as a negative of the previous one, where the more red, the better. You can see the characters and planets now have many more discards too.
Adreno Profiler provides a nice and fast way to see your fragment shader hot zones. You can query per-drawcall stats like Fragment Instructions, Textures/Frag and Math Ops/Frag, and Adreno will colorize each one of them.
This picture sorts drawcalls by the percentage of time spent shading fragments, which is our bottleneck. This counter accounts for the number of fragments shaded times the complexity of shading them.
It shows that the ground is the drawcall that spends the most time shading fragments, so it is a good candidate for improvement.
Here’s another interesting shader to look at: the characters. They have the most ALU/frag and textures/frag in the scene. So why isn’t this the drawcall that spends the most time shading fragments? Simply because its fragments shaded are about 1/3 of the ground’s.
Remember, the ground was rendered before the characters until we optimized for overdraw.
One good rule when optimizing fragment shaders is to do as few operations as possible in them. If there’s something we can do at the vertex or even at the model (per-drawcall) level, that is best. For instance, one of our monsters has an inner point light. This light flickers by adjusting the light intensity using a Fourier sum of sines. We don’t need to compute this light intensity per fragment nor per vertex; we do it at script level and pass the light intensity as a uniform to the shader.
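A sketch of that idea (class name, uniform name and coefficients are made up for illustration):

```csharp
using UnityEngine;

// Hypothetical sketch: the flicker intensity is a small Fourier-style sum
// of sines evaluated once per frame in a script, then uploaded as a single
// uniform, instead of being recomputed per fragment or per vertex.
public class FlickerLight : MonoBehaviour
{
    public Material monsterMaterial;

    void Update()
    {
        float t = Time.time;
        // Sum of sines; coefficients chosen purely for illustration.
        float intensity = 0.8f
            + 0.15f * Mathf.Sin(7.0f * t)
            + 0.05f * Mathf.Sin(13.0f * t + 1.3f);
        // One uniform update per drawcall, zero per-fragment cost.
        monsterMaterial.SetFloat("_LightIntensity", intensity);
    }
}
```
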
Another example: if we know all of our textures that use uv0 have the same tile & offset, we can apply it at the vertex instead of at the fragment. This will save us not only a few instructions in the fragment shader but will also be better for the GPU when sampling the textures.
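In Cg terms, the move looks roughly like this (a sketch; `_MainTex_ST` is Unity's standard per-texture tiling/offset vector, and the vertex-side math is what Unity's `TRANSFORM_TEX` macro expands to):

```cg
// Sketch: moving a uniform tile & offset from fragment to vertex.
sampler2D _MainTex;
float4 _MainTex_ST;               // Unity's per-texture tiling/offset

struct appdata { float4 vertex : POSITION; float2 texcoord : TEXCOORD0; };
struct v2f     { float4 pos : SV_POSITION; float2 uv : TEXCOORD0; };

v2f vert (appdata v)
{
    v2f o;
    o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
    // Apply tiling/offset once per vertex (same as TRANSFORM_TEX):
    o.uv = v.texcoord * _MainTex_ST.xy + _MainTex_ST.zw;
    return o;
}

fixed4 frag (v2f i) : SV_Target
{
    // The uv arrives pre-transformed: no per-fragment math, and the
    // sample is no longer a dependent texture read.
    return tex2D(_MainTex, i.uv);
}
```
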
At fragment level we can also do some micro-optimizations.
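Two examples of such micro-optimizations from the slide above, sketched (not our exact shader code):

```cg
// Illustrative fragment-level micro-optimizations.
sampler2D _MainTex;
half4 _LightColor0;

fixed4 frag (float2 uv : TEXCOORD0, half3 n : TEXCOORD1,
             half3 l : TEXCOORD2) : SV_Target
{
    // saturate() is usually a free modifier on GPUs, unlike max(0.0, x):
    half ndotl = saturate(dot(n, l));     // instead of max(0.0, dot(n, l))

    // Use half/fixed instead of float where the value range allows it;
    // lower-precision ALU draws less power, which helps thermals.
    fixed4 albedo = tex2D(_MainTex, uv);  // color data fits in fixed
    return fixed4(albedo.rgb * _LightColor0.rgb * ndotl, albedo.a);
}
```
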
One of the challenges we faced optimizing the shaders of this game was the fact that most shaders were authored by tech artists using a visual node tool called ShaderForge. Although ShaderForge is a nice tool to create and prototype shaders, it does heavy work in the fragment shader, which is far from ideal in our case.
Also, because the shaders were written in a visual node tool, there are frequently tons of mini-variations that don’t produce the same code.
At this point we came to the question of how to optimize all these shaders.
We did it in two passes: first a pass in ShaderForge, then a pass in the shader code to optimize for things ShaderForge doesn’t account for.
In the first pass we identified all the core lighting model functions. Most of the shaders were using variations of Blinn-Phong with diffuse wrap and ramps. There were some other variations not core to the lighting, like rim lights and custom fog.
ShaderForge allows one to create code nodes. We created a few code nodes implementing these core lighting functions uniformly and, with the help of the artists, replicated them across our shaders. We also came up with a solution to save/load these code nodes: if we ever needed a change to the core code, we could just change it once and replicate it to the other shaders, as opposed to making changes individually to each one.
This made shader code more uniform and easy to work on later.
This is an example of a code node we did. As you can see, ambient color is not applied in it, because not all of our shaders use it.
While optimizing the shaders, one major complaint from art was the lack of lightmap support for custom lighting shaders in ShaderForge. I sat down with our artists and we discussed how they wanted it implemented.
We came up with a code-node solution that required minimal changes. We found out the following:
Although ShaderForge doesn’t support lightmaps in custom lighting, one can open the shader file and change the variable lmpd:False to lmpd:True. ShaderForge does not rewrite the shader header when it recompiles, so we only needed to do this once per new shader.
Another problem we found was that there is no way to get the fragment shader’s interpolated input data. We had to pass it in as a node property and reapply the tile/offset to the lightmap uv in the fragment shader. Later, when we optimized the shader code, we moved this to the vertex shader.
In the second pass we optimized the shader code itself. First we created a cginc file holding all of our functions and macros.
Because the code is uniform, i.e. all vertex and fragment shaders follow the same naming conventions and the core functions all have the same parameter names, we can easily replace the ShaderForge-generated code with our optimized macros.
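Such a cginc could look along these lines (a sketch with hypothetical names; only `_LightColor0` is a real Unity uniform, and the wrapped-diffuse formula is one common variant):

```cg
// BRSOptimized.cginc (hypothetical name)
#ifndef BRS_OPTIMIZED_INCLUDED
#define BRS_OPTIMIZED_INCLUDED

half4 _LightColor0;    // Unity's main light color
half  _DiffuseWrap;    // hypothetical wrap amount uniform

// Optimized wrapped-diffuse Blinn-Phong term, shared by all shaders.
inline half3 BRSBlinnPhongWrapped(half3 normal, half3 lightDir)
{
    half ndotl = saturate((dot(normal, lightDir) + _DiffuseWrap)
                          / (1.0h + _DiffuseWrap));
    return _LightColor0.rgb * ndotl;
}

// Because ShaderForge emits the same input names everywhere, one macro
// can replace the generated lighting block in every shader.
#define BRS_LIGHTING(result, normalDir, lightDirection) \
    result = BRSBlinnPhongWrapped(normalDir, lightDirection)

#endif
```
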
In the picture you can see that the macros and functions we made keep the shader code really clean and lean. Plus, if we come up with an improvement, it is replicated to all shaders.
You can also see the custom material editor and our error fallback shader, which I will discuss later in this presentation.
Here’s a rough comparison of the results we got for the ground shader. We went from 90.50 ALU/frag down to 71.
Fragments shaded were reduced by ~45%, and total fragment instructions by ~64%.
Considering all the shaders optimized for this scene, the total improvement was roughly 7 ms.
As a further improvement in the shaders, we came up with a fallback error shader. We ran into a few shader errors; some of them were related to features not supported in OpenGL ES 2.0 profiles, like doing a vertex offset by sampling a texture in the vertex shader (with tex2Dlod). In those cases the shader was falling back to plain diffuse, which was hard to spot right away.
We then created a fallback error shader to make it stand out clearly when a shader is not supported in the current configuration. That makes it really easy to tell apart from other shader problems.
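A fallback error shader can be as simple as this sketch (the shader name is hypothetical): a fixed-function magenta shader that runs on any configuration, referenced from other shaders via `FallBack`.

```shaderlab
// Hypothetical "Hidden/BRSError" shader: solid magenta, supported on any
// hardware, so unsupported configurations light up instead of silently
// falling back to plain diffuse.
Shader "Hidden/BRSError" {
    SubShader {
        Pass {
            Color (1, 0, 1, 1)   // bright magenta, impossible to miss
        }
    }
}
// In each game shader, after the last SubShader:
//     FallBack "Hidden/BRSError"
```
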
Texture compression is important to reduce bandwidth and power consumption.
Block texture compression formats like ASTC, ETCn, DXTn, ATC and PVRTC are straightforward for GPUs; they don’t need to decompress the texture in order to read it. However, the algorithms lose information when compressing the texture (they are lossy compressions).
ASTC is a texture compression format developed by ARM that has the speed advantage of block compression without losing much texture quality. At the moment, Samsung’s Galaxy Note 4, Galaxy S6 and above support it. It is supported in Unity’s OpenGL ES 3.0 profile on Android Lollipop devices.
Unlike other texture compression formats, ASTC supports different block compression configurations, allowing one to tweak the tradeoff between performance and quality.
Here’s a comparison of ASTC 4x4, ASTC 6x6 and ETC2. It is important to notice that even though ASTC 6x6 has a larger block size than ETC2, the compression quality is still much better.
This table shows the texture configuration for our most common assets. It is also interesting to notice that ASTC 4x4 produces the same size as ETC2 but with greater quality.