This is the presentation deck from my talk at Unite 2019. The talk focused on writing scriptable, modular and GPU-efficient shaders in Unity, along with hardware flexibility and optimizations.
5. Generative Art - Made with Unity
Introduction to Traditional Cg/HLSL
Unity's Cg/HLSL shader code is built on Nvidia's Cg and wrapped in Unity's ShaderLab.
A typical Unity shader file has a fixed layout:
❑ ShaderLab: declares the properties of the material and wraps the Cg code.
❑ Cg code: contains the vertex and fragment shaders.
❑ ShaderLab: closes the wrapper around the Cg code.
6. Traditional Unlit Shader and Modularization
The most fundamental shader is the Unlit shader, which can be used to vary the geometry of an object during rendering. The general architecture involves declaring Properties, followed by vertex attribute structs, and then channelling those attributes through SubShaders (each aimed at a class of graphics hardware) and through individual Passes to change the geometry. Ultimately the output of the vertex stage, and of the geometry shader if there is one, is passed to the pixel/fragment shader to be rendered on the screen every frame.
The modular approach lies in abstracting the common includes, pragma directives and structs that would otherwise be repeated for each Pass required by the lighting in the scene. Instead of repeatedly adding the same code, modularizing it reduces complexity and can improve performance, since Passes are computationally very intensive.
7. Unlit Shader – Brief Overview
A typical Unlit shader starts with the "Shader" keyword, followed by a name that sets where the shader is listed in the editor.
The Properties block declares all the properties that are shown in the editor for tweaking.
8. Unlit Shader – Brief Overview
The SubShader contains the core vertex and fragment shaders and their attributes.
It contains one or more Passes, which are used for different lighting models depending on the Forward or Deferred rendering path.
The Tags block holds details such as the render type, the render queue and related settings.
Other settings such as LOD and ZWrite can also be added.
Inside the Pass lies the CGPROGRAM code, which contains the structs and attributes along with the vertex and fragment shaders and any intermediate geometry shader.
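The layers described so far can be sketched as one small file. This is only an illustrative skeleton, not the shader from the talk: the shader name "Hypothetical/UnlitSkeleton" and the _Color property are assumptions, and the shader simply fills the object with one color.

```shaderlab
// Hypothetical name; the path decides where the shader is listed in the editor.
Shader "Hypothetical/UnlitSkeleton"
{
    Properties
    {
        // Shown in the material inspector for tweaking (assumed property).
        _Color ("Color", Color) = (1, 1, 1, 1)
    }
    SubShader
    {
        // Render type and queue go in Tags; LOD and ZWrite are optional extras.
        Tags { "RenderType" = "Opaque" "Queue" = "Geometry" }
        LOD 100
        ZWrite On

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            fixed4 _Color;

            // Vertex shader: object space to clip space.
            float4 vert (float4 vertex : POSITION) : SV_POSITION
            {
                return UnityObjectToClipPos(vertex);
            }

            // Fragment shader: constant color for every pixel.
            fixed4 frag () : SV_Target
            {
                return _Color;
            }
            ENDCG
        }
    }
}
```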
9. Unlit Shader – Brief Overview
Header inclusions such as "UnityCG.cginc" and the #pragma declarations that name the vertex and fragment functions go at the start of the CG code.
The appdata struct contains the per-vertex attributes such as position, normals and uv (texture coordinates), which the vertex shader forwards to the fragment shader via the v2f struct.
The v2f struct contains the attributes that are passed on to the pixel or fragment shader (via the geometry shader, if one is present) to be drawn on the screen.
10. Unlit Shader – Brief Overview
The variables declared in the Properties block have to be redeclared inside the CG code before they can be used for pixel rendering.
The vertex shader, which returns a v2f, transforms the vertex position through the Model, View and Projection matrices from local object space to the camera's clip space. The 2D uv mapping over the vertices is also set up in this stage.
After the vertex shader is complete, the fragment shader renders the pixels on the screen. The color specified in the Properties is combined with the uv-mapped texture sample to get a tinted texture on the object. Finally, the alpha channel of the Color (RGBA) sets the transparency, which is therefore also available as an editor property.
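A minimal sketch of these stages follows, assuming a texture property _MainTex and a tint property _Color (both names are illustrative, as is the shader name). The alpha of _Color drives transparency through standard alpha blending; it is one plausible arrangement rather than the exact shader shown in the talk.

```shaderlab
Shader "Hypothetical/UnlitTintedTransparent"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        _Color ("Tint (alpha = transparency)", Color) = (1, 1, 1, 1)
    }
    SubShader
    {
        Tags { "Queue" = "Transparent" "RenderType" = "Transparent" }
        ZWrite Off
        Blend SrcAlpha OneMinusSrcAlpha

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            // Per-vertex input.
            struct appdata
            {
                float4 vertex : POSITION;
                float2 uv : TEXCOORD0;
            };

            // Data handed from the vertex shader to the fragment shader.
            struct v2f
            {
                float4 pos : SV_POSITION;
                float2 uv : TEXCOORD0;
            };

            // Properties redeclared so the CG code can read them.
            sampler2D _MainTex;
            float4 _MainTex_ST;
            fixed4 _Color;

            v2f vert (appdata v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);   // Model * View * Projection
                o.uv = TRANSFORM_TEX(v.uv, _MainTex);     // apply tiling and offset
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                // Texture sample tinted by the color; alpha feeds the blend stage.
                return tex2D(_MainTex, i.uv) * _Color;
            }
            ENDCG
        }
    }
}
```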
11. Unlit Shader – Modularization
Modularization is achieved by declaring the includes and pragmas outside the SubShader layer (ShaderLab provides the CGINCLUDE ... ENDCG block for shared code), so that they are not repeated for each SubShader. This reduces the complexity of the vertex and fragment shader code.
One significant modularization involves declaring
struct appdata {
    // vertex attributes
};
outside the SubShader and making it global, so that we do not have to declare it again for each SubShader written for different hardware requirements.
12. Unlit Shader – Modularization
Unlit shaders mostly use the general
#include "UnityCG.cginc"
and other common initialization commands, which can be put outside the SubShader.
The most extensive optimization is achieved when the appdata struct is declared outside the Pass block. Each Pass takes time to compute, depending on the Forward or Deferred lighting path, and repeating the same declarations in every Pass adds overhead. The remaining part of the program, which modifies the pixel shader, can be written for the render type required.
As a key optimization, it is best to write as much code as possible outside the Pass block to reduce complexity.
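One way to share declarations across Passes is ShaderLab's CGINCLUDE block, whose contents are made available to every CGPROGRAM block in the file. The sketch below is an assumption about how the idea could be laid out: the shader name, the two fragment functions and the second additive pass are illustrative only. It mainly removes duplicated source; any effect on runtime cost is best confirmed with a profiler.

```shaderlab
Shader "Hypothetical/UnlitSharedDeclarations"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        _Color ("Tint", Color) = (1, 1, 1, 1)
    }

    // Declared once, outside SubShader and Pass; shared by all passes below.
    CGINCLUDE
    #include "UnityCG.cginc"

    struct appdata
    {
        float4 vertex : POSITION;
        float2 uv : TEXCOORD0;
    };

    struct v2f
    {
        float4 pos : SV_POSITION;
        float2 uv : TEXCOORD0;
    };

    sampler2D _MainTex;
    float4 _MainTex_ST;
    fixed4 _Color;

    v2f vert (appdata v)
    {
        v2f o;
        o.pos = UnityObjectToClipPos(v.vertex);
        o.uv = TRANSFORM_TEX(v.uv, _MainTex);
        return o;
    }
    ENDCG

    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        // Pass 1: plain textured base.
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment fragBase

            fixed4 fragBase (v2f i) : SV_Target
            {
                return tex2D(_MainTex, i.uv);
            }
            ENDCG
        }

        // Pass 2: additive tint that reuses the same structs and vertex function.
        Pass
        {
            Blend One One
            ZWrite Off

            CGPROGRAM
            #pragma vertex vert
            #pragma fragment fragTint

            fixed4 fragTint (v2f i) : SV_Target
            {
                return tex2D(_MainTex, i.uv) * _Color * 0.25;
            }
            ENDCG
        }
    }
}
```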
13. Traditional Surface Shader and Modularization
The Surface shader controls the different lighting models and diffuse reflection, and is used to implement features such as Lambert reflection, bumped and specular effects, and the Blinn-Phong model. A surface shader consists of texture mapping and of modifying the shaded result based on the direction of the light and the surface normals. The anatomy of a Surface shader is similar to that of an Unlit shader, except that the vertex and fragment functions are replaced by a surface function paired with a lighting model (e.g. Lambert).
As with unlit shaders, the modular approach to making standard surface shaders less computationally intensive is to define the preprocessor directives and pragmas, along with general structs, before the SubShader and Pass layers. In addition, since many hybrid shaders pass the output of the fragment (unlit) shader on to the surface shader, these global declarations are very useful for reducing the computation of sequential shader models.
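For comparison with the unlit layout, a minimal Lambert surface shader could look like the sketch below. The shader name is hypothetical; the surface function surf fills in a SurfaceOutput, and Unity generates the vertex and fragment code and the lighting passes from the #pragma surface line.

```shaderlab
Shader "Hypothetical/SurfaceLambert"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        // Surface shaders sit directly in the SubShader, not inside a Pass.
        CGPROGRAM
        #pragma surface surf Lambert

        struct Input
        {
            float2 uv_MainTex;   // uv set for _MainTex, filled in by Unity
        };

        sampler2D _MainTex;

        void surf (Input IN, inout SurfaceOutput o)
        {
            // Only the surface color is described; Lambert lighting is applied by Unity.
            o.Albedo = tex2D(_MainTex, IN.uv_MainTex).rgb;
        }
        ENDCG
    }
    Fallback "Diffuse"
}
```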
16. Variants of Surface Shader
The traditional Surface shader program can be tweaked into variants that give different effects without any customised rendering path code:
• Surface Bump Shader
• Surface Detailed Shader
• Surface Rim Shader
• Surface Normal Map Shader
18. Brief Overview of Rendering Paths for Customized Lighting
The main lighting paths are:
Forward Rendering: In Forward Rendering, some number of the brightest lights that affect each object are rendered in fully per-pixel lit mode. Then, up to 4 point lights are calculated per-vertex. The other lights are computed as Spherical Harmonics (SH), which is much faster but is only an approximation. SH lights are less computationally intensive because they are evaluated at vertices and do not take normal maps into account. They are also very low frequency, so they cannot produce sharp lighting transitions.
Rendering therefore steps down from per-pixel for the highest-intensity lights to per-vertex and then SH. The base pass applies one per-pixel directional light together with the per-vertex lights (up to 4) and the remaining SH lights; any other per-pixel lights are rendered in additional passes, one per light.
19. Brief Overview of Rendering Paths for Customized Lighting
The main lighting paths are:
Deferred Rendering: When using deferred shading, there is no limit on the number of lights that can affect a GameObject. All lights are evaluated per-pixel, which means that they all interact correctly with normal maps, etc. Additionally, all lights can have cookies and shadows.
Deferred shading has the advantage that the processing overhead of lighting is proportional to the number of pixels the light shines on. This is determined by the size of the light volume in the Scene, regardless of how many GameObjects it illuminates. Therefore, performance can be improved by keeping lights small. Deferred shading also has highly consistent and predictable behaviour. The effect of each light is computed per-pixel, so there are no lighting computations that break down on large triangles.
20. Optimization of Surface Shaders based on Custom Lighting
The general lighting models used include the Lambert and Blinn-Phong models.
As a general rule, when declaring custom lighting functions for the Forward and Deferred rendering paths, it is recommended to use the half4 data type. This reduces the cost of computing the shaders per frame when using custom lighting models.
The lower fixed4 precision can be used for very simple operations on texture data.
Again, modularization of custom lighting models in Surface shaders gives a great performance boost.
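A custom lighting model in a surface shader is a function named Lighting<Name>, selected with #pragma surface surf <Name>. The sketch below uses half precision throughout the lighting function, as recommended above. The names SimpleLambert and the shader path are assumptions, and this form of the function covers the Forward path only.

```shaderlab
Shader "Hypothetical/CustomHalfLambert"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        CGPROGRAM
        // "SimpleLambert" makes Unity look for LightingSimpleLambert below.
        #pragma surface surf SimpleLambert

        // Forward-path lighting function written with half precision.
        half4 LightingSimpleLambert (SurfaceOutput s, half3 lightDir, half atten)
        {
            half NdotL = max(0, dot(s.Normal, lightDir));
            half4 c;
            c.rgb = s.Albedo * _LightColor0.rgb * (NdotL * atten);
            c.a = s.Alpha;
            return c;
        }

        struct Input
        {
            float2 uv_MainTex;
        };

        sampler2D _MainTex;

        void surf (Input IN, inout SurfaceOutput o)
        {
            o.Albedo = tex2D(_MainTex, IN.uv_MainTex).rgb;
        }
        ENDCG
    }
    Fallback "Diffuse"
}
```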
22. Replacement Shaders
Some rendering effects require rendering a scene with a different set of shaders. For example, good edge detection would need a texture with scene normals, so it could detect edges where surface orientations differ. Other effects might need a texture with scene depth, and so on. To achieve this, it is possible to render the scene with the shaders of all objects replaced.
If replacementTag is empty, then all objects in the scene are rendered with the given replacement shader.
If replacementTag is not empty, then for each object that would be rendered:
• The real object's shader is queried for the tag value.
• If it does not have that tag, the object is not rendered.
• A SubShader is found in the replacement shader that has the given tag with the found value. If no such SubShader is found, the object is not rendered.
• That SubShader is then used to render the object.
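As an illustration of the tag matching, the sketch below is a replacement shader with one SubShader per "RenderType" value. The shader name and the chosen outputs (world normals for opaque objects, black for transparent ones) are assumptions. It would be applied from script with Camera.SetReplacementShader or Camera.RenderWithShader, passing "RenderType" as the replacement tag.

```shaderlab
Shader "Hypothetical/ReplaceByRenderType"
{
    // Chosen for objects whose own shader has "RenderType" = "Opaque".
    SubShader
    {
        Tags { "RenderType" = "Opaque" }
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            struct v2f
            {
                float4 pos : SV_POSITION;
                half3 normal : TEXCOORD0;
            };

            v2f vert (appdata_base v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                o.normal = UnityObjectToWorldNormal(v.normal);
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                // Encode the world normal as a color, e.g. for edge detection.
                return fixed4(normalize(i.normal) * 0.5 + 0.5, 1);
            }
            ENDCG
        }
    }

    // Chosen for objects whose own shader has "RenderType" = "Transparent".
    SubShader
    {
        Tags { "RenderType" = "Transparent" }
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            float4 vert (float4 vertex : POSITION) : SV_POSITION
            {
                return UnityObjectToClipPos(vertex);
            }

            fixed4 frag () : SV_Target
            {
                return fixed4(0, 0, 0, 1);
            }
            ENDCG
        }
    }
    // Objects with any other RenderType value find no match and are skipped.
}
```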
23. Optimization of Shaders
Custom shaders and hybrid shader modules can be optimized through a combination of performance measurements and shader code changes.
Optimizing a coupled Surface shader:
The general light models, when used together with an Unlit shader (Unlit + Lambert Surface), can be optimised according to how many forward light passes are required.
noforwardadd makes a shader fully support only one directional light in Forward rendering. The rest of the lights can still have an effect as per-vertex lights or spherical harmonics. This is great for making your shader smaller and making sure it always renders in one pass, even with multiple lights present.
24. Optimization of Shaders
noambient disables ambient lighting and spherical harmonics lights on a shader. This can make it slightly faster.
The approxview directive, for shaders that use the view direction (i.e. Specular), makes the view direction normalized per vertex instead of per pixel. This is approximate, but often good enough.
halfasview, for Specular shader types, is even faster. The half-vector (halfway between the lighting direction and the view vector) is computed and normalized per vertex, and the lighting function receives the half-vector as a parameter instead of the view vector.
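These directives are appended to the #pragma surface line. The sketch below combines noforwardadd, noambient and halfasview with a small specular lighting function modelled on Unity's mobile shaders. The names CheapBlinnPhong and the shader path are assumptions. Because of halfasview, the third argument of the lighting function is the half-vector rather than the view direction.

```shaderlab
Shader "Hypothetical/CheapSpecular"
{
    Properties
    {
        _MainTex ("Base (RGB) Gloss (A)", 2D) = "white" {}
        _Shininess ("Shininess", Range(0.03, 1)) = 0.078125
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }
        LOD 250

        CGPROGRAM
        // One directional light only, no ambient/SH, half-vector computed per vertex.
        #pragma surface surf CheapBlinnPhong noforwardadd noambient halfasview

        inline fixed4 LightingCheapBlinnPhong (SurfaceOutput s, fixed3 lightDir, fixed3 halfDir, fixed atten)
        {
            fixed diff = max(0, dot(s.Normal, lightDir));
            fixed nh = max(0, dot(s.Normal, halfDir));
            fixed spec = pow(nh, s.Specular * 128) * s.Gloss;

            fixed4 c;
            c.rgb = (s.Albedo * _LightColor0.rgb * diff + _LightColor0.rgb * spec) * atten;
            c.a = s.Alpha;
            return c;
        }

        sampler2D _MainTex;
        half _Shininess;

        struct Input
        {
            float2 uv_MainTex;
        };

        void surf (Input IN, inout SurfaceOutput o)
        {
            fixed4 tex = tex2D(_MainTex, IN.uv_MainTex);
            o.Albedo = tex.rgb;
            o.Gloss = tex.a;
            o.Alpha = tex.a;
            o.Specular = _Shininess;
        }
        ENDCG
    }
    Fallback "Mobile/VertexLit"
}
```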
25. Precision of Computations
When writing shaders in Cg/HLSL, there are three basic number types: float, half and fixed.
For good performance, always use the lowest precision that is possible. This is especially important on mobile platforms like iOS and Android. Good rules of thumb are:
• For world space positions and texture coordinates, use float precision.
• For everything else (vectors, HDR colors, etc.), start with half precision. Increase only if necessary.
• For very simple operations on texture data, use fixed precision.
In practice, exactly which number type you should use depends on the platform and the GPU.
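The rules of thumb can be read off a small example. In the sketch below, positions and texture coordinates stay float, the normal and the lighting term use half, and the texture color uses fixed. The _LightDir property is a hypothetical stand-in for a real light, used only to keep the example short.

```shaderlab
Shader "Hypothetical/PrecisionDemo"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        _LightDir ("Light direction (hypothetical)", Vector) = (0, 1, 0, 0)
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            struct v2f
            {
                float4 pos : SV_POSITION;   // positions: float
                float2 uv : TEXCOORD0;      // texture coordinates: float
                half3 normal : TEXCOORD1;   // direction vectors: half
            };

            sampler2D _MainTex;
            float4 _MainTex_ST;
            half4 _LightDir;

            v2f vert (appdata_base v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                o.uv = TRANSFORM_TEX(v.texcoord, _MainTex);
                o.normal = UnityObjectToWorldNormal(v.normal);
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                // Simple operation on texture data: fixed is enough.
                fixed4 col = tex2D(_MainTex, i.uv);
                // Lighting term on unit vectors: half is enough.
                half ndotl = saturate(dot(normalize(i.normal), normalize(_LightDir.xyz)));
                col.rgb *= ndotl;
                return col;
            }
            ENDCG
        }
    }
}
```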
26. Alpha Test Optimization
The fixed-function AlphaTest, or its programmable equivalent clip(), has different performance characteristics on different platforms:
• Generally you gain a small advantage when using it to remove totally transparent pixels on most platforms.
• However, on PowerVR GPUs found in iOS and some Android devices, alpha testing is resource-intensive. Do not try to use it for performance optimization on these platforms, as it causes the game to run slower than usual.
Variants of alpha testing used with Unlit vertex shaders that instead blend between two alpha levels, using Blend SrcAlpha OneMinusSrcAlpha, can give a boost in performance.
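For reference, the programmable alpha test is a single clip() call in the fragment shader. The sketch below assumes a _Cutoff property and a hypothetical shader name, and uses the vert_img helper from UnityCG.cginc to keep the vertex stage short. On PowerVR-class hardware the blended alternative named above (Blend SrcAlpha OneMinusSrcAlpha with no clip) is the variant to compare against.

```shaderlab
Shader "Hypothetical/UnlitCutout"
{
    Properties
    {
        _MainTex ("Texture (alpha in A)", 2D) = "white" {}
        _Cutoff ("Alpha cutoff", Range(0, 1)) = 0.5
    }
    SubShader
    {
        Tags { "Queue" = "AlphaTest" "RenderType" = "TransparentCutout" }

        Pass
        {
            CGPROGRAM
            // vert_img and v2f_img come from UnityCG.cginc (position plus uv).
            #pragma vertex vert_img
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _MainTex;
            fixed _Cutoff;

            fixed4 frag (v2f_img i) : SV_Target
            {
                fixed4 col = tex2D(_MainTex, i.uv);
                // Discard the pixel when its alpha is below the cutoff.
                clip(col.a - _Cutoff);
                return col;
            }
            ENDCG
        }
    }
}
```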
27. Hybrid Shader Performance Metrics
When building sequential Passes or SubShaders and applying several lighting models, the computation is often extensive. Hybrid shader performance relies mainly on the complexity of the pixel and vertex shader code. Refactoring the codebase by moving similar declarations, preprocessor directives, functions and structs out of the Pass blocks and out of the shaders produces a great performance boost. Taking all the declarative details and controlling them by script improves performance even further.
A very common hybrid shader, one that draws a simple outline while maintaining the custom texture and lighting of the 3D model, can be very intensive when the same structs and pragmas are repeated more than once in the SubShader section. Modularizing, along with other parameter tuning, can provide better results. Some of those scriptable parameters are discussed next.
28. Hybrid Shader Performance Metrics
SubShader specification: When Unity chooses which SubShader to render with, it renders an object once for each Pass defined (and possibly more due to light interactions). As each render of the object is an expensive operation, you want to define the shader in the minimum number of passes possible.
Pass filtering: Depending on the optimization requirements, a pass can be a regular Pass, a UsePass or a GrabPass.
The regular Pass and UsePass are used for optimizing shaders by re-using existing shader code or Passes. GrabPass grabs the contents of the screen where the object is about to be drawn into a texture. This texture can be used in subsequent passes to do advanced image-based effects, and is computationally intensive.
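A minimal GrabPass sketch follows, closely modelled on the usual color-inversion example. The shader name and the texture name _BackgroundTexture are assumptions. The grab happens before the regular Pass, which then samples the grabbed screen contents.

```shaderlab
Shader "Hypothetical/GrabPassInvert"
{
    SubShader
    {
        // Draw after all opaque geometry so there is something to grab.
        Tags { "Queue" = "Transparent" }

        // Copy the screen behind the object into _BackgroundTexture.
        GrabPass
        {
            "_BackgroundTexture"
        }

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            struct v2f
            {
                float4 grabPos : TEXCOORD0;
                float4 pos : SV_POSITION;
            };

            v2f vert (appdata_base v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                // Screen-space coordinate for sampling the grabbed texture.
                o.grabPos = ComputeGrabScreenPos(o.pos);
                return o;
            }

            sampler2D _BackgroundTexture;

            half4 frag (v2f i) : SV_Target
            {
                half4 bgcolor = tex2Dproj(_BackgroundTexture, i.grabPos);
                return 1 - bgcolor;   // inverted background color
            }
            ENDCG
        }
    }
}
```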
29. Hybrid Shader Performance Metrics
Pass (regular): A Pass causes the geometry to be rendered once. It controls various render-state parameters such as Blending, Depth Testing, Alpha Testing and Color Mask, which can be used for optimization.
The render-state commands include:
Cull
ZWrite
ZTest
Offset
Blend
ColorMask
30. Hybrid Shader Performance Metrics
UsePass: Some shaders can reuse existing passes from other shaders, reducing code duplication. For example, you might have a shader pass that draws an object outline, and you would want to reuse that pass in other shaders. The UsePass command does just that: it includes a given pass from another shader.
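The outline case could be sketched as two shader files, shown together here. All names (the two shader paths, the OUTLINE pass and the outline properties) are assumptions for illustration. UsePass refers to a pass as "ShaderName/PASSNAME", with the pass name written in upper case.

```shaderlab
// File 1: a shader that owns a named outline pass.
Shader "Hypothetical/Outline"
{
    Properties
    {
        _OutlineColor ("Outline color", Color) = (0, 0, 0, 1)
        _OutlineWidth ("Outline width", Range(0, 0.1)) = 0.02
    }
    SubShader
    {
        Pass
        {
            Name "OUTLINE"
            Cull Front   // keep only the back faces of the inflated mesh

            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            fixed4 _OutlineColor;
            half _OutlineWidth;

            float4 vert (appdata_base v) : SV_POSITION
            {
                // Push each vertex outward along its normal in object space.
                float3 inflated = v.vertex.xyz + v.normal * _OutlineWidth;
                return UnityObjectToClipPos(float4(inflated, 1));
            }

            fixed4 frag () : SV_Target
            {
                return _OutlineColor;
            }
            ENDCG
        }
    }
}

// File 2: a textured shader that reuses the outline pass instead of copying it.
Shader "Hypothetical/TexturedWithOutline"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        // Redeclared so the reused pass gets its values from this material.
        _OutlineColor ("Outline color", Color) = (0, 0, 0, 1)
        _OutlineWidth ("Outline width", Range(0, 0.1)) = 0.02
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        UsePass "Hypothetical/Outline/OUTLINE"

        Pass
        {
            CGPROGRAM
            #pragma vertex vert_img
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _MainTex;

            fixed4 frag (v2f_img i) : SV_Target
            {
                return tex2D(_MainTex, i.uv);
            }
            ENDCG
        }
    }
}
```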
31. Hybrid Shader Performance Metrics
Shader LOD optimization: Shader Level of Detail (LOD) works by only using shaders or SubShaders that have their LOD value less than a given number.
Built-in shaders in Unity have their LODs set up this way:
• VertexLit kind of shaders = 100
• Decal, Reflective VertexLit = 150
• Diffuse = 200
• Diffuse Detail, Reflective Bumped Unlit, Reflective Bumped VertexLit = 250
• Bumped, Specular = 300
• Bumped Specular = 400
• Parallax = 500
• Parallax Specular = 600
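In a custom shader the same mechanism is expressed by giving each SubShader its own LOD value and listing the most expensive one first. The sketch below uses a hypothetical shader name and two arbitrary levels. The allowed maximum is set from script through Shader.maximumLOD (per shader) or Shader.globalMaximumLOD (all shaders).

```shaderlab
Shader "Hypothetical/LodFallbacks"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
    }

    // Used while the allowed LOD is at least 200: lit Lambert surface shader.
    SubShader
    {
        Tags { "RenderType" = "Opaque" }
        LOD 200

        CGPROGRAM
        #pragma surface surf Lambert

        struct Input
        {
            float2 uv_MainTex;
        };

        sampler2D _MainTex;

        void surf (Input IN, inout SurfaceOutput o)
        {
            o.Albedo = tex2D(_MainTex, IN.uv_MainTex).rgb;
        }
        ENDCG
    }

    // Used when the allowed LOD drops below 200: cheap unlit texture.
    SubShader
    {
        Tags { "RenderType" = "Opaque" }
        LOD 100

        Pass
        {
            CGPROGRAM
            #pragma vertex vert_img
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _MainTex;

            fixed4 frag (v2f_img i) : SV_Target
            {
                return tex2D(_MainTex, i.uv);
            }
            ENDCG
        }
    }
}
```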
32. Advanced Optimization Techniques
Further optimization of shaders can also be attained by combining all of the optimization techniques discussed with ColorMask and Stencil settings, which restrict rendering to specific portions of the geometry.
Multiple shader program variants are another feature, obtained with the #pragma multi_compile A B C directive. This lets one base shader (the "mega shader" code) be reused to produce variant shaders.
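A small sketch of the variant mechanism follows. The keyword TINT_ON and the shader name are assumptions. The directive below compiles two variants of the same source, one with no keyword defined (the "__" placeholder) and one with TINT_ON, and the active variant is chosen from script with Material.EnableKeyword or Shader.EnableKeyword.

```shaderlab
Shader "Hypothetical/TintVariants"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        _Color ("Tint", Color) = (1, 1, 1, 1)
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        Pass
        {
            CGPROGRAM
            #pragma vertex vert_img
            #pragma fragment frag
            // Two variants: no keyword, and TINT_ON (hypothetical keyword).
            #pragma multi_compile __ TINT_ON
            #include "UnityCG.cginc"

            sampler2D _MainTex;
            fixed4 _Color;

            fixed4 frag (v2f_img i) : SV_Target
            {
                fixed4 col = tex2D(_MainTex, i.uv);
                #if defined(TINT_ON)
                // Compiled only into the TINT_ON variant.
                col *= _Color;
                #endif
                return col;
            }
            ENDCG
        }
    }
}
```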
Shader compilation target levels: A higher #pragma target exposes more GPU features, such as instancing, at the cost of running on fewer platforms. Using geometry shaders sets the compilation target to 4.0, and using tessellation shaders sets it to 4.6.
Depth texture: Used to optimize the pixel values and the G-buffer for the perspective view. Techniques such as culling and occlusion can greatly reduce the work that reaches the depth buffer.
33. API-specific Platform Performance
Unity supports the DirectX, OpenGL and Metal APIs. However, these platforms differ in their texture coordinate conventions, so on non-OpenGL platforms Unity flips rendering upside down to match the OpenGL/GLES convention.
When you use Image Effects and anti-aliasing, the resulting source Texture for an Image Effect is not flipped to match the OpenGL-like platform convention. In this case, Unity renders to the screen to get anti-aliasing and then resolves rendering into a Render Texture for further processing with an Image Effect.
Some GPUs (most notably PowerVR-based ones on iOS) allow you to do a form of programmable blending by providing the current fragment color as input to the Fragment Shader (as an inout color argument).
Shaders for mobile devices typically have LOD, stencil and masking optimizations applied to reduce the complexity of the pixel shader. Further optimization can be attained with sampler states, which control how textures are sampled for uv mapping. Using the geometry shader to cluster vertices or perform displacement mapping, instead of the DirectX 11 core tessellation shader, can also be advantageous.
34. Summary
• General modularization of shaders, customization with rendering paths
• Optimization tricks for Surface shaders on Forward and Deferred Rendering
• Replacement Shaders
• Hybrid Shader optimization techniques
• Advanced shader code implementation methods
• Platform-specific shader code optimization and modularization