SlideShare ist ein Scribd-Unternehmen logo
1 von 71
Killzone Shadow Fall:
Threading the Entity
Update on PS4
Jorrit Rouwé
Lead Game Tech, Guerrilla Games
Introduction
• Killzone Shadow Fall is a First Person Shooter
• PlayStation 4 launch title
• In SP up to 60 characters @ 30 FPS
• In MP up to 24 players @ 60 FPS
• Gameplay logic has lots of
• Branches
• Virtual functions
• Cache misses
• Not suitable for PS3 SPU’s but PS4 has 6 x86 cores
What do we cover?
• What is an Entity?
• What we did on PS3
• Multi threading on PS4
• Balancing Entities across frames
• Performance issues found
• Performance results achieved
• Debug tools
What is an Entity?
What is an Entity?
• Base class for most game objects
• E.g. Player, Enemy, Weapon, Door
• Not used for static world
• Has Components
• E.g. Model, Mover, Destructibility
• Update at a fixed frequency
• 15, 30, 60 Hz
• Almost everything at 15 Hz
• Player updated at higher frequency to avoid lag
What is a Representation?
• Entities and Components have Representation
• Controls rendering, audio and VFX
• State is interpolated in frames where entity not updated
• Cheaper to interpolate than to update
• Introduces latency
• Always blend towards last update
Movie
Multi Threading Approach
PS3: 1 Entity = 1 Fiber
• Most time spent on PPU
• No clear concurrency model
• Read partial updated state
• Entities deadlock waiting for each other
yield fiber resume fiber
Time
PPU
SPU 1
SPU 2
Entity 1 Entity 2 Entity 3 Entity 1
Animation
Ray Cast
Entity 2
PS4: 1 Entity = 1 Job
• No fibers
• Entity updates as a whole
• How to solve race conditions?
Time
CPU 1
CPU 2
CPU 3
Entity 1
Entity 2
Entity 3
Entity 1Animation
Ray Cast Entity 2

job
• Make update order explicit:
Strong Dependencies
Missile
MissileMissile LauncherMissile Launcher Soldier
Time
Soldier
strong
dependency
• No (indirect) dependency = no access
• Works two ways: Weapon can access Soldier too
• Create dependency has 1 frame latency
• Global systems need locks
Non-dependent Entities can Execute Concurrently
Soldier1
Weapon1
Time
Soldier2
Weapon2
CPU 2
CPU 1 Soldier1 Weapon1
Soldier2 Weapon2
• A few entities cause huge bottleneck
What about this?
Time
Soldier1
Pistol1
Soldier2
Pistol2
Bullet System
Soldier1 Pistol1 Soldier2 Pistol2 Bullet System
Non-exclusive Dependencies
• Access to ‘Bullet System’ must be lock protected
Soldier1 Soldier2
Bullet System
non-exclusive
Time
CPU 2
CPU 1 Soldier1 Pistol1
Soldier2 Pistol2
Bullet System
Barrier
Pistol1 Pistol2
Weak Dependencies
• 2 tanks fire at each other:
• Update order reversed when circular dependency occurs
• Not used very often (< 10 per frame)
Time
Missile2Missile1
Tank1 Tank2 Missile2Missile1
Tank2 Tank1 Missile1Missile2
or
weak
strong
Tank1 Tank2
Non-updating Entities
• Entity can skip updates (LOD)
• Entity can update in other frame
• Do normal scheduling!
Entity1
Entity3
Entity1
Entity2 Entity3
Time
not updating
Summarizing Dependencies
Strong
Exclusive
Weak
Exclusive
Strong
Non-excl.
Weak
Non-excl.
Symbol
Two way access ✓ ✓ ✓ ✓
Order guaranteed ✓ ✓
Allow concurrency + + ++ ++
Require lock ✓ ✓
Referencing Entities
• Dev build: CheckedPtr<Entity>
• Acts as normal pointer
• Check dependency on dereference
• Retail build: Entity *
• No overhead!
• Doesn’t catch everything
• Can use pointers to members
• Bugs were easy to find
Working with Entities without dependency
• ThreadSafeEntityInterface
• Mostly read only
• Often used state (name, type, position, …)
• Mutex per Entity
• Send message (expensive)
• Processed single threaded when no dependency
• Schedule single threaded callback (expensive)
• Everything can be accessed
Scheduling Algorithm
Scheduling Algorithm
• Entities with exclusive dependencies merged to 1 job
• Dependencies determine sorting
• Non-exclusive dependencies become job dependencies
• Expensive jobs kicked first!
Missile
Missile Launcher
Tank
Bullet System
Pistol
Soldier
Missile Tank
Bullet System
PistolSoldier
Job1
Job2
job dependency
not updating
weak
non-exclusive
strong
Scheduling Algorithm – Edge Case
Entity1 Entity3
Entity4Entity2
non-exclusive
exclusive job dependency
Entity1 Entity2
Entity3 Entity4
Job1
Job2
Scheduling Algorithm – Edge Case
• Non cyclic dependency becomes cyclic job dependency
• Job1 and Job2 need to be merged
Entity1 Entity3
Entity4Entity2
non-exclusive
exclusive
Entity1 Entity2Entity3 Entity4
Job1
Balancing Entities Across Frames
Balancing Entities Across Frames
• Prevent all 15 Hz entities from updating in same frame
• Entity can move to other frame
• Smaller delta time for 1 update
• Keep parent-child linked entities together
• Weapon of soldier
• Soldier on mounted gun
• Locked close combat
Balancing Entities – In Action
Time in even frame
(sum across cores,
running average)
Balancing Entities – In Action
Time in odd frame
(should be equal)
Balancing Entities – In Action
Civilian @ 15Hz
update even frame
Balancing Entities – In Action
Civilian @ 15Hz
update odd frame
Balancing Entities – In Action
Flash on switch odd / even
Movie
Movie
Performance Issues
Performance Issues
• Memory allocation mutex
• Eliminated many dynamic allocations
• Use stack allocator
• Locking physics world
• R/W mutex for main simulation world
• Second ‘bullet collision’ broadphase + lock
• Large groups of dependent entities
• Player update very expensive
Cut Scene - Problem
• Cut scene entity requires dependencies
• 10+ characters in cut scene creates huge job!
Civilian 1
Cut Scene
Civilian 2 Player Camera
Cut Scene - Solution
• Create sub cut scenes for non-interacting entities
• Master cut scene determines time and flow
• Scan 1 frame ahead in timeline to create dependency
Civilian 1
Cut Scenenon-exclusive
Sub Cut Scene 1 Sub Cut Scene 2
Civilian 2
Sub Cut Scene 3
Player Camera
Using an Object
• Dependencies on useable objects not possible (too many)
• Get list of usable objects
• Global system protected by lock
• ‘Use’ icon appears on screen
• Player selects
• Create dependency
• Start ‘use’ animation
• Start interaction 1 frame later (dependency valid)
• Hides 1 frame delay!
Grenade
• Explosion damages many entities
• Creating dependencies not option (too many)
• ThreadSafeEntityInterface not an option
• Need knowledge of parts
• Run line of sight checks inside update
• Uses scheduled callback to apply damage
Performance Results
5000 Crates
Performance Results - Synthetic
100 Soldiers 500 Flags
Level
Counts Dependencies Max
Entities
in Job
SpeedupNumber
Entities
Updating
Entities
Number
Humans
Strong
Excl
Strong
Non-Excl
5000 Crates (20 µs each) 5019 5008 1 12 4 13 2.8X
100 Soldiers (700 µs each) 326 214 105 212 204 19 4.2X
500 Flags (160 µs each) 519 508 1 12 4 13 5.2X
6 cores!
Performance Results - Levels
Level
Counts Dependencies Max
Entities
in Job
SpeedupNumber
Entities
Updating
Entities
Number
Humans
Strong
Excl
Strong
Non-Excl
The Helghast (You Owe Me) 1141 206 32 71 23 20 4.1X
The Patriot (On Vectan Soil) 435 257 44 199 107 15 4.3X
The Remains (12p Botzone) 450 128 14 97 44 18 3.7X
The Helghast The RemainsThe Patriot
Game Frame - The Patriot
Game Frame - Global
Time
18 ms
Game Frame - Global
CPU 1 (main thread)
Game Frame - Global
CPU 2
Game Frame - Global
CPU 3
Game Frame - Global
AI
Physics
Entity Update VFX & Draw
Barrier
Game Frame – Entity Update
Fix Weak Dependency Cycles
Game Frame – Entity Update
Prepare Jobs
Game Frame – Entity Update
Link and Order Jobs
Game Frame – Entity Update
Execute Jobs
Game Frame – Entity Update
Single Threaded Callbacks
Game Frame – Entity Update
Player Job
Game Frame – Entity Update
Player Entity
Game Frame – Entity Update
Animation
Game Frame – Entity Update
Capsule Collision
Game Frame – Entity Update
Player Representation
Game Frame – Entity Update
OWL (Helper Robot)
Game Frame – Entity Update
Inventory
Game Frame – Entity Update
AI Soldier
Game Frame – Entity Update
Cloth Simulation
Game Frame – Entity Update
Cut Scene
Game Frame – Entity Update
Sub Cut Scenes
Game Frame – Entity Update
Cheap Entities (Destructibles)
Debug Tools
Debug Tools
• Dependencies get complex, we use yEd to visualize!
Debug Tools
Player
Debug Tools
Bullet System
Debug ToolsCut scenes
Conclusions
• Easy to implement in existing engine
• Game programmers can program as if single threaded
• Very few multithreading issues
Questions?
jorrit@guerrilla-games.com

Weitere ähnliche Inhalte

Was ist angesagt?

Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Philip Hammer
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)Philip Hammer
 
[KGC2014] DX9에서DX11로의이행경험공유
[KGC2014] DX9에서DX11로의이행경험공유[KGC2014] DX9에서DX11로의이행경험공유
[KGC2014] DX9에서DX11로의이행경험공유Hwan Min
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
 
Killzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemGuerrilla
 
게임프로젝트에 적용하는 GPGPU
게임프로젝트에 적용하는 GPGPU게임프로젝트에 적용하는 GPGPU
게임프로젝트에 적용하는 GPGPUYEONG-CHEON YOU
 
Optimize your game with the Profile Analyzer - Unite Copenhagen 2019
Optimize your game with the Profile Analyzer - Unite Copenhagen 2019Optimize your game with the Profile Analyzer - Unite Copenhagen 2019
Optimize your game with the Profile Analyzer - Unite Copenhagen 2019Unity Technologies
 
Destruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance FieldsDestruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance FieldsElectronic Arts / DICE
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Johan Andersson
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Tiago Sousa
 
Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Tiago Sousa
 
빠른 렌더링을 위한 오브젝트 제외 기술
빠른 렌더링을 위한 오브젝트 제외 기술빠른 렌더링을 위한 오브젝트 제외 기술
빠른 렌더링을 위한 오브젝트 제외 기술YEONG-CHEON YOU
 
Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...
Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...
Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...Unity Technologies
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingElectronic Arts / DICE
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsElectronic Arts / DICE
 
Advancements in-tiled-rendering
Advancements in-tiled-renderingAdvancements in-tiled-rendering
Advancements in-tiled-renderingmistercteam
 
Rendering Tech of Space Marine
Rendering Tech of Space MarineRendering Tech of Space Marine
Rendering Tech of Space MarinePope Kim
 

Was ist angesagt? (20)

DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 
Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2Bindless Deferred Decals in The Surge 2
Bindless Deferred Decals in The Surge 2
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
The Rendering Technology of 'Lords of the Fallen' (Game Connection Europe 2014)
 
Frostbite on Mobile
Frostbite on MobileFrostbite on Mobile
Frostbite on Mobile
 
[KGC2014] DX9에서DX11로의이행경험공유
[KGC2014] DX9에서DX11로의이행경험공유[KGC2014] DX9에서DX11로의이행경험공유
[KGC2014] DX9에서DX11로의이행경험공유
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016
 
Killzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo Postmortem
 
게임프로젝트에 적용하는 GPGPU
게임프로젝트에 적용하는 GPGPU게임프로젝트에 적용하는 GPGPU
게임프로젝트에 적용하는 GPGPU
 
Optimize your game with the Profile Analyzer - Unite Copenhagen 2019
Optimize your game with the Profile Analyzer - Unite Copenhagen 2019Optimize your game with the Profile Analyzer - Unite Copenhagen 2019
Optimize your game with the Profile Analyzer - Unite Copenhagen 2019
 
Destruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance FieldsDestruction Masking in Frostbite 2 using Volume Distance Fields
Destruction Masking in Frostbite 2 using Volume Distance Fields
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)
 
Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666
 
빠른 렌더링을 위한 오브젝트 제외 기술
빠른 렌더링을 위한 오브젝트 제외 기술빠른 렌더링을 위한 오브젝트 제외 기술
빠른 렌더링을 위한 오브젝트 제외 기술
 
Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...
Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...
Developing and optimizing a procedural game: The Elder Scrolls Blades- Unite ...
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based Rendering
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-Graphics
 
Advancements in-tiled-rendering
Advancements in-tiled-renderingAdvancements in-tiled-rendering
Advancements in-tiled-rendering
 
Rendering Tech of Space Marine
Rendering Tech of Space MarineRendering Tech of Space Marine
Rendering Tech of Space Marine
 

Ähnlich wie Killzone Shadow Fall: Threading the Entity Update on PS4

4Developers 2015: Gamedev-grade debugging - Leszek Godlewski
4Developers 2015: Gamedev-grade debugging - Leszek Godlewski4Developers 2015: Gamedev-grade debugging - Leszek Godlewski
4Developers 2015: Gamedev-grade debugging - Leszek GodlewskiPROIDEA
 
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...Terrance Cohen
 
Gdc gameplay replication in acu with videos
Gdc   gameplay replication in acu with videosGdc   gameplay replication in acu with videos
Gdc gameplay replication in acu with videosCharles Lefebvre
 
Maximize Your Production Effort (English)
Maximize Your Production Effort (English)Maximize Your Production Effort (English)
Maximize Your Production Effort (English)slantsixgames
 
Supersize Your Production Pipe
Supersize Your Production PipeSupersize Your Production Pipe
Supersize Your Production Pipeslantsixgames
 
Xen in Safety-Critical Systems - Critical Summit 2022
Xen in Safety-Critical Systems - Critical Summit 2022Xen in Safety-Critical Systems - Critical Summit 2022
Xen in Safety-Critical Systems - Critical Summit 2022Stefano Stabellini
 
Networking Architecture of Warframe
Networking Architecture of WarframeNetworking Architecture of Warframe
Networking Architecture of WarframeMaciej Siniło
 
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay
GDC 2010 - A Dynamic Component Architecture for High Performance GameplayGDC 2010 - A Dynamic Component Architecture for High Performance Gameplay
GDC 2010 - A Dynamic Component Architecture for High Performance GameplayTerrance Cohen
 
Create a Scalable and Destructible World in HITMAN 2*
Create a Scalable and Destructible World in HITMAN 2*Create a Scalable and Destructible World in HITMAN 2*
Create a Scalable and Destructible World in HITMAN 2*Intel® Software
 
Sephy engine development document
Sephy engine development documentSephy engine development document
Sephy engine development documentJaejun Kim
 
Unity - Internals: memory and performance
Unity - Internals: memory and performanceUnity - Internals: memory and performance
Unity - Internals: memory and performanceCodemotion
 
Supersize your production pipe enjmin 2013 v1.1 hd
Supersize your production pipe    enjmin 2013 v1.1 hdSupersize your production pipe    enjmin 2013 v1.1 hd
Supersize your production pipe enjmin 2013 v1.1 hdslantsixgames
 
Advances in Game AI
Advances in Game AIAdvances in Game AI
Advances in Game AILuke Dicken
 
Making a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie TycoonMaking a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie TycoonJean-Philippe Doiron
 
How shit works: the CPU
How shit works: the CPUHow shit works: the CPU
How shit works: the CPUTomer Gabel
 
Lagom - Mircoservices "Just Right"
Lagom - Mircoservices "Just Right"Lagom - Mircoservices "Just Right"
Lagom - Mircoservices "Just Right"Markus Jura
 
Developing Next-Generation Games with Stage3D (Molehill)
Developing Next-Generation Games with Stage3D (Molehill) Developing Next-Generation Games with Stage3D (Molehill)
Developing Next-Generation Games with Stage3D (Molehill) Jean-Philippe Doiron
 

Ähnlich wie Killzone Shadow Fall: Threading the Entity Update on PS4 (20)

4Developers 2015: Gamedev-grade debugging - Leszek Godlewski
4Developers 2015: Gamedev-grade debugging - Leszek Godlewski4Developers 2015: Gamedev-grade debugging - Leszek Godlewski
4Developers 2015: Gamedev-grade debugging - Leszek Godlewski
 
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay - M...
 
InfectNet Technical
InfectNet TechnicalInfectNet Technical
InfectNet Technical
 
Gdc gameplay replication in acu with videos
Gdc   gameplay replication in acu with videosGdc   gameplay replication in acu with videos
Gdc gameplay replication in acu with videos
 
Maximize Your Production Effort (English)
Maximize Your Production Effort (English)Maximize Your Production Effort (English)
Maximize Your Production Effort (English)
 
Supersize Your Production Pipe
Supersize Your Production PipeSupersize Your Production Pipe
Supersize Your Production Pipe
 
Xen in Safety-Critical Systems - Critical Summit 2022
Xen in Safety-Critical Systems - Critical Summit 2022Xen in Safety-Critical Systems - Critical Summit 2022
Xen in Safety-Critical Systems - Critical Summit 2022
 
Networking Architecture of Warframe
Networking Architecture of WarframeNetworking Architecture of Warframe
Networking Architecture of Warframe
 
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay
GDC 2010 - A Dynamic Component Architecture for High Performance GameplayGDC 2010 - A Dynamic Component Architecture for High Performance Gameplay
GDC 2010 - A Dynamic Component Architecture for High Performance Gameplay
 
Relic's FX System
Relic's FX SystemRelic's FX System
Relic's FX System
 
Create a Scalable and Destructible World in HITMAN 2*
Create a Scalable and Destructible World in HITMAN 2*Create a Scalable and Destructible World in HITMAN 2*
Create a Scalable and Destructible World in HITMAN 2*
 
Sephy engine development document
Sephy engine development documentSephy engine development document
Sephy engine development document
 
Unity - Internals: memory and performance
Unity - Internals: memory and performanceUnity - Internals: memory and performance
Unity - Internals: memory and performance
 
Supersize your production pipe enjmin 2013 v1.1 hd
Supersize your production pipe    enjmin 2013 v1.1 hdSupersize your production pipe    enjmin 2013 v1.1 hd
Supersize your production pipe enjmin 2013 v1.1 hd
 
Advances in Game AI
Advances in Game AIAdvances in Game AI
Advances in Game AI
 
Making a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie TycoonMaking a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie Tycoon
 
How shit works: the CPU
How shit works: the CPUHow shit works: the CPU
How shit works: the CPU
 
Lagom - Mircoservices "Just Right"
Lagom - Mircoservices "Just Right"Lagom - Mircoservices "Just Right"
Lagom - Mircoservices "Just Right"
 
Gamedev-grade debugging
Gamedev-grade debuggingGamedev-grade debugging
Gamedev-grade debugging
 
Developing Next-Generation Games with Stage3D (Molehill)
Developing Next-Generation Games with Stage3D (Molehill) Developing Next-Generation Games with Stage3D (Molehill)
Developing Next-Generation Games with Stage3D (Molehill)
 

Kürzlich hochgeladen

Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxPurva Nikam
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 

Kürzlich hochgeladen (20)

Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptx
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 

Killzone Shadow Fall: Threading the Entity Update on PS4

  • 1. Killzone Shadow Fall: Threading the Entity Update on PS4 Jorrit Rouwé Lead Game Tech, Guerrilla Games
  • 2. Introduction • Killzone Shadow Fall is a First Person Shooter • PlayStation 4 launch title • In SP up to 60 characters @ 30 FPS • In MP up to 24 players @ 60 FPS • Gameplay logic has lots of • Branches • Virtual functions • Cache misses • Not suitable for PS3 SPU’s but PS4 has 6 x86 cores
  • 3. What do we cover? • What is an Entity? • What we did on PS3 • Multi threading on PS4 • Balancing Entities across frames • Performance issues found • Performance results achieved • Debug tools
  • 4. What is an Entity?
  • 5. What is an Entity? • Base class for most game objects • E.g. Player, Enemy, Weapon, Door • Not used for static world • Has Components • E.g. Model, Mover, Destructibility • Update at a fixed frequency • 15, 30, 60 Hz • Almost everything at 15 Hz • Player updated at higher frequency to avoid lag
  • 6. What is a Representation? • Entities and Components have Representation • Controls rendering, audio and VFX • State is interpolated in frames where entity not updated • Cheaper to interpolate than to update • Introduces latency • Always blend towards last update
  • 9. PS3: 1 Entity = 1 Fiber • Most time spent on PPU • No clear concurrency model • Read partial updated state • Entities deadlock waiting for each other yield fiber resume fiber Time PPU SPU 1 SPU 2 Entity 1 Entity 2 Entity 3 Entity 1 Animation Ray Cast Entity 2
  • 10. PS4: 1 Entity = 1 Job • No fibers • Entity updates as a whole • How to solve race conditions? Time CPU 1 CPU 2 CPU 3 Entity 1 Entity 2 Entity 3 Entity 1Animation Ray Cast Entity 2  job
  • 11. • Make update order explicit: Strong Dependencies Missile MissileMissile LauncherMissile Launcher Soldier Time Soldier strong dependency
  • 12. • No (indirect) dependency = no access • Works two ways: Weapon can access Soldier too • Create dependency has 1 frame latency • Global systems need locks Non-dependent Entities can Execute Concurrently Soldier1 Weapon1 Time Soldier2 Weapon2 CPU 2 CPU 1 Soldier1 Weapon1 Soldier2 Weapon2
  • 13. • A few entities cause huge bottleneck What about this? Time Soldier1 Pistol1 Soldier2 Pistol2 Bullet System Soldier1 Pistol1 Soldier2 Pistol2 Bullet System
  • 14. Non-exclusive Dependencies • Access to ‘Bullet System’ must be lock protected Soldier1 Soldier2 Bullet System non-exclusive Time CPU 2 CPU 1 Soldier1 Pistol1 Soldier2 Pistol2 Bullet System Barrier Pistol1 Pistol2
  • 15. Weak Dependencies • 2 tanks fire at each other: • Update order reversed when circular dependency occurs • Not used very often (< 10 per frame) Time Missile2Missile1 Tank1 Tank2 Missile2Missile1 Tank2 Tank1 Missile1Missile2 or weak strong Tank1 Tank2
  • 16. Non-updating Entities • Entity can skip updates (LOD) • Entity can update in other frame • Do normal scheduling! Entity1 Entity3 Entity1 Entity2 Entity3 Time not updating
  • 17. Summarizing Dependencies Strong Exclusive Weak Exclusive Strong Non-excl. Weak Non-excl. Symbol Two way access ✓ ✓ ✓ ✓ Order guaranteed ✓ ✓ Allow concurrency + + ++ ++ Require lock ✓ ✓
  • 18. Referencing Entities • Dev build: CheckedPtr<Entity> • Acts as normal pointer • Check dependency on dereference • Retail build: Entity * • No overhead! • Doesn’t catch everything • Can use pointers to members • Bugs were easy to find
  • 19. Working with Entities without dependency • ThreadSafeEntityInterface • Mostly read only • Often used state (name, type, position, …) • Mutex per Entity • Send message (expensive) • Processed single threaded when no dependency • Schedule single threaded callback (expensive) • Everything can be accessed
  • 21. Scheduling Algorithm • Entities with exclusive dependencies merged to 1 job • Dependencies determine sorting • Non-exclusive dependencies become job dependencies • Expensive jobs kicked first! Missile Missile Launcher Tank Bullet System Pistol Soldier Missile Tank Bullet System PistolSoldier Job1 Job2 job dependency not updating weak non-exclusive strong
  • 22. Scheduling Algorithm – Edge Case Entity1 Entity3 Entity4Entity2 non-exclusive exclusive job dependency Entity1 Entity2 Entity3 Entity4 Job1 Job2
  • 23. Scheduling Algorithm – Edge Case • Non cyclic dependency becomes cyclic job dependency • Job1 and Job2 need to be merged Entity1 Entity3 Entity4Entity2 non-exclusive exclusive Entity1 Entity2Entity3 Entity4 Job1
  • 25. Balancing Entities Across Frames • Prevent all 15 Hz entities from updating in same frame • Entity can move to other frame • Smaller delta time for 1 update • Keep parent-child linked entities together • Weapon of soldier • Soldier on mounted gun • Locked close combat
  • 26. Balancing Entities – In Action Time in even frame (sum across cores, running average)
  • 27. Balancing Entities – In Action Time in odd frame (should be equal)
  • 28. Balancing Entities – In Action Civilian @ 15Hz update even frame
  • 29. Balancing Entities – In Action Civilian @ 15Hz update odd frame
  • 30. Balancing Entities – In Action Flash on switch odd / even
  • 31. Movie
  • 32. Movie
  • 34. Performance Issues • Memory allocation mutex • Eliminated many dynamic allocations • Use stack allocator • Locking physics world • R/W mutex for main simulation world • Second ‘bullet collision’ broadphase + lock • Large groups of dependent entities • Player update very expensive
  • 35. Cut Scene - Problem • Cut scene entity requires dependencies • 10+ characters in cut scene creates huge job! Civilian 1 Cut Scene Civilian 2 Player Camera
  • 36. Cut Scene - Solution • Create sub cut scenes for non-interacting entities • Master cut scene determines time and flow • Scan 1 frame ahead in timeline to create dependency Civilian 1 Cut Scenenon-exclusive Sub Cut Scene 1 Sub Cut Scene 2 Civilian 2 Sub Cut Scene 3 Player Camera
  • 37. Using an Object • Dependencies on useable objects not possible (too many) • Get list of usable objects • Global system protected by lock • ‘Use’ icon appears on screen • Player selects • Create dependency • Start ‘use’ animation • Start interaction 1 frame later (dependency valid) • Hides 1 frame delay!
  • 38. Grenade • Explosion damages many entities • Creating dependencies not option (too many) • ThreadSafeEntityInterface not an option • Need knowledge of parts • Run line of sight checks inside update • Uses scheduled callback to apply damage
  • 40. 5000 Crates Performance Results - Synthetic 100 Soldiers 500 Flags Level Counts Dependencies Max Entities in Job SpeedupNumber Entities Updating Entities Number Humans Strong Excl Strong Non-Excl 5000 Crates (20 µs each) 5019 5008 1 12 4 13 2.8X 100 Soldiers (700 µs each) 326 214 105 212 204 19 4.2X 500 Flags (160 µs each) 519 508 1 12 4 13 5.2X 6 cores!
  • 41. Performance Results - Levels Level Counts Dependencies Max Entities in Job SpeedupNumber Entities Updating Entities Number Humans Strong Excl Strong Non-Excl The Helghast (You Owe Me) 1141 206 32 71 23 20 4.1X The Patriot (On Vectan Soil) 435 257 44 199 107 15 4.3X The Remains (12p Botzone) 450 128 14 97 44 18 3.7X The Helghast The RemainsThe Patriot
  • 42. Game Frame - The Patriot
  • 43. Game Frame - Global Time 18 ms
  • 44. Game Frame - Global CPU 1 (main thread)
  • 45. Game Frame - Global CPU 2
  • 46. Game Frame - Global CPU 3
  • 47. Game Frame - Global AI Physics Entity Update VFX & Draw Barrier
  • 48. Game Frame – Entity Update Fix Weak Dependency Cycles
  • 49. Game Frame – Entity Update Prepare Jobs
  • 50. Game Frame – Entity Update Link and Order Jobs
  • 51. Game Frame – Entity Update Execute Jobs
  • 52. Game Frame – Entity Update Single Threaded Callbacks
  • 53. Game Frame – Entity Update Player Job
  • 54. Game Frame – Entity Update Player Entity
  • 55. Game Frame – Entity Update Animation
  • 56. Game Frame – Entity Update Capsule Collision
  • 57. Game Frame – Entity Update Player Representation
  • 58. Game Frame – Entity Update OWL (Helper Robot)
  • 59. Game Frame – Entity Update Inventory
  • 60. Game Frame – Entity Update AI Soldier
  • 61. Game Frame – Entity Update Cloth Simulation
  • 62. Game Frame – Entity Update Cut Scene
  • 63. Game Frame – Entity Update Sub Cut Scenes
  • 64. Game Frame – Entity Update Cheap Entities (Destructibles)
  • 66. Debug Tools • Dependencies get complex, we use yEd to visualize!
  • 70. Conclusions • Easy to implement in existing engine • Game programmers can program as if single threaded • Very few multithreading issues