Pixar has scaled its use of Perforce over time to support growing numbers of users, files, and data types for its animated films. It now runs over 90 Perforce servers storing more than 20 TB of data. To manage this scale, the studio uses techniques like virtualization for flexible server provisioning, de-duplication to reduce storage usage, and scripts like Superp4 to automate management of metadata tables across multiple servers. Scaling out has paid off, but it has also introduced challenges around monitoring, performance, and administration across many interconnected systems.
1. Scaling Servers and Storage for Film Assets
Mike Sundy
Digital Asset System Administrator
David Baraff
Senior Animation Research Scientist
Pixar Animation Studios
5. Environment
As of March 2011:
• ~1000 Perforce users (80% of company)
• 70 GB db.have
• 12 million p4 ops per day (on busiest server)
• 30+ VMWare server instances
• 40 million submitted changelists (across all servers)
• On 2009.1, but planning to upgrade to 2010.1 soon
6. Growth & Types of Data
Pixar grew from one code server in 2007 to 90+ Perforce servers storing all types of assets:
• art – reference and concept art; inspirational art for film.
• tech – show-specific data, e.g. models, textures, pipeline.
• studio – company-wide reference libraries, e.g. animation reference, config files, flickr-like company photo site.
• tools – code for our central tools team, software projects.
• dept – department-specific files, e.g. Creative Resources has "blessed" marketing images.
• exotics – patent data, casting audio, data for live-action shorts, story gags, theme park concepts, intern art show.
8. Storage Stats
• 115 million files in Perforce.
• 20+ TB of versioned files.
9. Techniques to Manage Storage
• Use the +S filetype for the majority of generated data. Saved 40% of storage for Toy Story 3 (1.2 TB).
• Work with teams to migrate versionless data out of Perforce. Saved 2 TB by moving binary scene data out.
• De-dupe files – saved 1 million files and 1 TB. (Digest-based duplicate detection is sketched below.)
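The slides don't show the de-dupe tooling itself. As a minimal sketch of digest-based duplicate detection, the script below groups head revisions by content digest using p4's marshaled output (p4 -G); the //tech/... path argument is illustrative, and actually reclaiming the space from the candidates is left out.

import marshal
import subprocess
from collections import defaultdict

def duplicate_digests(depot_path):
    """Group head revisions by content digest; >1 entry = de-dupe candidate."""
    proc = subprocess.Popen(
        ["p4", "-G", "fstat", "-Ol", "-T", "depotFile,headRev,digest",
         depot_path],
        stdout=subprocess.PIPE)
    seen = defaultdict(list)
    while True:
        try:
            rec = marshal.load(proc.stdout)   # one marshaled dict per file
        except EOFError:
            break
        rec = {k.decode(): v.decode() for k, v in rec.items()}
        if "digest" in rec:                   # deleted revs carry no digest
            seen[rec["digest"]].append(
                "%s#%s" % (rec["depotFile"], rec["headRev"]))
    proc.wait()
    return dict((d, revs) for d, revs in seen.items() if len(revs) > 1)

if __name__ == "__main__":
    for digest, revs in duplicate_digests("//tech/...").items():
        print(digest, " ".join(revs))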
14. Scale Up vs. Scale Out
Why did we choose to scale out?
• Shows are self-contained.
• Performance of one depot won't affect another.*
• Easy to browse other depots.
• Easier administration/downtime scheduling.
• Fits with workflow (e.g. no merging art).
• Central code server – share where it matters.
15. Pixar Perforce Server Spec
• VMWare ESX Version 4.
• RHEL 5 (Linux 2.6).
• 4 GB RAM.
• 50 GB "local" data volume (on EMC SAN).
• Versioned files on Netapp GFX.
• 90 Perforce depots on a 6-node VMWare cluster – special 2-node cluster for "hot" tech show.
• For more details, see the 2009 conference paper.
16. Virtualization Benefits
• Quick to spin up new servers.
• Stable and fault-tolerant.
• Easy to remotely administer.
• Cost-effective.
• Reduces datacenter footprint, cooling, power, etc.
17. Reduce Dependencies
• Clone all servers from a VM template.
• RHEL vs. Fedora.
• Reduce triggers to a minimum.
• Default tables, p4d startup options.
• Versioned files stored on NFS.
• VM on a cluster.
• Can build a new VM quickly if one ever dies.
18. Virtualization Gotchas
• Had severe performance problems when one datastore grew to over 90% full.
• Requires some jockeying to ensure load stays balanced across multiple nodes – manual vs. auto.
• Physical host performance issues can cause cross-depot issues.
19. Speed of Virtual Perforce Servers
• Used the Perforce Benchmark Results Database tools.
• Virtualized servers achieved 95% of physical-server performance on the branchsubmit benchmark.
• 85% on the browse benchmark (not as critical to us).
• VMWare flexibility outweighed the minor performance hit.
20. Quick Server Setup
• Critical to be able to quickly spin up new servers.
• Went from 2-3 days for setup to 1 hour.
1-hour Setup
• Clone a p4 template VM. (30 minutes)
• Prep the VM. (15 minutes)
• Run "squire" script to build out the p4 instance. (8 seconds; a sketch of this kind of build-out follows below.)
• Validate and test. (15 minutes)
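The squire script itself isn't shown in the slides, so the following is a hypothetical sketch of the kind of build-out such a script performs, assuming a /p4/<name> directory layout and an NFS archive mount at /net/p4archive (both invented for illustration): create P4ROOT, start p4d with default options, and map the depot's versioned files onto the filer.

import os
import subprocess
import time

def build_instance(name, port, nfs_archive="/net/p4archive"):
    """Lay down P4ROOT, start p4d, and point the depot's archives at NFS."""
    root = "/p4/%s/root" % name                      # hypothetical layout
    os.makedirs(root, exist_ok=True)
    subprocess.run(["p4d", "-r", root, "-p", str(port),
                    "-J", "/p4/%s/journal" % name,   # journal off the db volume
                    "-L", "/p4/%s/log" % name,
                    "-d"], check=True)               # run as a daemon
    time.sleep(2)                                    # give p4d time to listen
    p4 = ["p4", "-p", "localhost:%d" % port]
    # Rewrite the default depot Map so versioned files land on the filer.
    spec = subprocess.check_output(p4 + ["depot", "-o", name]).decode()
    spec = spec.replace("Map:\t%s/..." % name,
                        "Map:\t%s/%s/..." % (nfs_archive, name))
    subprocess.run(p4 + ["depot", "-i"], input=spec.encode(), check=True)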
22. Superp4
Script for managing p4 metadata tables across multiple servers.
• Preferable to hand-editing 90 tables.
• Database-driven (i.e. a list of depots).
• Scopable by depot domain (art, tech, etc.).
• Rollback functionality.
23. Superp4 example
$ cd /usr/anim/ts3
$ p4 triggers -o
Triggers:
	noHost form-out client "removeHost.py %formfile%"

$ cat fix-noHost.py
def modify(data, depot):
    return [line.replace("noHost form-out",
                         "noHost form-in")
            for line in data]

$ superp4 -table triggers -script fix-noHost.py -diff
⢠Copies triggers to restore dir
⢠Runs fix-noHost.py to produce new triggers, for each depot.
⢠Shows me a diff of the above.
⢠Asks confirmation; finally, modifies triggers on each depot.
⢠Tells me where the restore dir is!!
24. Superp4 options
$ superp4 -help
-n                       Don't actually modify data.
-diff                    Show diffs for each depot using xdiff.
-category category       Pick depots by category (art, tech, etc.).
-units unit1 unit2 ...   Specify an explicit depot list (regexp allowed).
-script script           Python file to be execfile()'d; must define a function named modify().
-table tableType         Table to operate on (triggers, typemap, ...).
-configFile configFile   Config file to modify (e.g. admin/values-config).
-outDir outDir           Directory to store working files, and for restoral.
-restoreDir restoreDir   Directory previously produced by running superp4, for when you screw up.
26. Gotchas
• //spec/client filled up.
• User-written triggers were sub-optimal.
• "Shadow files" consumed server space.
• Monitoring is difficult – cue templaRX and mayday.
• Cap renderfarm ops. (A throttling sketch follows below.)
• Beware of automated tests and clueless GUIs.
• verify can be dangerous to your health (cross-depot).
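The slides name the practice of capping renderfarm ops but not the mechanism. One hypothetical approach, sketched below, has each farm job wait for a slot by polling p4 monitor show and counting active commands owned by a dedicated farm account; MAX_FARM_OPS and the "render" user are assumptions, not values from the talk.

import subprocess
import time

MAX_FARM_OPS = 50      # assumed cap, not from the slides
FARM_USER = "render"   # assumed dedicated farm account

def active_farm_ops(port):
    """Count running server commands owned by the farm user."""
    # Requires 'p4 monitor' to be enabled on the server.
    out = subprocess.check_output(
        ["p4", "-p", port, "monitor", "show"]).decode()
    # Lines look like: "8764 R render 00:00:09 sync"
    return sum(1 for line in out.splitlines()
               if line.split()[1:3] == ["R", FARM_USER])

def wait_for_slot(port, poll_seconds=5):
    """Block a farm job until the server has capacity for another op."""
    while active_farm_ops(port) >= MAX_FARM_OPS:
        time.sleep(poll_seconds)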
27. Summary
• Perforce scales well for large amounts of binary data.
• Virtualization = fast and cost-effective server setup.
• Use the +S filetype and de-dupe to reduce storage usage.