[Pixar] Templar Underminer
Templar Underminer
Templar is Pixar’s asset management system. Its charter is to manage assets and asset
metadata for a 50-year period. The Underminer manages Templar backend depot storage. It
is in production use.
Phase I implements multi-partition storage for Templar Perforce instances. Specifically, some
partitions can be made read-only, enabling the backup team to perform more efficient
backups.
Fundamental Underminer Idea
Perforce will write to the main partition in its usual way. The depot must be configured so that
all files are of file type “+F”.
The underminer will create a directory structure on the target that mirrors the directory
structure on the main partition. Specified files in the main partition will be copied to their
corresponding location on the active target, and corresponding symlinks will replace the files
in the main partition. (Note: we don’t use archive triggers because we’re not interested in replacing
P4’s extensively tested and validated reading/writing code with new code that could
potentially fail in such a way as to result in permanent data loss.)
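The copy-then-symlink step described above can be sketched in Python as follows. This is a minimal illustration and not the Underminer's actual code: the function name `undermine_file`, its arguments, the size check, and the temporary-link-plus-rename step are all assumptions made for the example.

```python
import os
import shutil


def undermine_file(main_root, target_root, rel_path):
    """Copy one depot file to the target partition and leave a symlink behind.

    Hypothetical helper: the name, arguments, and safety checks are
    illustrative assumptions, not the real Underminer interface.
    """
    src = os.path.join(main_root, rel_path)
    dst = os.path.join(target_root, rel_path)

    # Already undermined: the main partition holds a symlink, so do nothing.
    if os.path.islink(src):
        return os.readlink(src)

    # Mirror the directory structure of the main partition on the target.
    os.makedirs(os.path.dirname(dst), exist_ok=True)

    # Copy data and metadata, then sanity-check the copy before touching src.
    shutil.copy2(src, dst)
    if os.path.getsize(src) != os.path.getsize(dst):
        raise OSError("copy of %s is incomplete" % src)

    # Build the symlink under a temporary name and rename it over the
    # original, so the path never disappears from the main partition.
    tmp_link = src + ".underminer-tmp"
    os.symlink(dst, tmp_link)
    os.replace(tmp_link, src)
    return dst
```

Because the symlink replaces the file at the same path, a reader that opens the main-partition path keeps working without knowing the data has moved.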
In Phase I this operation will always be performed independently of any Perforce operation. It
can be done automatically (e.g. via a cron job) or manually (as part of a partition addition).
In Phase I the database tables will be maintained manually.
LINKATRON links will be updated as part of this process.
Note: a LINKATRON overview is available here:
http://www.perforce.com/sites/default/files/harrison-sundy-perforce-pixar-linkatron-paper.pdf
The main partition can consist of:
all files. This is identical to “standard” Perforce storage.
all symlinks. This will be the state following a full undermining of a depot with no
subsequent modifications to the depot.
a mix of files and symlinks. This will be the state following a partial undermining, or of a
full undermining followed by subsequent Perforce depot modifications.
In all cases, Perforce views the depot as a standard depot and is not aware of nor concerned
with the modifications to the underlying storage mechanism.
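To make the three possible states of the main partition concrete, here is a small sketch that walks a directory tree and reports which state it is in. The function name `classify_main_partition` and the returned labels are hypothetical. An empty tree is reported as "all files", which is an arbitrary choice for the example.

```python
import os


def classify_main_partition(main_root):
    """Return 'all files', 'all symlinks', or 'mixed' for a depot tree.

    Hypothetical helper for illustration only.
    """
    files = 0
    links = 0
    for dirpath, _dirnames, filenames in os.walk(main_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                links += 1  # file has been undermined
            else:
                files += 1  # file still lives on the main partition
    if links == 0:
        return "all files"
    if files == 0:
        return "all symlinks"
    return "mixed"
```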
All state-changing operations check to make sure the program has been run as the diskfarm
user.
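A guard of this kind might look like the sketch below. The helper name `require_diskfarm` and the optional `current_user` argument (included only so the check can be exercised without switching accounts) are assumptions. The lookup uses the effective user ID, so it applies to Unix-like systems only.

```python
import os
import pwd

REQUIRED_USER = "diskfarm"


def require_diskfarm(current_user=None):
    """Raise PermissionError unless running as the diskfarm user.

    Hypothetical helper; current_user exists only to make testing easy.
    """
    if current_user is None:
        # Resolve the effective uid to a login name (Unix only).
        current_user = pwd.getpwuid(os.geteuid()).pw_name
    if current_user != REQUIRED_USER:
        raise PermissionError(
            "state-changing operations must run as %r, not %r"
            % (REQUIRED_USER, current_user)
        )
    return current_user
```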
Phases
Phase I – no Perforce dependencies
Phase II – Perforce “create file” event
Phase III – Perforce “file staged” import (mv instead of cp)
Phase I
Database
We will store underminer partition configuration information in the Templar database
(currently in Oracle). Most operations will go from the “main” partition to the “target” partition.
By keeping the configuration in the database, there are far fewer opportunities for mistakes.
Partition Types
main
where Perforce writes the data.
target
writable partition that is the default target for underminer operations.
archived (r/o)
read-only partitions have links pointing to them but (of course) are never written to and do
not need to be traversed looking for files to back up.
writable (r/w)
a read-write partition is a writable partition that is not the active target. It most likely will
exist for a short time when a new active target is specified and the old active target has
not been made r/o. We treat an r/w partition in the same logical manner as an r/o partition
but note that file system modifications can be made by activities outside the underminer.
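The four partition types can be modelled as a small enumeration. The sketch below is illustrative: the class and function names are hypothetical, and treating a "writable" partition as still needing backup traversal is an inference from the note above that outside activity can modify it.

```python
from enum import Enum


class PartitionType(Enum):
    MAIN = "main"          # where Perforce writes the data
    TARGET = "target"      # writable, default target for underminer operations
    ARCHIVED = "archived"  # read-only, only pointed to by links
    WRITABLE = "writable"  # read-write, but not the active target


def underminer_may_write(ptype):
    """Only the active target receives files copied by the underminer."""
    return ptype is PartitionType.TARGET


def backup_must_traverse(ptype):
    """Archived partitions never change, so backups can skip walking them.

    Treating 'writable' as needing traversal is an assumption drawn from the
    note that outside activity can still modify such a partition.
    """
    return ptype is not PartitionType.ARCHIVED
```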
Deliverables
documentation
database schema populated for our various depots
underminer executable as detailed below
crontab configuration for verification
(possible) crontab configuration for automatically updating storage
Issues and Investigation
+S files
need to make sure that as we expire revisions we clean up the storage.
+S obliteration
need to pester p4 for the obliterate trigger. as a practical matter we don’t do much
obliteration though.
+S deduplication
how does this affect our backend dedupe? can we re-dedupe once the files have been
copied to the target partition?
Perforce Features
Need to show this to the Perforce guys, get their feedback, and see what we
need to do to get the Phase II and III dependencies on their schedule.
Reporting
These are things which are useful to report from the underminer command line.
database dump
show configuration for a particular depot
show status for a particular depot or p4spec (e.g. what files have been undermined,
etc.)
Command Line
underminer [-n] cmd [-a] //p4path ...
Operation will be similar to p4 files. Only the head revision will be used unless the -a
flag is specified. P4 paths may be repeated. All path resolution will be performed via p4
files.
“-n” means “no, don’t do it”. The work that would have been done is displayed instead.
Commands:
stdmove
perform a “standard” move. This will move all files added or modified since the last stdmove
operation. The changelist of the last stdmove operation is stored in the database and will
be updated upon successful completion of this command.
move
move the files specified in the //p4paths from the main partition to the active target. (We
may allow specification of -from and -to flags for testing and balancing of partitions.)
dbdump
dump the state of the underminer configuration. If no depots are specified, all depots will
be dumped.
report
report on the status of the specified //p4spec.
spaceused
report on space used for all partitions.
verifyrw
verify the read/write and read/only status of all partitions.
dbverify
verify that the database specification is consistent for the specified repositories.
verify
verify that all files specified in the //p4files are functional.
inconsistencies
report “harmless” inconsistencies that would not affect correct operation.
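The command-line shape `underminer [-n] cmd [-a] //p4path ...` could be parsed with Python's standard `argparse` module, as in the sketch below. The command names are taken from the list above. The attribute names `dry_run`, `all_revisions`, and `paths` are hypothetical, and for simplicity every command accepts `-a` and a list of paths, which the real tool may not do.

```python
import argparse

COMMANDS = ["stdmove", "move", "dbdump", "report", "spaceused",
            "verifyrw", "dbverify", "verify", "inconsistencies"]


def build_parser():
    """Build a parser for: underminer [-n] cmd [-a] //p4path ..."""
    parser = argparse.ArgumentParser(prog="underminer")
    parser.add_argument("-n", dest="dry_run", action="store_true",
                        help="show the work that would be done, but do not do it")
    subparsers = parser.add_subparsers(dest="cmd", required=True)
    for name in COMMANDS:
        sub = subparsers.add_parser(name)
        sub.add_argument("-a", dest="all_revisions", action="store_true",
                         help="use all revisions instead of only the head")
        sub.add_argument("paths", nargs="*", metavar="//p4path",
                         help="Perforce paths, may be repeated")
    return parser
```

For example, `build_parser().parse_args(["-n", "move", "//depot/a/..."])` yields a namespace with `dry_run` set, `cmd` equal to `"move"`, and one path.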
Database Dump
Here’s what the database dump looks like. Without going into details on the schema, it should
give you a general idea of how a depot is configured.
$ ./underminer dbdump
---
('markive', '001', 'main', '/pixar/templar/markive/markive/rep')
('markive', '002', 'target', '/pixar/templar/markive-002/markive/rep')
('markive', '003', 'writable', '/pixar/templar/markive-003/markive/rep')
---
('main', 'the main extent that p4 knows about')
('archived', 'an archived, non-writable extent')
('writable', 'not active, presumably will be archived soon')
('target', 'the extent to which p4 data is being copied')
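Since each row of the dump is printed as a Python tuple literal, the output can be read back with `ast.literal_eval`. The sketch below is one way to do that. The field names (`depot`, `number`, `ptype`, `path`) are guesses based on the sample rows, not the actual schema, and rows are told apart purely by their length.

```python
import ast
from typing import NamedTuple


class Partition(NamedTuple):
    depot: str   # assumed meaning of the first column
    number: str  # assumed partition sequence number
    ptype: str   # main, target, writable, or archived
    path: str    # root of the depot files on that partition


def parse_dbdump(text):
    """Split dbdump output into partition rows and type descriptions."""
    partitions = []
    type_descriptions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("("):
            continue  # skip the command line and '---' separators
        row = ast.literal_eval(line)
        if len(row) == 4:
            partitions.append(Partition(*row))
        elif len(row) == 2:
            type_descriptions[row[0]] = row[1]
    return partitions, type_descriptions
```

With the parsed rows in hand, finding the active target for a depot is a matter of filtering on `ptype == "target"`.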
Notes for future work
If the decision is made to back up thumbnail partitions (perhaps because of transcoded
video), the thumbnail partitions should likewise be undermined.