PEARC17: Data Access for LIGO on the OSG
1. Data Access for LIGO on the OSG
Derek Weitzel & Brian Bockelman - University of Nebraska – Lincoln
Duncan A. Brown - Syracuse University
Peter Couvares - California Institute of Technology
Frank Würthwein & Edgar Fajardo Hernandez - University of California San Diego
5. LIGO Needs: PyCBC Workflow
• The PyCBC workflow consists of approximately one hundred thousand jobs for each day's worth of recorded LIGO data.
• The total need is driven by both the science (for example, enough data must be analyzed to measure the statistical significance of detection candidates) and the computational aspects of the search.
• The workflows themselves are managed using the Pegasus Workflow Management System.
• The PyCBC workflow requires several terabytes of non-public input data; throughout the analysis, the data may be read up to 200 times.
• Accordingly, the PyCBC pipeline was historically run at sites with a full copy of the LIGO data on a shared filesystem.
Can we get PyCBC running on OSG?
6. File Size & Velocity
• The PyCBC team makes its software independent of the OS environment.
• The PyCBC science payload reads, on average, 1Mbps per core.
• Modest until you run thousands of cores!
• Total data size:
• Observation 1 (O1): 7TB.
• O2: ~3TB so far.
• Jobs require a few hundred MB of common, public calibration data.
• The data will be re-read approximately 200 times, and the set of workflows needed will consume several million CPU hours.
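A back-of-the-envelope check of the aggregate demand implied by these numbers (the 10,000-core estimate comes from later in the deck; all figures are as quoted on the slides):

```python
# Rough aggregate I/O estimate for the PyCBC workload on OSG,
# using the per-core rate, data volume, and re-read factor quoted above.
per_core_mbps = 1          # average read rate per science-payload core
cores = 10_000             # cores estimated to be available to LIGO
dataset_tb = 7             # O1 data volume
rereads = 200              # times the data set is re-read across workflows

aggregate_gbps = per_core_mbps * cores / 1000
total_traffic_pb = dataset_tb * rereads / 1000

print(f"aggregate read rate: {aggregate_gbps:.0f} Gbps")   # 10 Gbps
print(f"total data moved:    {total_traffic_pb:.1f} PB")   # 1.4 PB
```

So at full scale the workload saturates roughly a tenth of a 100Gbps link continuously, and moves over a petabyte in aggregate despite the small 7TB working set.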
7. [Diagram: the HTCondor submit host at Syracuse runs jobs both on the local LIGO pool (SUGAR) and, via pilots, on worker nodes at generic OSG sites; the LIGO Data Replicator transfers data over GridFTP to the HDFS installation at Nebraska, from which jobs read input over GridFTP and XRootD.]
O1 - Implementation
• Used a central repository at Nebraska with very high bandwidth.
• LIGO data was copied to the central repository.
• The submit host submitted to both local and OSG resources.
• The Pegasus runtime managed file downloads.
• The PyCBC executable managed OS heterogeneity.
• A global shared filesystem (CVMFS) distributed calibration data.
8.
O1 - Implementation
• Purposefully simple - wanted to get something running fast!
• The single repository had a 100Gbps connection using GridFTP.
• The volume of data is small (~7TB) compared to the 2.7PB of CMS data stored at the repository.
9.
O1 - Implementation
• Each job needs 1Mbps of the 100Gbps total.
• This setup was expected to scale across the 10,000 cores we estimated could be available to LIGO.
• But we started to see issues with this architecture.
10. O1 – The issues
• Ramp-Up
• GridFTP requires ~128MB of memory per connection due to a per-process Java VM started by the Hadoop HDFS client.
• Transfer nodes could handle the steady state; however, at ramp-up, the OSG started jobs faster than the GridFTP servers could handle.
• Solution:
• Throttle HTCondor job startup to 1.5Hz. This still caused issues at sites with slow TCP connections to Nebraska.
• Developed and deployed a GridFTP extension to throttle connections per user, preventing LIGO from disrupting other users of the Nebraska site.
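A 1.5Hz start rate can be expressed with HTCondor's schedd throttling knobs. The exact settings used are not given in the slides, so this is a sketch (3 starts every 2 seconds ≈ 1.5 jobs/s):

```
# condor_config on the submit host (hypothetical values)
JOB_START_COUNT = 3   # jobs started per scheduling cycle
JOB_START_DELAY = 2   # seconds between cycles -> 3/2 = 1.5 starts per second
```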
11. O1 – The issues
• Scalability
• GridFTP limited how many jobs could start on the OSG.
• Lots of time wasted because of throttled job starts.
• Solution:
• Switched to the XRootD server and protocol. Pegasus makes this easy, as it understands that the same storage can be available via different mechanisms.
• The implementation uses a single process (a single Java VM) with many threads.
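Pegasus's ability to register multiple physical replicas for one logical file is what made the protocol switch cheap. A hypothetical file-based replica catalog entry (hostnames and file name are illustrative, not the project's actual values) might list both endpoints for the same logical file:

```
# Pegasus replica catalog (text format): one LFN, two PFNs.
# Pegasus may fetch via whichever transfer mechanism is available.
H1-FRAME-815411200-4096.gwf gsiftp://gridftp.example.edu/ligo/frames/H1-FRAME-815411200-4096.gwf pool="Nebraska"
H1-FRAME-815411200-4096.gwf root://xrootd.example.edu//ligo/frames/H1-FRAME-815411200-4096.gwf pool="Nebraska"
```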
12. Adding non-OSG resources
• Added TACC's Stampede resource with an allocation award.
• Challenges at Stampede:
• Lack of a global filesystem (CVMFS). Solution: use the venerable `rsync` to copy LIGO software/calibrations to Lustre on Stampede.
• Input data access: external data access for each job will likely not scale. GridFTP copied the entire O1 data set to Stampede.
• Scalable grid interface: found Stampede's Globus GRAM endpoint limited in the number of jobs it could manage. Developed a wrapper script to launch 1024 invocations in a single GRAM/SLURM submission.
17. Securing LIGO data - Authentication
• By default, CVMFS distributes files through an HTTP-based CDN: all files are considered public!
• Seen as acceptable for the original use case of distributing software.
• We enabled “secure-CVMFS” which uses X.509 certificates to authenticate users.
• Authentication happens twice: once to access the worker node's cache, once to access the StashCache CDN if there is a local cache-miss.
• LIGO already uses X.509 certificates with their jobs, so this does not increase the burden on their users.
• Only data is secured with X.509 certificates: the namespace is public (unauthenticated) and distributed with the normal CVMFS CDNs.
• Done primarily for scalability reasons.
• Sensitive “metadata” about contents of data are not encoded into filename or directory structure.