We provide an overview of the Globus platform and demonstrate several of its data management features. In this introductory session, suitable for new users, we use the Globus web app to show data transfer and sharing, demonstrate Globus Connect Personal for laptop and desktop access, and introduce the Globus Command Line Interface for interactive use and scripting.
This material was presented at the Research Computing and Data Management Workshop, hosted by Rensselaer Polytechnic Institute on February 27-28, 2024.
2. Globus is …
a non-profit service
developed and operated by
3. Globus Platform for Research IT
Managed transfer & sync
Collaborative data sharing
Unified data access
Publication & discovery
Reliable automation
Platform-as-a-Service
Managed remote execution
Software-as-a-Service
4. One service, many interfaces
GET /endpoint/go%23ep1
PUT /endpoint/vas#my_endpt
200 OK
X-Transfer-API-Version: 0.10
Content-Type: application/json
…
Globus service
Web
CLI
REST API
Flows
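The request lines on this slide show an endpoint name containing `#` written as `%23` in the URL path. The following is a minimal sketch of that encoding step using only the Python standard library. The helper name `endpoint_path` is hypothetical and is not part of any Globus client; it only illustrates why the `#` has to be percent-encoded.

```python
from urllib.parse import quote, unquote


def endpoint_path(endpoint_name: str) -> str:
    """Build an '/endpoint/<name>' path with the name percent-encoded.

    A literal '#' in a URL starts the fragment, so it has to travel
    as '%23' when it is part of the path. (Hypothetical helper.)
    """
    # safe="" also encodes '/', so the name always stays one path segment.
    return "/endpoint/" + quote(endpoint_name, safe="")


if __name__ == "__main__":
    path = endpoint_path("go#ep1")
    print(path)  # /endpoint/go%23ep1
    # Decoding the last segment recovers the original name.
    print(unquote(path.rsplit("/", 1)[1]))  # go#ep1
```

The web app, CLI and other interfaces would normally take care of this encoding; it matters mainly when building REST requests by hand.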
5. Fast, reliable file transfer …from any to any system
Step 1: User-initiated or automated transfer request
Step 2: Globus transfers files reliably, securely
Step 3: Optional notifications
Instrument, lab server, compute facility
Globally accessible multi-tenant service
• Fire-and-forget transfers/sync
• Optimized speed
• Assured reliability
• Unified view of storage
• HTTP/S access to data
9. Endpoints, Collections and Globus Connect
• Globus Connect Server
– for multi-user Linux systems
docs.globus.org/globus-connect-server
• Globus Connect Personal
– for personal workstations and laptops
globus.org/globus-connect-personal
docs.globus.org/how-to
10. Collections for data access
• Directly addressable entities
• Bulk data access (via Globus transfer service)
• HTTP/S access directly from collection
• Connected to a storage system, and policy managed by institution
Mapped Collections
14. Move without (worrying about) limits
• API request rates
• File size
• Data volume
• Third-party tools cannot circumvent…
• …but Globus lets you “fire-and-forget”
• → it will (eventually) be done
16. Best practices for data transfer
• Submit all data in a single task
– A smaller number of large tasks gives the best performance
• Choose sync options carefully
– Checksum sync has the most overhead
• Filters are applied separately for
– Listing
– Transfer
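To illustrate why checksum sync carries the most overhead, here is a minimal local sketch of a per-file sync decision. It is not how Globus implements sync; the function names and the level names (`exists`, `size`, `mtime`, `checksum`) are assumptions chosen for illustration. The point is that the cheaper levels only look at file metadata, while the checksum level has to read every byte on both sides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks; this reads every byte, which is the costly part."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(src: Path, dst: Path, level: str) -> bool:
    """Return True if src should be copied over dst (hypothetical helper).

    The level names are illustrative assumptions, ordered from cheapest
    to most expensive: exists, size, mtime, checksum.
    """
    if not dst.exists():
        return True  # every level copies a missing file
    if level == "exists":
        return False  # destination is present, nothing else is checked
    if level == "size":
        return src.stat().st_size != dst.stat().st_size
    if level == "mtime":
        return src.stat().st_mtime > dst.stat().st_mtime
    if level == "checksum":
        return sha256_of(src) != sha256_of(dst)
    raise ValueError(f"unknown sync level: {level}")
```

Two files of equal size but different content pass the `size` check and are only caught by `checksum`, which is the trade-off behind choosing sync options carefully.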
17. Secure data sharing …from any storage
Step 1: Select files to share, select user or group, and set access permissions
Step 2: Collaborator logs into Globus and accesses shared files; no local account required; download via Globus
On-prem or public cloud storage
Globally accessible multi-tenant service
Globus controls access to shared files on existing storage
Laptop, server, compute facility
• Fine-grained access control “overlay” on storage system
• Share with any identity, email, group
• No need to stage data just for sharing
18. Collections for data access and sharing
• Directly addressable entities
• Bulk data access (via Globus transfer service)
• HTTP/S access directly from collection
• Connected to a storage system, and policy managed by institution
• Guest collections include collaborative data sharing
Guest Collections
Mapped Collections
19. Data sharing permissions management
• Permissions are set per folder, on a guest collection
• Permissions management can be automated
• For a user
– Identity: user must log in with this
– Email: user gets a code via email; link to their Globus Account
• For a group
– Group UUID: search for group to get UUID
– Access governed by membership in the group
• For an application
– Application identity: appclientid@clients.auth.globus.org
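Since permissions management can be automated, the rules above can be sketched as a small data structure. This is a standard-library illustration only, not a Globus client: the `AccessRule` class is hypothetical, and its field names (`principal_type`, `principal`, `path`, `permissions`) and the `r`/`rw` values are assumptions that should be checked against the Globus documentation. Only the application identity format comes from the slide.

```python
from dataclasses import asdict, dataclass

# Domain taken from the slide: appclientid@clients.auth.globus.org
CLIENT_IDENTITY_DOMAIN = "clients.auth.globus.org"


def application_identity(client_id: str) -> str:
    """Return the identity username for an application's client ID."""
    return f"{client_id}@{CLIENT_IDENTITY_DOMAIN}"


@dataclass(frozen=True)
class AccessRule:
    """One permission on a guest collection (hypothetical structure)."""

    principal_type: str  # assumed values: "identity" or "group"
    principal: str       # opaque ID of the user, application, or group
    path: str            # folder the rule applies to
    permissions: str     # assumed values: "r" (read) or "rw" (read/write)

    def __post_init__(self) -> None:
        if self.principal_type not in ("identity", "group"):
            raise ValueError(f"unsupported principal type: {self.principal_type}")
        if self.permissions not in ("r", "rw"):
            raise ValueError(f"unsupported permissions: {self.permissions}")
        # Permissions are set per folder, so require a folder-style path.
        if not (self.path.startswith("/") and self.path.endswith("/")):
            raise ValueError("path must be a folder such as '/shared/'")


if __name__ == "__main__":
    # Placeholder IDs; real rules would use IDs looked up in Globus.
    rule = AccessRule("group", "hypothetical-group-uuid", "/shared/", "r")
    print(asdict(rule))
    print(application_identity("appclientid"))
```

Validating the folder path up front mirrors the point that permissions apply per folder rather than per file.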
21. Data sharing roles management
• Roles can be used to grant rights to other users, groups or applications
• Roles for management of guest collection
– Administrator
– Access manager
• Roles for management of activity on guest collection
– Activity Manager
– Activity Monitor
22. Let’s try it…
• Discover collections
• Create guest collection
• Set permissions
• Set Roles
• Console for management (admin)
Tutorial cheatsheet: bit.ly/gw-tut-rpi
23. Globus core security features
• Access Control
– Identities provided and managed by institution
– Institution controls all access policies
– Globus is identity broker; no access to/storage of user credentials
• Data remain at institutions, no storage/routing via Globus
• Integrity checks of transferred data
• Enforced encryption of Globus control data
• Institution-configured encryption of user data in transit
24. Globus High Assurance for managing protected data
Restricted data handling
→ PHI, PII, CUI
→ Compliant data sharing
Security controls
→ NIST 800-53
→ 800-171 Low+
BAA w/UChicago
→ UChicago BAA with Amazon