Scientists have embraced the use of specialized cloud-hosted services to perform data management operations. Globus offers a suite of data and user management capabilities to the community, encompassing data transfer and sharing, user identity and authorization, and data publication. Globus capabilities are accessible via both a web browser and REST APIs. Web access allows Globus to address the needs of research labs through a software-as-a-service model; the newer REST APIs address the needs of developers of research services, who can now use Globus as a platform, outsourcing complex user and data management tasks to Globus cloud-hosted services. Here we review Globus capabilities and outline how it is being applied as a platform for scientific services. Presentation by Steve Tuecke from The University of Chicago. Steve is Globus Founder and Project Lead.
2. Cloud has transformed how software and platforms are delivered
Infrastructure as a service: IaaS
Platform as a service: PaaS
Software as a service: SaaS
PaaS enables more rapid, cheap, and scalable delivery of powerful (SaaS) apps (web & mobile apps)
3. Globus and the research data lifecycle
Step 1: Researcher initiates transfer request, or transfer is requested automatically by a script or science gateway
Step 2: Globus transfers files reliably and securely
Step 3: Researcher selects files to share, selects user or group, and sets access permissions
Step 4: Globus controls access to shared files on existing storage; no need to move files to cloud storage!
Step 5: Collaborator logs in to Globus and accesses shared files; no local account required; download via Globus
Step 6: Researcher assembles data set and describes it using metadata (Dublin Core and domain-specific)
Step 7: Curator reviews and approves; data set is published on campus or other system
Step 8: Peers and collaborators search and discover datasets; transfer and share using Globus
[Diagram labels: Instrument, Compute Facility, Personal Computer, Publication Repository; stages: Transfer, Share, Publish, Discover]
• Access via web browser or command line
• Use any storage system
• Use existing identity
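The sharing step in the lifecycle above (select a path, a user or group, and access permissions) can be sketched as one access rule document. This is a minimal sketch, not the Globus implementation: the field names (`DATA_TYPE`, `principal_type`, `principal`, `path`, `permissions`) are assumed to follow the access document described at docs.globus.org/api/transfer and should be checked there; the helper `build_access_rule`, the placeholder identity ID, and the path are hypothetical. Nothing is sent over the network.

```python
import json

# This sketch covers only the two cases named on the slide: a user or a group.
VALID_PRINCIPAL_TYPES = {"identity", "group"}
VALID_PERMISSIONS = {"r", "rw"}


def build_access_rule(principal_type, principal, path, permissions="r"):
    """Build an access rule document for a shared endpoint (hypothetical helper)."""
    if principal_type not in VALID_PRINCIPAL_TYPES:
        raise ValueError(f"unsupported principal type: {principal_type}")
    if permissions not in VALID_PERMISSIONS:
        raise ValueError(f"unsupported permissions: {permissions}")
    # Assumption: rules target directories, written as absolute paths ending in "/".
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must be absolute and end with '/'")
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        "principal": principal,
        "path": path,
        "permissions": permissions,
    }


# Placeholder identity ID and path, for illustration only.
rule = build_access_rule(
    "identity", "00000000-0000-0000-0000-000000000000", "/projects/run42/"
)
print(json.dumps(rule, indent=2))
```

Defaulting to read-only permissions keeps the sketch on the safe side; write access has to be requested explicitly.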
6. Publish peer reviewed paper data
• Curation workflow: review, update, resubmit cycle
• Persistent identifier: DOI
• Describe…
– Dublin Core metadata
– Domain metadata
– Provenance info
– ...
• (Re)format…
– PDF/A
– HDF
– …
• Community and public repositories
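The "Describe" step combines Dublin Core metadata with domain-specific metadata. A minimal sketch of how a record might be split into those two parts, assuming only the fifteen standard Dublin Core element names; the function `split_metadata`, the sample values, and the field `grid_resolution_km` are hypothetical, and this is not the Globus publication schema.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def split_metadata(record):
    """Separate Dublin Core fields from domain-specific ones (hypothetical helper)."""
    core = {k: v for k, v in record.items() if k in DUBLIN_CORE_ELEMENTS}
    domain = {k: v for k, v in record.items() if k not in DUBLIN_CORE_ELEMENTS}
    return core, domain


# Sample record with made-up values, for illustration only.
record = {
    "title": "Example simulation output",
    "creator": "A. Researcher",
    "date": "2016-01-01",
    "format": "HDF",
    "grid_resolution_km": 25,  # hypothetical domain-specific field
}
core, domain = split_metadata(record)
print(sorted(core))
print(domain)
```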
7. Why “outsource” RDM to Globus?
• Simplicity
– Consistent UI across systems
– Easy access to collaborators
• Reliability and performance
– “Fire-and-forget” file transfer
– Maximized WAN throughput
• Operational efficiency
– Low overhead SaaS model
– Highly automatable: CLI, RESTful API
• Access to a large and growing community
8. How can I use Globus with my storage system?
Globus Connect
9. Globus Connect Personal
• Installers do not require admin access
• Zero configuration; auto updating
• Handles NATs
10. Globus Connect Server
• Create endpoint on practically any filesystem
• Enable access for all users with local accounts
• Native packages: RPMs and DEBs
[Diagram labels: Local system users; Local Storage System (HPC cluster, campus server, …); DTN running Globus Connect Server (MyProxy CA, GridFTP Server, OAuth Server)]
13. Platform Questions
• How do you leverage Globus services in your own applications?
• How do you extend Globus with your own services?
• How do we empower the research community to create an integrated ecosystem of services and applications?
20. Globus Auth
• Foundational identity and access management (IAM) platform service
• Simplify creation and integration of advanced apps and services
• Brokers authentication and authorization interactions between:
– end-users
– identity providers: InCommon, XSEDE, Google, portals
– services: resource servers with REST APIs
– apps: web, mobile, desktop, command line clients
– services acting as clients to other services
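The brokering role listed above starts, for a web app, with sending the end-user to log in. A minimal sketch of that first step, assuming Globus Auth follows the standard OAuth 2.0 authorization code flow and that the authorize endpoint is `https://auth.globus.org/v2/oauth2/authorize` (both assumptions to verify against docs.globus.org); the function `build_login_url`, the client ID, and the redirect URI are hypothetical.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed authorize endpoint; confirm against the Globus Auth documentation.
AUTHORIZE_URL = "https://auth.globus.org/v2/oauth2/authorize"


def build_login_url(client_id, redirect_uri, scopes, state=None):
    """Return (url, state) for an OAuth 2.0 authorization code request."""
    # The state value lets the app match the callback to this login attempt.
    state = state or secrets.token_urlsafe(16)
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
        "response_type": "code",
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}", state


# Hypothetical client registration values, for illustration only.
url, state = build_login_url(
    client_id="my-app-client-id",
    redirect_uri="https://example.org/callback",
    scopes=["openid", "email"],
)
print(url)
```

After the user authenticates with an identity provider, the app would receive a code at the redirect URI, compare the returned state with the stored one, and exchange the code for tokens.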
21. Log in with Globus
• Similar to: “Log in with Google”, “Log in with Facebook”
• Using existing identities
• Providing access to community services
22. Protect all REST API communications
• App to Globus services
• App to non-Globus services
• Service to service
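One common way to protect REST calls like those listed above is to attach an access token to each request as an OAuth 2.0 bearer token over HTTPS. A minimal sketch under that assumption; the function `authorized_request`, the service URL, and the token value are hypothetical, and the request is only constructed, not sent.

```python
import urllib.request


def authorized_request(url, access_token, method="GET"):
    """Build a request carrying a bearer token (hypothetical helper)."""
    # Tokens must never travel over plain HTTP.
    if not url.startswith("https://"):
        raise ValueError("refusing to attach a token to a non-HTTPS URL")
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Bearer {access_token}")
    return req


# Placeholder service URL and token, for illustration only.
req = authorized_request("https://service.example.org/api/items", "EXAMPLE_TOKEN")
print(req.get_method(), req.full_url)
```

The same pattern applies whether the caller is an app or a service acting as a client to another service; only the token differs.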
23. Globus Transfer API
• Nearly all Globus Web App functionality implemented via public Transfer API
– File and folder management, transfer, sharing, and sync
docs.globus.org/api/transfer
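A transfer submitted through the Transfer API is described by a JSON document. A minimal sketch of building one, assuming the base URL `https://transfer.api.globusonline.org/v0.10` and the field names of the transfer document as described at docs.globus.org/api/transfer (verify both there); the helper `build_transfer_document`, the endpoint IDs, the submission ID, and the paths are hypothetical placeholders. The document is only built and printed, not submitted.

```python
import json

# Assumed base URL; confirm against docs.globus.org/api/transfer.
TRANSFER_API_BASE = "https://transfer.api.globusonline.org/v0.10"


def build_transfer_document(submission_id, source_endpoint,
                            destination_endpoint, items, label=None):
    """Build a transfer document from (source, destination, recursive) tuples."""
    if not items:
        raise ValueError("at least one transfer item is required")
    doc = {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                "recursive": recursive,
            }
            for src, dst, recursive in items
        ],
    }
    if label is not None:
        doc["label"] = label
    return doc


# Placeholder IDs and paths, for illustration only.
doc = build_transfer_document(
    submission_id="SUBMISSION-ID-FROM-API",
    source_endpoint="SOURCE-ENDPOINT-UUID",
    destination_endpoint="DEST-ENDPOINT-UUID",
    items=[("/data/run42/", "/archive/run42/", True)],
    label="run42 archive",
)
submit_url = f"{TRANSFER_API_BASE}/transfer"
print(submit_url)
print(json.dumps(doc, indent=2))
```

In practice the document would be sent as an authenticated POST to the submit URL; the service then manages the transfer asynchronously.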
25. HTTPS support (coming soon)
• Synchronous alternative to GridFTP
• Same fine-grained access control model
• Greatly simplified sharing/transfer of “small” datasets
• Standard browser behaviors
• Integration with clients and web apps to further leverage existing research storage systems
26. Globus sustainability model
• Standard Subscription
– Shared endpoints
– Data publication
– HTTPS support*
– Management console
– Usage reporting
– Priority support
– Application integration
• Branded Web Site
• Premium Storage Connectors
– Amazon S3, Ceph, HPSS, Spectra, Google Drive, Box*, HDFS*
• Alternate Identity Provider (InCommon is standard)
*Coming soon
27. Thank you to our users...
• 5 major services
• 13 national labs use Globus
• 280 PB transferred
• 10,000 active endpoints
• 47 Bn files processed
• 60,000 registered users
• 99.5% uptime
• 65+ institutional subscribers
• 1 PB largest single transfer to date
• 3 months longest continuously managed transfer
• 300+ federated campus identities
• 10,000 active users/year
29. Join the Globus community
• Access the service: globus.org/login
• Create a personal endpoint: globus.org/app/endpoints/create-gcp
• Documentation: docs.globus.org
• Engage: globus.org/mailing-lists
• Subscribe: globus.org/subscriptions
• Need help? support@globus.org
• Follow us: @globusonline
30. Thank you to our sponsors!
U.S. Department of Energy
Editor's notes
Abstract: Globus Auth is a foundational identity and access management platform service designed to address unique needs of the science and engineering community. It serves to broker authentication and authorization interactions between end-users, identity providers, resource servers (services), and clients (including web, mobile, and desktop applications, and other services). Globus Auth thus makes it easy, for example, for a researcher to authenticate with one credential, connect to a specific remote storage resource with another identity, and share data with colleagues based on their global identity. By eliminating friction associated with the frequent need for multiple accounts, identities, credentials, and groups when using distributed cyberinfrastructure, Globus Auth streamlines the creation, integration, and use of advanced research services. Here we introduce Globus Auth by describing how it can be used by a real research service, the Research Data Archive of the National Center for Atmospheric Research, to enhance both delivered capabilities and user experience.
Support all POSIX-compliant filesystems out of the box
We have been adding premium connectors
Makes the underlying storage system look just like any other Globus endpoint
Enables full Globus transfer and sharing capabilities on the endpoint
Continue to add these based on user demand and funding availability
Latest one is Google Drive
Initial development funded by LBNL; Fully supported by Globus team
We are using this approach to build out more connectors; exploring options to develop connectors for Box, Microsoft, etc.