SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Ian Foster
Computation Institute
Argonne National Lab & University of Chicago
Services for Science
2
Thanks!
DOE Office of Science
NSF Office of Cyberinfrastructure
National Institutes of Health
Colleagues at Argonne, U.Chicago, USC/ISI,
OSU, Manchester, and elsewhere
3
Scientific
Communication, ~1600
Brahe Kepler
4
1980
5
Scientific Communication, ~2000
Data ArchivesData Archives
User
Analysis toolsAnalysis tools
Gateway
Figure: S. G. Djorgovski
Discovery toolsDiscovery tools
Service-Oriented Science
6
Application Scenario
Location A
Microarray, Protein,
Image data
Location B
Microarray, Protein,
Image data
Location C
Microarray, Protein,
Image data
Location C
Image Analysis
Location D
Image Analysis
Microarray and protein
databases at other institutions
Different database
systems, data
representations,
security
Different
program
invocation,
remote access,
data transfer
7
caBIG: sharing of infrastructure, applications, and
data.
Data
Integration!
Services
& Cancer Biology
Globus
8
Service-Oriented Science
People create services (data, code, instr.) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
 I find “someone else” to host services,
so I don’t have to become an expert in
operating services & computers!
 I hope that this “someone else” can
manage security, reliability, scalability, …
!!
“Service-Oriented Science”, Science, 2005
9
Creating Services
People create services (data, code, instr.) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
 I find “someone else” to host services,
so I don’t have to become an expert in
operating services & computers!
 I hope that this “someone else” can
manage security, reliability, scalability, …
!!
“Service-Oriented Science”, Science, 2005
10
Anatomy of a Service
op1 opN (meta)data
Implementation(s)
Clients
Registry
Management
Clients
…
Service
Service
Attribute
Authority
Attribute
Authority
Persistence
11
Creating Services (~2005)
“This full-day tutorial provides an
introduction to programming Java
services with the latest version of the
Globus Toolkit version 4 (GT4). The
tutorial teaches how to build a Java
Service that makes use of GT4
mechanisms for state management,
security, registry and related topics.”
12
Appln
Service
Create
Index
service
Store
Repository
Service
Advertize
Discover
Invoke;
get results
Introduce
Container
Transfer
GAR
Deploy
Ohio State University and Argonne/U.Chicago
Creating Services in 2008
Introduce and gRAVI
Introduce
Define service
Create skeleton
Discover types
Add operations
Configure security
Grid Remote
Application
Virtualization
Infrastructure
Wrap executables
Globus
Demonstration:
Creating Services
Introduce + gRAVI
Shannon Hastings
Scott Oster
David Ervin
Stephen Langella
Kyle Chard
Ravi Madduri
14Center for Enabling Distributed Petascale Science
Workflow Automation at DOE Facilities
Automation
Reproducibility
Security
Reusability
Storage
Metadata
Analysis
Visualization
Advanced Photon
Source
15
Discovering Services
People create services (data or functions) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
 I find “someone else” to host services,
so I don’t have to become an expert in
operating services & computers!
 I hope that this “someone else” can
manage security, reliability, scalability, …
!!
“Service-Oriented Science”, Science, 2005
16
 The ultimate arbiter?
 Types, ontologies
 Can I use it?
 Billions of services
Discovering Services
Assume success
Syntax,
semantics
Permissions
Reputation
A B
17
Discovery (1):
Registries
Globus
18
Discovery (2):
Standardized Vocabularies
Core Services
Grid
Service
Uses Terminology
Described In
Cancer Data
Standards
Repository
Enterprise
Vocabulary
Services
References
Objects
Defined in
Service
Metadata
Publishes
Subscribes to
and Aggregates
Queries
Service
Metadata
Aggregated In
Registers To
Discovery
Client API
Index
Service
Globus
19
20
Discovery (3): Tagging
& Social Networking
GLOSS:
Generalized
Labels Over Scientific
data Sources
(Foster, Nestorov)
21
Discovery (3): Tagging
& Social Networking
David de Roure,
Carole Goble,
et al.
22
Composing Services
People create services (data or functions) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
 I find “someone else” to host services,
so I don’t have to become an expert in
operating services & computers!
 I hope that this “someone else” can
manage security, reliability, scalability, …
!!
“Service-Oriented Science”, Science, 2005
23
Composing Services:
E.g., BPEL Workflow System
Data Service
@ uchicago.edu
Analytic service
@ osu.edu
Analytic service
@ duke.edu
<BPEL
Workflow
Doc>
<Workflow
Inputs>
<Workflow
Results>
BPEL
Engine
link
caBiG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al.
link
link
link
See also Kepler & Taverna
Globus
24
Composing Services:
Taverna
caGrid Scavenger with
semantic/metadata-
based caGrid service query
A sample
caGrid
workflow
Globus
25
Composing
Services
Globus
Demonstration:
Composing Services
Taverna + GT4
Taverna team
Wei Tan
Ravi Madduri
27
Publishing Services
People create services (data or functions) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
 I find “someone else” to host services,
so I don’t have to become an expert in
operating services & computers!
 I hope that this “someone else” can
manage security, reliability, scalability, …
!!
“Service-Oriented Science”, Science, 2005
28
Publishing Services
Description  Syntax, semantics
State  Availability, load, …
Policies  Who, what, when, …
Hosting  Location, scalability, …
29
Authorization: SAML & XACML
VOMS Shibboleth LDAP PERMIS…
GT4 Client
GT4 Server
PDP
Attributes
Authorization
Decision
PIP PIP PIP
SAML
XACML
Globus
30
Hosting Services
People create services (data or functions) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
 I find “someone else” to host services,
so I don’t have to become an expert in
operating services & computers!
 I hope that this “someone else” can
manage security, reliability, scalability, …
!!
“Service-Oriented Science”, Science, 2005
31
The Two Dimensions
of Service-Oriented Science
Decompose across network
Clients integrate dynamically
Select & compose services
Select “best of breed” providers
Publish result as new services
Decouple resource & service providers
Function
Resource
Data Archives
Analysis tools
Discovery toolsUsers
Fig: S. G. Djorgovski
32
The geWorkbench/caGrid/TeraGrid
Interface
33
Putting It Together for the Example Scenario
Location A
Microarray, Protein,
Image data
Location B
Microarray, Protein,
Image data
Location C
Microarray, Protein,
Image data
Location C
Image Analysis
Location D
Image Analysis
caGrid Service
Interfaces
caGrid
Environ-
ment
Registered
Object
Definitions
Advertise-
ment
Log on, Grid
credentials
Query and
Analysis
Workflow
Discovery
Microarray & protein
databases at other
institutions
34
Lessons Learned
A convenient higher-level abstraction
Suitable for a subset of scientific use cases
Infrastructure need to be sustainable
Integrates well with hospital/cancer
center/experimental facility IT
infrastructure
Workflows are attractive to users
Scalability and provenance are important
No vendor lock-in (if you are careful)
User experience remains ambiguous
Early adopters are enthusiastic (50+ services)
35
Services for Science
A new approach to communicating
A (not-so new) approach to structuring systems
They’re real
Excellent infrastructure and tools (Globus,
Introduce, gRAVI, Taverna, Swift, etc., etc.)
Substantial numbers of services out there
They’re challenging
Sociology: incentives, rewards
Infrastructure: hosting
Provenance: justifying “results”
Scaling: services, requests

Weitere ähnliche Inhalte

Ähnlich wie Services for Science v2 (APAN26)

Services For Science April 2009
Services For Science April 2009Services For Science April 2009
Services For Science April 2009
Ian Foster
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect World
Vital.AI
 
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Ian Foster
 
Semantic Sensor Service Networks
Semantic Sensor Service NetworksSemantic Sensor Service Networks
Semantic Sensor Service Networks
PayamBarnaghi
 

Ähnlich wie Services for Science v2 (APAN26) (20)

Services for Science
Services for ScienceServices for Science
Services for Science
 
Services For Science April 2009
Services For Science April 2009Services For Science April 2009
Services For Science April 2009
 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
 
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication Repositories
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect World
 
Alitora Innovation Networks
Alitora Innovation NetworksAlitora Innovation Networks
Alitora Innovation Networks
 
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
 
GlobusWorld 2019 Opening Keynote
GlobusWorld 2019 Opening KeynoteGlobusWorld 2019 Opening Keynote
GlobusWorld 2019 Opening Keynote
 
2016 05 sanger
2016 05 sanger2016 05 sanger
2016 05 sanger
 
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
Science Services and Science Platforms: Using the Cloud to Accelerate and Dem...
 
Biocatalogue Talk Slides
Biocatalogue Talk SlidesBiocatalogue Talk Slides
Biocatalogue Talk Slides
 
Physion.PDF
Physion.PDFPhysion.PDF
Physion.PDF
 
Semantic Sensor Service Networks
Semantic Sensor Service NetworksSemantic Sensor Service Networks
Semantic Sensor Service Networks
 
20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago20120419 linkedopendataandteamsciencemcguinnesschicago
20120419 linkedopendataandteamsciencemcguinnesschicago
 
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationThe Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
 
Test trend analysis: Towards robust reliable and timely tests
Test trend analysis: Towards robust reliable and timely testsTest trend analysis: Towards robust reliable and timely tests
Test trend analysis: Towards robust reliable and timely tests
 
Data Science: Philosopher's Stone
Data Science: Philosopher's StoneData Science: Philosopher's Stone
Data Science: Philosopher's Stone
 
B040101007012
B040101007012B040101007012
B040101007012
 
Cloud com foster december 2010
Cloud com foster december 2010Cloud com foster december 2010
Cloud com foster december 2010
 

Mehr von Ian Foster

Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptx
Ian Foster
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
Ian Foster
 

Mehr von Ian Foster (20)

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptx
 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, Evolution
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the Continuum
 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart Instruments
 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and Computation
 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptx
 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental Science
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven Discovery
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and Jupyter
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon Summary
 
Thoughts on interoperability
Thoughts on interoperabilityThoughts on interoperability
Thoughts on interoperability
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture Ideas
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCF
 

Kürzlich hochgeladen

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Services for Science v2 (APAN26)

  • 1. Ian Foster Computation Institute Argonne National Lab & University of Chicago Services for Science
  • 2. 2 Thanks! DOE Office of Science NSF Office of Cyberinfrastructure National Institutes of Health Colleagues at Argonne, U.Chicago, USC/ISI, OSU, Manchester, and elsewhere
  • 5. 5 Scientific Communication, ~2000 Data ArchivesData Archives User Analysis toolsAnalysis tools Gateway Figure: S. G. Djorgovski Discovery toolsDiscovery tools Service-Oriented Science
  • 6. 6 Application Scenario Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis Microarray and protein databases at other institutions Different database systems, data representations, security Different program invocation, remote access, data transfer
  • 7. 7 caBIG: sharing of infrastructure, applications, and data. Data Integration! Services & Cancer Biology Globus
  • 8. 8 Service-Oriented Science People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
  • 9. 9 Creating Services People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
  • 10. 10 Anatomy of a Service op1 opN (meta)data Implementation(s) Clients Registry Management Clients … Service Service Attribute Authority Attribute Authority Persistence
  • 11. 11 Creating Services (~2005) “This full-day tutorial provides an introduction to programming Java services with the latest version of the Globus Toolkit version 4 (GT4). The tutorial teaches how to build a Java Service that makes use of GT4 mechanisms for state management, security, registry and related topics.”
  • 12. 12 Appln Service Create Index service Store Repository Service Advertize Discover Invoke; get results Introduce Container Transfer GAR Deploy Ohio State University and Argonne/U.Chicago Creating Services in 2008 Introduce and gRAVI Introduce Define service Create skeleton Discover types Add operations Configure security Grid Remote Application Virtualization Infrastructure Wrap executables Globus
  • 13. Demonstration: Creating Services Introduce + gRAVI Shannon Hastings Scott Oster David Ervin Stephen Langella Kyle Chard Ravi Madduri
  • 14. 14Center for Enabling Distributed Petascale Science Workflow Automation at DOE Facilities Automation Reproducibility Security Reusability Storage Metadata Analysis Visualization Advanced Photon Source
  • 15. 15 Discovering Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
  • 16. 16  The ultimate arbiter?  Types, ontologies  Can I use it?  Billions of services Discovering Services Assume success Syntax, semantics Permissions Reputation A B
  • 18. 18 Discovery (2): Standardized Vocabularies Core Services Grid Service Uses Terminology Described In Cancer Data Standards Repository Enterprise Vocabulary Services References Objects Defined in Service Metadata Publishes Subscribes to and Aggregates Queries Service Metadata Aggregated In Registers To Discovery Client API Index Service Globus
  • 19. 19
  • 20. 20 Discovery (3): Tagging & Social Networking GLOSS: Generalized Labels Over Scientific data Sources (Foster, Nestorov)
  • 21. 21 Discovery (3): Tagging & Social Networking David de Roure, Carole Goble, et al.
  • 22. 22 Composing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
  • 23. 23 Composing Services: E.g., BPEL Workflow System Data Service @ uchicago.edu Analytic service @ osu.edu Analytic service @ duke.edu <BPEL Workflow Doc> <Workflow Inputs> <Workflow Results> BPEL Engine link caBiG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al. link link link See also Kepler & Taverna Globus
  • 24. 24 Composing Services: Taverna caGrid Scavenger with semantic/metadata- based caGrid service query A sample caGrid workflow Globus
  • 26. Demonstration: Composing Services Taverna + GT4 Taverna team Wei Tan Ravi Madduri
  • 27. 27 Publishing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
  • 28. 28 Publishing Services Description  Syntax, semantics State  Availability, load, … Policies  Who, what, when, … Hosting  Location, scalability, …
  • 29. 29 Authorization: SAML & XACML VOMS Shibboleth LDAP PERMIS… GT4 Client GT4 Server PDP Attributes Authorization Decision PIP PIP PIP SAML XACML Globus
  • 30. 30 Hosting Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
  • 31. 31 The Two Dimensions of Service-Oriented Science Decompose across network Clients integrate dynamically Select & compose services Select “best of breed” providers Publish result as new services Decouple resource & service providers Function Resource Data Archives Analysis tools Discovery toolsUsers Fig: S. G. Djorgovski
  • 33. 33 Putting It Together for the Example Scenario Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis caGrid Service Interfaces caGrid Environ- ment Registered Object Definitions Advertise- ment Log on, Grid credentials Query and Analysis Workflow Discovery Microarray & protein databases at other institutions
  • 34. 34 Lessons Learned A convenient higher-level abstraction Suitable for a subset of scientific use cases Infrastructure need to be sustainable Integrates well with hospital/cancer center/experimental facility IT infrastructure Workflows are attractive to users Scalability and provenance are important No vendor lock-in (if you are careful) User experience remains ambiguous Early adopters are enthusiastic (50+ services)
  • 35. 35 Services for Science A new approach to communicating A (not-so new) approach to structuring systems They’re real Excellent infrastructure and tools (Globus, Introduce, gRAVI, Taverna, Swift, etc., etc.) Substantial numbers of services out there They’re challenging Sociology: incentives, rewards Infrastructure: hosting Provenance: justifying “results” Scaling: services, requests

Hinweis der Redaktion

  1. I would mention data integration here: integration of different data types: tissue, (pre)clinical, genomics, proteomics….
  2. Developer benefits: Developers spend time developing scientific applications instead of rewriting code to turn into services. Developer time spared through code reuse—legacy apps are wrappered into services instead of rewritten as services Learning gRavi easier than learning service development—especially marshaling of security resources! Traditional applications are not secure and can not provide an audit trail Scientific benefits: Scientists spend time developing workflows rather than moving data from application to application Scientists interact with GUI to focus on scientific process not how to run applications. Services provide easy access to all scientific tools in one integrated GUI. Access to grid resources promotes cross disciplinary collaboration as it becomes easier to share data and applications across institutional boundaries
  3. Use funnel animation?
  4. The GT4 plug-in support semantic based service query. Users can input multiple (up to three, in current scenario three is enough. We can add more in our program upon request) service query criteria and input the corresponding value. We can combine multiple criteria. The initial GUI only shows one query criteria, but more can be added by clicking “Add Service Query” button. For example, we can query the caGrid services whose “Research Center” name is “Ohio State University”, with Service Name “DICOMDataService”, and has operation “PullOp”.