Linking Scientific Instruments and Computation:
Patterns, Technologies, Experiences
Ian Foster
The University of Chicago
Argonne National Laboratory
foster@anl.gov
Crescat scientia; vita excolatur
https://arxiv.org/abs/2204.05128
https://arxiv.org/abs/2208.09513
A new generation of
scientific instruments
New sensors produce data at high
velocities and in large volumes
New methods and structures are
required to capture and process
data, and to feed back to sensors
Increasing need to harness HPC,
cloud, edge computers
An instrument becomes a set of
flows, overlaid on distributed
physical resources and software
Mark Boland, https://bit.ly/3cfSosk, 2017
Example: High-energy diffraction microscopy
Example: Ptychographic reconstruction
Example: Serial synchrotron crystallography
A modular, extensible approach to creating and running flows
Flows
Capture useful patterns as
sequences of actions.
Resource-independent
Action providers
Implement actions.
Resource-independent
Compute Action Provider: Run function at A.
Transfer Action Provider: Transfer from A to B.
Search Action Provider: Publish metadata.
…
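The layering above (a flow as a sequence of actions, each implemented by an action provider) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PROVIDERS`, `run_flow`, and the three provider functions are hypothetical, nothing here calls the Globus services, and the "transfer" and "compute" steps are simulated locally.

```python
from typing import Any, Callable, Dict, List

# Hypothetical names throughout; this is NOT the Globus Flows API.
Params = Dict[str, Any]
ActionProvider = Callable[[Params], Dict[str, Any]]


def compute_provider(params: Params) -> Dict[str, Any]:
    """Run a function 'at' a named endpoint (simulated by a local call)."""
    result = params["function"](*params.get("args", ()))
    return {"status": "SUCCEEDED", "endpoint": params["endpoint"], "result": result}


def transfer_provider(params: Params) -> Dict[str, Any]:
    """Record a transfer from source to destination (no data is moved)."""
    moved = (params["source"], params["destination"], params["path"])
    return {"status": "SUCCEEDED", "moved": moved}


def search_provider(params: Params) -> Dict[str, Any]:
    """Publish metadata by appending it to an in-memory index."""
    params["index"].append(params["metadata"])
    return {"status": "SUCCEEDED", "entries": len(params["index"])}


PROVIDERS: Dict[str, ActionProvider] = {
    "compute": compute_provider,
    "transfer": transfer_provider,
    "search": search_provider,
}


def run_flow(actions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Execute actions in order, dispatching each to its provider."""
    return [PROVIDERS[action["provider"]](action["params"]) for action in actions]


# A three-step flow: transfer from A to B, run a function at B, publish metadata.
index: List[Dict[str, Any]] = []
flow = [
    {"provider": "transfer",
     "params": {"source": "A", "destination": "B", "path": "scan_001.tiff"}},
    {"provider": "compute",
     "params": {"endpoint": "B", "function": len, "args": ("scan_001.tiff",)}},
    {"provider": "search",
     "params": {"index": index, "metadata": {"file": "scan_001.tiff"}}},
]
results = run_flow(flow)
```

The flow itself names only provider types and parameters, so adding a new kind of action in this sketch means registering one more entry in `PROVIDERS` rather than changing `run_flow`.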
Fabric
Implements auth, data, and
compute APIs for
manipulating resources
Authenticate user.
Delegate credentials.
Manage file transfers.
Run jobs on computers.
Access data catalog.
…
Builds on cloud-hosted Globus automation services
[Figure: Globus automation services (triggers, flows, timers, queues) connecting a microscope to an analysis computer; a flow run is a sequence of steps. Example event: type creation, match *tiff. Example actions: transfer from microscope to analysis computer; user selection with data <feature extraction> and options approve/reject.]
https://arxiv.org/abs/2204.05128
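The event-to-action pattern in the figure (a file-creation event matching `*tiff` leads to a transfer from the microscope to the analysis computer) might be sketched as follows. This is a local illustration only: the dictionary layout, `TRIGGER`, and `on_event` are hypothetical names, not the Globus trigger or queue format, and the pattern is matched with Python's standard `fnmatch` module.

```python
from fnmatch import fnmatch
from typing import Dict, List, Optional

# Hypothetical trigger rule mirroring the slide: "Type: creation, Match: *tiff".
TRIGGER = {"event_type": "creation", "match": "*tiff"}


def on_event(event: Dict[str, str], trigger: Dict[str, str] = TRIGGER) -> Optional[Dict[str, str]]:
    """Return a transfer action if the event satisfies the trigger, else None."""
    if event["type"] != trigger["event_type"]:
        return None
    if not fnmatch(event["path"], trigger["match"]):
        return None
    return {
        "type": "transfer",
        "from": "microscope",
        "to": "analysis computer",
        "path": event["path"],
    }


# Simulated event stream; only the first event should enqueue an action.
events = [
    {"type": "creation", "path": "run1/frame_0001.tiff"},
    {"type": "creation", "path": "run1/notes.txt"},
    {"type": "deletion", "path": "run1/frame_0000.tiff"},
]
queue: List[Dict[str, str]] = []
for event in events:
    action = on_event(event)
    if action is not None:
        queue.append(action)
```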
Capture flows in reusable forms, in various ways:
- YAML documents
- Python “Gladier” SDK
- Web authoring
Customize flow to application
Specialize flow to resources
Execute specialized flow
Check flow status
Examine flow actions
Identify failed actions
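The lifecycle above (capture, customize, specialize, execute, then inspect status and failed actions) can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: the template layout, the `{placeholder}` convention, and the names `fill`, `execute`, and `failed_actions` are hypothetical and do not reflect the YAML, Gladier, or web-authoring formats mentioned on the slide.

```python
import copy
from typing import Callable, Dict, List

# Captured flow: resource- and application-independent, with placeholders.
FLOW_TEMPLATE = [
    {"name": "transfer", "from": "{instrument}", "to": "{computer}", "pattern": "{pattern}"},
    {"name": "analyze", "at": "{computer}", "function": "{analysis}"},
    {"name": "publish", "index": "{index}"},
]


def fill(flow: List[Dict[str, str]], values: Dict[str, str]) -> List[Dict[str, str]]:
    """Substitute the placeholders named in `values`; leave all others untouched."""
    filled = copy.deepcopy(flow)
    for action in filled:
        for key, text in action.items():
            for name, value in values.items():
                text = text.replace("{" + name + "}", value)
            action[key] = text
    return filled


def execute(flow: List[Dict[str, str]],
            handlers: Dict[str, Callable[[Dict[str, str]], None]]) -> List[Dict[str, str]]:
    """Run actions in order and stop at the first failure, recording each status."""
    run = []
    for action in flow:
        try:
            handlers[action["name"]](action)
            run.append({"name": action["name"], "status": "SUCCEEDED"})
        except Exception as err:
            run.append({"name": action["name"], "status": "FAILED", "error": str(err)})
            break
    return run


def failed_actions(run: List[Dict[str, str]]) -> List[str]:
    """Names of actions whose recorded status is FAILED."""
    return [step["name"] for step in run if step["status"] == "FAILED"]


def analyze(action: Dict[str, str]) -> None:
    # Simulated failure so that the inspection step has something to report.
    raise RuntimeError("no nodes available at " + action["at"])


# Customize to the application, then specialize to resources.
customized = fill(FLOW_TEMPLATE, {"pattern": "*.tiff",
                                  "analysis": "feature_extraction",
                                  "index": "experiment-index"})
specialized = fill(customized, {"instrument": "microscope",
                                "computer": "analysis computer"})

handlers = {"transfer": lambda a: None, "analyze": analyze, "publish": lambda a: None}
run = execute(specialized, handlers)
```

Keeping customization and specialization as separate substitutions means the same customized flow could be re-specialized to a different computer without touching the application settings.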
[Figure: example flows for serial synchrotron crystallography, ptychography, and high-energy diffraction microscopy. Labeled steps include data collection & transfer (raw and position data), data reduction and structure refinement, ptychographic reconstruction, AI model training, AI model deployment, and catalog & publish (FAIR data). Other labels: detector, injector, x-ray, target, AI accelerators, Cerebras, HPC.]
Flows have been developed for light source
data analysis, biomedical and materials
science data ingest, on-demand simulation, …
Determining protein structures 10-100x faster
“These data services have taken the
time to solve a structure from
weeks to days and now to hours”
Darren Sherrell, SBC beamline
scientist, APS Sector 19
• Developed a new automation pipeline to
collect, analyze, and visualize data,
solve protein structures, and load results into a
searchable portal for real-time feedback
• Achieved a 10-100x speedup in time to
solution of protein structures at the APS beamline
• Leveraged unique DOE facilities at the Advanced
Photon Source (SBC Sector 19) and ALCF
(Theta/ThetaGPU, Petrel, and Data Portals)
Deposited first results in open repositories
Automation pipeline
(Chard, Vescovi, Foster, Blaiszik, Sherrell, Joachimiak, et al.)
[Figure: automation pipeline spanning APS, ALCF Theta, Petrel, and Data Portals]
Flow invocations 2020-21 for five APS experiments
Numbers vary due to facility and experimental schedules.
We collect detailed performance data on flows
https://arxiv.org/abs/2204.05128
Transfer, compute,
and cataloging
costs for median
flows
Round-trip latencies for various action providers
• The current architecture
has a minimum latency of ~1 s
because of cloud
interaction
• funcX latencies are higher
because of its polling strategy
• Both can be improved
as needed
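To see why polling adds latency, consider a simple model that is an assumption here and not something the slides specify: a client polls at a fixed interval, and a result is noticed only at the first poll at or after it completes. Under that model the added delay averages half the polling interval. The function names below are hypothetical.

```python
import math


def detection_delay(completion_time: float, poll_interval: float) -> float:
    """Delay between task completion and the first poll at or after it."""
    next_poll = math.ceil(completion_time / poll_interval) * poll_interval
    return next_poll - completion_time


def mean_delay(poll_interval: float, samples: int = 1000) -> float:
    """Average delay over completion times spread evenly across one interval."""
    times = [(i + 0.5) * poll_interval / samples for i in range(samples)]
    return sum(detection_delay(t, poll_interval) for t in times) / samples


# With a 2-second polling interval, the average added delay is about 1 second.
average = mean_delay(2.0)
```

In this model, halving the polling interval halves the average added delay, at the cost of twice as many requests to the service.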
We build on a universal auth, compute, & data fabric
Globus Auth: authentication and delegation mechanisms to control what happens where
funcX: run functions anywhere funcX is deployed
Globus Connect: access data anywhere Globus Connect is deployed
* See also: Integrated Research Infrastructure, computing continuum, grid
As of 4/2022
Globus hybrid “SaaS” model: Data fabric
Globus hybrid “SaaS” model: Compute fabric
Customer-owned and administered computer with funcX agent running on it
funcX service orchestrates function execution via communication with funcX agent
[Figure: deployment at Argonne. Argonne Leadership Computing Facility: Polaris, Theta, Eagle store. Laboratory Computing Research Center: Bebop Cluster. Advanced Photon Source: APS Computing, Orthros Cluster, APS DM system. Also shown: portal servers. Key: funcX agent; Globus Connect agent.]
[Figure: user-defined flows, APIs, Globus Automation Services, and Globus-accessible storage and computing (10,000s of systems)]
Building computationally-enhanced instruments:
There is much more to be done!
• We have worked so far with light sources and data ingest
pipelines
• We are pleased with adaptability and reliability
• Work required in capability (e.g., iteration) and performance
• Others are applying tools to microscopes and other
instruments
• New action providers are needed for instrument control
• We are eager to find partners who want to work with us on
developing and/or applying these methods and tools!
Thanks to talented colleagues!
Linking Scientific Instruments & HPC: Patterns, Technologies, Experiences
https://arxiv.org/abs/2204.05128
Raf Vescovi, Ryan Chard, Nick Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Mike Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard
Globus Automation Services: Research process automation across the space-time continuum
https://arxiv.org/abs/2208.09513
Rachana Ananthakrishnan, Josh Bryan, Kyle Chard, Ryan Chard, Kurt McKee, Jim Pruyne, Brigitte Raumann
And the rest of the ALCF, APS, & Globus teams
And sponsors
Recap: Enabling
new instruments
Reusable flows
composed from an
extensible set of
actions
Built on global
auth, compute, data
fabric
Join us in applying
these methods!
https://arxiv.org/abs/2204.05128
https://arxiv.org/abs/2208.09513
https://www.globus.org/platform/services/flows

Linking Scientific Instruments and Computation

  • 1. Linking Scientific Instruments and Computation: Patterns, Technologies, Experiences Ian Foster The University of Chicago Argonne National Laboratory foster@anl.gov Crescat scientia; vita excolatur https://arxiv.org/abs/2204.05128 https://arxiv.org/abs/2208.09513
  • 2. A new generation of scientific instruments. New sensors produce data at high velocities and in large volumes. New methods and structures are required to capture and process data, and to feed back to sensors. Increasing need to harness HPC, cloud, edge computers. → An instrument becomes a set of flows, overlaid on distributed physical resources and software. Mark Boland, https://bit.ly/3cfSosk, 2017
  • 5. Example: Serial synchrotron crystallography
  • 6. A modular, extensible approach to creating and running flows Flows Capture useful patterns as sequences of actions. Resource-independent
  • 7. A modular, extensible approach to creating and running flows Flows Capture useful patterns as sequences of actions. Resource-independent Action providers Implement actions. Resource-independent Compute Action Provider: Run function at A. Transfer Action Provider: Transfer from A to B. Search Action Provider: Publish metadata. …
  • 8. A modular, extensible approach to creating and running flows Flows Capture useful patterns as sequences of actions. Resource-independent Action providers Implement actions. Resource-independent Fabric Implements auth, data, and compute APIs for manipulating resources Authenticate user. Delegate credentials. Manage file transfers. Run jobs on computers. Access data catalog. … Compute Action Provider: Run function at A. Transfer Action Provider: Transfer from A to B. Search Action Provider: Publish metadata. …
  • 9. Builds on cloud-hosted Globus automation services: triggers, flows, timers, queues. [Diagram: microscope, analysis computer, queue, and a multi-step flow run. Event (Type: creation; Match: *tiff). Action (Type: transfer; From: microscope; To: analysis computer). Action (Type: user selection; data: <feature extraction>; Options: approve/reject).] https://arxiv.org/abs/2204.05128
  • 10. Capture flows in reusable forms In various ways: - YAML documents - Python “Gladier” SDK - Web authoring
  • 16. Flows have been developed for light source data analysis, biomedical and materials science data ingest, on-demand simulation, … [Diagram of three example flows: serial synchrotron crystallography, ptychography, high energy diffraction microscopy. Labels: AI model training; AI model deployment; data collection & transfer (raw, position); data reduction, refine structures; ptychographic reconstruction; catalog & publish; FAIR data; detector; injector; x-ray; target; Cerebras; AI accelerators, HPC.]
  • 17. Determining protein structures 10-100x faster. “These data services have taken the time to solve a structure from weeks to days and now to hours” (Darren Sherrell, SBC beamline scientist, APS Sector 19). • Developed new automation pipeline to collect data, analyze and visualize the data, solve protein structure, and load results into a searchable portal for real-time feedback • Achieved a 10-100x speedup in time to solution of protein structures at APS beamline • Leveraged unique DOE facilities at Advanced Photon Source (SBC Sector 19) and ALCF (Theta/ThetaGPU, Petrel, and Data Portals) • Deposited first results in open repositories. Automation pipeline: Chard, Vescovi, Foster, Blaiszik, Sherrell, Joachimiak, et al. [Diagram labels: APS, ALCF Theta, ALCF Petrel, Data Portals]
  • 18. Flow invocations 2020-21 for five APS experiments Numbers vary due to facility and experimental schedules.
  • 19. We collect detailed performance data on flows https://arxiv.org/abs/2204.05128 Transfer, compute, and cataloging costs for median flows
  • 20. Round-trip latencies for various action providers • Current architecture has ~1 sec minimum latency due to cloud interaction • funcX latencies higher due to polling strategy • Both can be improved as needed
  • 21. We build on a universal auth, compute, & data fabric*. Globus Auth: authentication and delegation mechanisms to control what happens where. funcX: run functions anywhere funcX is deployed. Globus Connect: access data anywhere Globus Connect is deployed. * See also: Integrated Research Infrastructure, computing continuum, grid
  • 23. Globus hybrid “SaaS” model: Data fabric
  • 24. Globus hybrid “SaaS” model: Compute fabric. Customer-owned and administered computer with funcX agent running on it. funcX service orchestrates function execution via communication with funcX agent.
  • 25. [Diagram] User-defined flows; Globus Automation Services; APIs; Globus-accessible storage and computing (10,000s of systems). Facilities shown: Argonne Leadership Computing Facility, Laboratory Computing Resource Center, Advanced Photon Source, APS Computing. Systems shown: Polaris, Theta, Eagle store, Bebop Cluster, Orthros Cluster, APS DM system, portal servers. Key: funcX agent, Globus Connect agent.
  • 26. Building computationally-enhanced instruments: There is much more to be done! • We have worked so far with light sources and data ingest pipelines • We are pleased with adaptability and reliability • Work required in capability (e.g., iteration) and performance • Others are applying tools to microscopes and other instruments • New action providers are needed for instrument control • We are eager to find partners who want to work with us on developing and/or applying these methods and tools!
  • 27. Thanks to talented colleagues! Linking Scientific Instruments & HPC: Patterns, Technologies, Experiences. Globus Automation Services: Research process automation across the space-time continuum. Rachana Ananthakrishnan, Josh Bryan, Kyle Chard, Ryan Chard, Kurt McKee, Jim Pruyne, Brigitte Raumann. https://arxiv.org/abs/2204.05128 https://arxiv.org/abs/2208.09513 Raf Vescovi, Ryan Chard, Nick Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Mike Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard. And sponsors, and the rest of the ALCF, APS, & Globus teams.
  • 28. Recap: Enabling new instruments Reusable flows composed from an extensible set of actions Built on global auth, compute, data fabric Join us in applying these methods! https://arxiv.org/abs/2204.05128 https://arxiv.org/abs/2208.09513 https://www.globus.org/platform/services/flows
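
The trigger-and-flow pattern described on slides 6 to 9 can be sketched in a few lines of plain Python. This is an illustrative toy, not the Globus Flows, Gladier, or funcX API: every name below (`Flow`, `Trigger`, `transfer`, `compute`, `publish`, the `analysis-computer` label) is hypothetical, and the "action providers" are stand-in functions that only update a dictionary. It shows a file-creation event matching `*tiff` starting a resource-independent sequence of actions.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Callable, Optional

# Hypothetical stand-ins for action providers; a real deployment would
# call remote services instead of editing a dictionary.
Action = Callable[[dict], dict]


def transfer(state: dict) -> dict:
    # Pretend to move the file from the instrument to the destination.
    return {**state, "location": state["destination"]}


def compute(state: dict) -> dict:
    # Pretend to run a feature-extraction function on the transferred file.
    return {**state, "features": f"features({state['path']})"}


def publish(state: dict) -> dict:
    # Pretend to publish metadata to a search catalog.
    return {**state, "published": True}


@dataclass
class Flow:
    """A flow is an ordered sequence of actions applied to shared state."""
    steps: list[Action]

    def run(self, state: dict) -> dict:
        for step in self.steps:
            state = step(state)
        return state


@dataclass
class Trigger:
    """Starts a flow when an event of the given type matches the pattern."""
    event_type: str
    pattern: str
    flow: Flow

    def handle(self, event: dict) -> Optional[dict]:
        if event["type"] == self.event_type and fnmatch(event["path"], self.pattern):
            initial = {"path": event["path"], "destination": "analysis-computer"}
            return self.flow.run(initial)
        return None  # Event ignored: wrong type or non-matching file name.


flow = Flow([transfer, compute, publish])
trigger = Trigger(event_type="creation", pattern="*tiff", flow=flow)

result = trigger.handle({"type": "creation", "path": "scan_0001.tiff"})
skipped = trigger.handle({"type": "creation", "path": "notes.txt"})
```

Because the flow only names actions and passes state between them, the same sequence could in principle be pointed at different instruments or computers by changing the initial state rather than the flow itself, which is the sense of "resource-independent" used on the slides.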

Editor's notes

  1. Probe. Instrument. Meter.
  2. Metacomputing revisited: 10^10 x faster, 10^5 x more tasks, 10^6 x more data. Link HPC, AI, instruments. c still 3 x 10^8 m/s.
  3. Metacomputing revisited: 10^10 x faster, 10^5 x more tasks, 10^6 x more data. Link HPC, AI, instruments. c still 3 x 10^8 m/s.
  4. Metacomputing revisited: 10^10 x faster, 10^5 x more tasks, 10^6 x more data. Link HPC, AI, instruments. c still 3 x 10^8 m/s.
  5. Need to mention other Braid people: Eliu Huerta, Bogdan Nicolae, Justin Wozniak. Mention Eliu's work?