SlideShare ist ein Scribd-Unternehmen logo
1 von 30
Downloaden Sie, um offline zu lesen
Janus
Provenance


         Workflows for Information Integration
                         in the Life Sciences

        Part 2: Workflows in context

                     Paolo Missier
            Information Management Group
School of Computer Science, University of Manchester, UK



                Search Computing workshop
                 Como, Italy, May 28th, 2010
Purpose and outline
• In the first part we have seen a case study in system biology
  – showcasing scientific workflows at work


• Two key questions:
  1.Workflows live within an eco-system of models, tools and technologies.
    Can any of these benefit / complement the SeCo paradigm?
  2.What is the relationship between a dataflow model (Taverna) and a SeCo
    query plan?


• To inform the discussion, we focus of some elements of a
  workflow’s lifecycle
  – importing services
  – benefits of domain-specific service collections
  – collecting and querying provenance traces




                                                                             2
An eco-system for workflows
Process-centric science lifecycle
An eco-system for workflows
Process-centric science lifecycle




Service discovery
   and import
An eco-system for workflows
Process-centric science lifecycle




Service discovery
   and import       Data
                                   Metadata        Methods
                    - inputs
                                   - provenance    - the workflow
                    - parameters
                                   - annotations
                    - results
An eco-system for workflows
Process-centric science lifecycle




Service discovery
   and import       Data
                                   Metadata        Methods
                    - inputs
                                   - provenance    - the workflow
                    - parameters
                                   - annotations
                    - results
WSDL to Taverna processors
                                    P11   P12 ...

               +     +    - ...           P1
          Op1(p11, p12, p13, ...)   O1         ...
WSDL      ...
service
               +    -    +...
spec                                Pk1 Pk3 ...
          Opk(pk1, pk2, pk3, ...)
                                          Pk
                                    O2         ...




                                                     4
WSDL to Taverna processors
                                    P11   P12 ...

               +     +    - ...           P1
          Op1(p11, p12, p13, ...)   O1         ...
WSDL      ...
service
               +    -    +...
spec                                Pk1 Pk3 ...
          Opk(pk1, pk2, pk3, ...)
                                          Pk
                                    O2         ...




                                                     4
WSDL to Taverna processors
                                    P11   P12 ...

               +     +    - ...           P1
          Op1(p11, p12, p13, ...)   O1         ...
WSDL      ...
service
               +    -    +...
spec                                Pk1 Pk3 ...
          Opk(pk1, pk2, pk3, ...)
                                          Pk
                                    O2         ...




                                                     4
WSDL to Taverna processors
                                    P11   P12 ...

               +     +    - ...           P1
          Op1(p11, p12, p13, ...)   O1         ...
WSDL      ...
service
               +    -    +...
spec                                Pk1 Pk3 ...
          Opk(pk1, pk2, pk3, ...)
                                          Pk
                                    O2         ...




                                                     4
Service composition requires adapters
Example: SBML model optimisation workflow (see part I) -- designed by Peter Li
http://www.myexperiment.org/workflows/1201
Service composition requires adapters
Example: SBML model optimisation workflow (see part I) -- designed by Peter Li
http://www.myexperiment.org/workflows/1201
Service composition requires adapters
Example: SBML model optimisation workflow (see part I) -- designed by Peter Li
http://www.myexperiment.org/workflows/1201

                                                                    String[] lines = inStr.split("n");
                                                                    StringBuffer sb = new StringBuffer();
                                                                    for(i = 1; i < lines.length -1; i++)
                                                                    {
                                                                      String str = lines[i];
                                                                      str = str.replaceAll("<result>", "");
                                                                      str = str.replaceAll("</result>", "");
                                                                      sb.append(str.trim() + "n");
                                                                    }

                                                                    String outStr = sb.toString();


                                       Url -> content (built-in shell script)


                                            import java.util.regex.Pattern;
                                            import java.util.regex.Matcher;

                                            sb = new StringBuffer();
                                            p = "CHEBI:[0-9]+";

                                            Pattern pattern = Pattern.compile(p);
                                            Matcher matcher = pattern.matcher(sbrml);
                                            while (matcher.find())
                                            {
                                              sb.append("urn:miriam:obo.chebi:" + matcher.group() + ",");
                                            }
                                            String out = sb.toString();
                                            //Clean up
                                            if(out.endsWith(","))
                                              out = out.substring(0, out.length()-1);

                                            chebiIds = out.split(",");
Services in the wild and “in the pen”
Example 1:
BioMoby / Taverna plugin
– provides a platform to
   • exchange common data
     representation formats
   • provide methods for service
     discovery
– offers more than 800 data retrieval
  and analysis services




                                                              6
Services in the wild and “in the pen”
Example 1:                               Example 2:
BioMoby / Taverna plugin                 caGrid
– provides a platform to                  – part of caBIG (cancer Biomedical
   • exchange common data                   Informatics Grid)
     representation formats               – a US project to carry out
   • provide methods for service            eScience and bioinformatics in
     discovery                              cancer research
– offers more than 800 data retrieval
  and analysis services




                                                                     6
Targeting specific service collections
     Well-behaved service collections exist in specific domains
• Cost: may require dedicated access plugins
• Benefits:
  – well-curated ➔ easier to discover
  – designed to work together ➔ easier to compose
Targeting specific service collections
     Well-behaved service collections exist in specific domains
• Cost: may require dedicated access plugins
• Benefits:
  – well-curated ➔ easier to discover
  – designed to work together ➔ easier to compose

   ChemTaverna: a blend of generic + chemistry-specific components
    required components reflect best practices in the specific domain
Targeting specific service collections
     Well-behaved service collections exist in specific domains
• Cost: may require dedicated access plugins
• Benefits:
  – well-curated ➔ easier to discover
  – designed to work together ➔ easier to compose

   ChemTaverna: a blend of generic + chemistry-specific components
    required components reflect best practices in the specific domain

      • Data I/O:
        ‣e.g. loading input from / writing results to spreadsheets
      • Data visualisation library
      • Data manipulation:
        ‣removal, filtering, splitting, merging, transposition
      • Analysis and format transformation
        ‣(PubChem, KEGG, ..., R scripts)
      • Service composition and seamless data flow:
        ‣dedicated library of reusable adapters (as opposed to ad hoc)
Complementary models: provenance




Service discovery
   and import       Data
                                   Metadata        Methods
                    - inputs
                                   - provenance    - the workflow
                    - parameters
                                   - annotations
                    - results
Complementary models: provenance




Service discovery
   and import       Data
                                   Metadata        Methods
                    - inputs
                                   - provenance    - the workflow
                    - parameters
                                   - annotations
                    - results
Taverna workflow provenance
A detailed trace of workflow execution
- data dependencies
- order of processor execution
- processors’ inputs/outputs
- union of these forms a DAG


Model realisation:
- relational (native)
- RDF (based on a provenance ontology)
- Open Provenance Model (XML or RDF)




                                 9
Taverna workflow provenance
A detailed trace of workflow execution
- data dependencies
- order of processor execution
- processors’ inputs/outputs
- union of these forms a DAG


Model realisation:
- relational (native)
- RDF (based on a provenance ontology)
- Open Provenance Model (XML or RDF)




                                 9
Taverna workflow provenance
                                  A detailed trace of workflow execution
                                  - data dependencies
                                  - order of processor execution

        lister
                                  - processors’ inputs/outputs
                                  - union of these forms a DAG
                  get pathways
                   by genes1

                                  Model realisation:
                 merge pathways
                                  - relational (native)
     gene_id                      - RDF (based on a provenance ontology)
    concat gene pathway ids
                                  - Open Provenance Model (XML or RDF)
        output




pathway_genes
                                                                   9
Potential benefits of provenance metadata
• To establish quality, relevance, trust
• To track information attribution through complex transformations
• To describe one’s experiment to others, for understanding / reuse
• To provide evidence in support of scientific claims
• To enable post hoc process analysis for improvement, re-design

  More specifically:
• Causal relations:
   - which input values contributed to computing an output value?
   - which process(es) caused data to be incorrect?
   - which data caused a process to fail?
• Process and data analytics:
   – analyze variations in output vs an input parameter sweep
   – how often has my favourite service been executed? on what inputs?
   – who produced this data?

 See use cases and requirements from the W3C Incubator on Provenance
 http://www.w3.org/2005/Incubator/prov/wiki                            10
Querying Taverna provenance
                      provenance access                  Provenance
                          API (Java)                       capture
<select>
 <workflow name="lymphoma"> Query
   <processor name="fetch">
    <outputPort name="value"
                index="[1,2]" />
   </processor>
   ...
</select>
<focus>                                                 Provenance
  <workflow name="exprAnalysis">           lineage          DB
    <processor name="p1" />
  </workflow>                                query
   ...
</focus>
                                          processor
           Native query answer               [1]
              (Java objects)

            RDF query answer
     [2]     (Janus ontology)

                                                         Tupelo
                      OPM graph
                      (RDF/XML)                         OPM RDF
                                          OPM graph      graph
     [3]
                                           manager      manager
                                                             11
Taverna workflows and SeCo queries
   Is there a possible relationship between a dataflow model
   (Taverna) and a SeCo query plan?


• Could any of the bioinformaitcs / sysmo workflows
  presented earlier have been expressed as SeCo
  queries?
• More generally:
  – under what assumptions can a hand-coded workflow be
    replaced by a query plan?

• In other words:
  – is the SeCo query model interesting from the point of view
    of system biology(or other bioinf domain) research?
  – would it let scientists express their service compositions at
    a higher level?
                                                                    12
Taverna computational model (very briefly)
                                 [ [ mmu:26416, ... ],
                                   [ mmu:328788, ... ] ]


                                             • Collection processing
                                             • Simple type system
                                               • no record / tuple structure
                                             • data driven computation
                                               • with optional processor synchronisation
                                             • parallel processor activation
                                               • greedy (no scheduler)



                                            [ path:mmu04010 MAPK signaling,
                                              path:mmu04370 VEGF signaling ]

[ [ path:mmu04210 Apoptosis, path:mmu04010 MAPK signaling, ...],
  [ path:mmu04010 MAPK signaling , path:mmu04620 Toll-like receptor, ...] ]

                                                               EDBT, Lausanne, March 2010
Parallelism in Taverna

                 • intra-processor: implicit iteration over collections
                 • inter-processor: pipelining

[ a, b, c,...]

[ (echo_1 a),       (echo_1 b),     (echo_1 c)]


(echo_2 (echo_1 a))
                  (echo_2 (echo_1 b))
                           (echo_2 (echo_1 c))




                 But:
                 no “chunking”
                 no repeated stateful calls to processors
                 no ranking
                                                               14
Summary, challenges, and opportunities
• Service composition requires adapters
• Target well-behaved collections of services
   – potentially low-hanging fruits
• The Taverna workflow enactor comes with a provenance capture
  and query component

    • Two key questions:
       1.Workflows live within eco-system of models, tools and technologies.
         Can any of these benefit / complement the SeCo paradigm?
       2.what is the relationship between a dataflow model (Taverna) and a
         SeCo query plan?


• how does the SeCo query model deal with data integration?
  • i.e., the adapters that account for the reality of data heterogeneity (in
     format and content)
• Is there a benefit in enhancing Taverna to support SeCo query plans?
• can the SeCo model take advantage of existing provenance models?

                                                                               15
(Shameless self) references
(1) P. Missier, N. Paton, and K. Belhajjame, "Fine-grained and efficient lineage
   querying of collection-based workflow provenance," Procs. EDBT, Lausanne,
   Switzerland: 2010.
(2) P. Missier, S.S. Sahoo, J. Zhao, A. Sheth, and C. Goble, "Janus: from Workflows
   to Semantic Provenance and Linked Open Data," Procs. IPAW 2010, Troy, NY:
   2010.
(3) P. Missier and C. Goble, “Workflows to Open Provenance Graphs, round-trip”,
   Future Generation Computing Systems Journal, Special issue on the Open
   Provenance Model, submitted.
(4) P. Missier, S. Soiland-Reyes, S. Owen, W. Tan, A. Nenadic, I. Dunlop, A. Williams,
   T. Oinn, and C. Goble, "Taverna, reloaded," Procs. SSDBM 2010, M. Gertz, T. Hey,
   and B. Ludaescher, Heidelberg, Germany: 2010.




                                                                                         16

Weitere ähnliche Inhalte

Was ist angesagt?

Swift 3 Programming for iOS : Protocol
Swift 3 Programming for iOS : ProtocolSwift 3 Programming for iOS : Protocol
Swift 3 Programming for iOS : ProtocolKwang Woo NAM
 
The Ring programming language version 1.5.3 book - Part 87 of 184
The Ring programming language version 1.5.3 book - Part 87 of 184The Ring programming language version 1.5.3 book - Part 87 of 184
The Ring programming language version 1.5.3 book - Part 87 of 184Mahmoud Samir Fayed
 
The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)jeffz
 
Aspect Mining for Large Systems
Aspect Mining for Large SystemsAspect Mining for Large Systems
Aspect Mining for Large SystemsThomas Zimmermann
 
Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...
Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...
Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...Uehara Junji
 
A Playful Introduction to Rx
A Playful Introduction to RxA Playful Introduction to Rx
A Playful Introduction to RxAndrey Cheptsov
 
Apache PIG - User Defined Functions
Apache PIG - User Defined FunctionsApache PIG - User Defined Functions
Apache PIG - User Defined FunctionsChristoph Bauer
 
Gevent what's the point
Gevent what's the pointGevent what's the point
Gevent what's the pointseanmcq
 
Swift for TensorFlow - CoreML Personalization
Swift for TensorFlow - CoreML PersonalizationSwift for TensorFlow - CoreML Personalization
Swift for TensorFlow - CoreML PersonalizationJacopo Mangiavacchi
 
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coinsoft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coinsoft-shake.ch
 
The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84Mahmoud Samir Fayed
 
Python & Stuff
Python & StuffPython & Stuff
Python & StuffJacob Perkins
 
Breaking the wall
Breaking the wallBreaking the wall
Breaking the wallTaras Kalapun
 
TWaver Web Performance Report
TWaver Web Performance ReportTWaver Web Performance Report
TWaver Web Performance Report253725291
 
04 - Qt Data
04 - Qt Data04 - Qt Data
04 - Qt DataAndreas Jakl
 
SDC - EinfĂźhrung in Scala
SDC - EinfĂźhrung in ScalaSDC - EinfĂźhrung in Scala
SDC - EinfĂźhrung in ScalaChristian Baranowski
 
Java 7 LavaJUG
Java 7 LavaJUGJava 7 LavaJUG
Java 7 LavaJUGjulien.ponge
 
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...julien.ponge
 
Use of Apache Commons and Utilities
Use of Apache Commons and UtilitiesUse of Apache Commons and Utilities
Use of Apache Commons and UtilitiesPramod Kumar
 
Groovy 1.8の新機能について
Groovy 1.8の新機能についてGroovy 1.8の新機能について
Groovy 1.8の新機能についてUehara Junji
 

Was ist angesagt? (20)

Swift 3 Programming for iOS : Protocol
Swift 3 Programming for iOS : ProtocolSwift 3 Programming for iOS : Protocol
Swift 3 Programming for iOS : Protocol
 
The Ring programming language version 1.5.3 book - Part 87 of 184
The Ring programming language version 1.5.3 book - Part 87 of 184The Ring programming language version 1.5.3 book - Part 87 of 184
The Ring programming language version 1.5.3 book - Part 87 of 184
 
The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)The Evolution of Async-Programming (SD 2.0, JavaScript)
The Evolution of Async-Programming (SD 2.0, JavaScript)
 
Aspect Mining for Large Systems
Aspect Mining for Large SystemsAspect Mining for Large Systems
Aspect Mining for Large Systems
 
Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...
Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...
Let's go Developer 2011 sendai Let's go Java Developer (Programming Language ...
 
A Playful Introduction to Rx
A Playful Introduction to RxA Playful Introduction to Rx
A Playful Introduction to Rx
 
Apache PIG - User Defined Functions
Apache PIG - User Defined FunctionsApache PIG - User Defined Functions
Apache PIG - User Defined Functions
 
Gevent what's the point
Gevent what's the pointGevent what's the point
Gevent what's the point
 
Swift for TensorFlow - CoreML Personalization
Swift for TensorFlow - CoreML PersonalizationSwift for TensorFlow - CoreML Personalization
Swift for TensorFlow - CoreML Personalization
 
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coinsoft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
 
The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84
 
Python & Stuff
Python & StuffPython & Stuff
Python & Stuff
 
Breaking the wall
Breaking the wallBreaking the wall
Breaking the wall
 
TWaver Web Performance Report
TWaver Web Performance ReportTWaver Web Performance Report
TWaver Web Performance Report
 
04 - Qt Data
04 - Qt Data04 - Qt Data
04 - Qt Data
 
SDC - EinfĂźhrung in Scala
SDC - EinfĂźhrung in ScalaSDC - EinfĂźhrung in Scala
SDC - EinfĂźhrung in Scala
 
Java 7 LavaJUG
Java 7 LavaJUGJava 7 LavaJUG
Java 7 LavaJUG
 
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
 
Use of Apache Commons and Utilities
Use of Apache Commons and UtilitiesUse of Apache Commons and Utilities
Use of Apache Commons and Utilities
 
Groovy 1.8の新機能について
Groovy 1.8の新機能についてGroovy 1.8の新機能について
Groovy 1.8の新機能について
 

Andere mochten auch

bonny resume
bonny resumebonny resume
bonny resumeBonny Taylor
 
презентация Open document
презентация Open documentпрезентация Open document
презентация Open documentnanatatianuch
 
Введение в макроэкономику
Введение в макроэкономикуВведение в макроэкономику
Введение в макроэкономикуAlina Khazieva
 
ZAPATOS INDUSTRIALES
ZAPATOS INDUSTRIALESZAPATOS INDUSTRIALES
ZAPATOS INDUSTRIALESJose Chumpitaz
 
Xamarin Evolve 2016: Mobile search - making your mobile apps stand out
Xamarin Evolve 2016: Mobile search - making your mobile apps stand outXamarin Evolve 2016: Mobile search - making your mobile apps stand out
Xamarin Evolve 2016: Mobile search - making your mobile apps stand outJames Montemagno
 
건강보험 하나로 심층_학습(건강보험이해)
건강보험 하나로 심층_학습(건강보험이해)건강보험 하나로 심층_학습(건강보험이해)
건강보험 하나로 심층_학습(건강보험이해)New Progressive Party of Korea
 
Market orientation and organizational culture's impact on sme
Market orientation and organizational culture's impact on smeMarket orientation and organizational culture's impact on sme
Market orientation and organizational culture's impact on smeAlexander Decker
 
CA Varun Sethi - IFRS trainings - IFRIC 12 - Accounting for service concessi...
CA Varun Sethi - IFRS trainings -  IFRIC 12 - Accounting for service concessi...CA Varun Sethi - IFRS trainings -  IFRIC 12 - Accounting for service concessi...
CA Varun Sethi - IFRS trainings - IFRIC 12 - Accounting for service concessi...Varun Sethi
 
Traveling: The Places I've Been
Traveling: The Places I've BeenTraveling: The Places I've Been
Traveling: The Places I've BeenMikeLikesIE
 

Andere mochten auch (13)

Ban Pet Sales
Ban Pet SalesBan Pet Sales
Ban Pet Sales
 
bonny resume
bonny resumebonny resume
bonny resume
 
презентация Open document
презентация Open documentпрезентация Open document
презентация Open document
 
Введение в макроэкономику
Введение в макроэкономикуВведение в макроэкономику
Введение в макроэкономику
 
ZAPATOS INDUSTRIALES
ZAPATOS INDUSTRIALESZAPATOS INDUSTRIALES
ZAPATOS INDUSTRIALES
 
Xamarin Evolve 2016: Mobile search - making your mobile apps stand out
Xamarin Evolve 2016: Mobile search - making your mobile apps stand outXamarin Evolve 2016: Mobile search - making your mobile apps stand out
Xamarin Evolve 2016: Mobile search - making your mobile apps stand out
 
CreaciĂłn proyecto etwinning
CreaciĂłn proyecto etwinningCreaciĂłn proyecto etwinning
CreaciĂłn proyecto etwinning
 
Aan de rijn
Aan de rijnAan de rijn
Aan de rijn
 
Levels of editing
Levels of editingLevels of editing
Levels of editing
 
건강보험 하나로 심층_학습(건강보험이해)
건강보험 하나로 심층_학습(건강보험이해)건강보험 하나로 심층_학습(건강보험이해)
건강보험 하나로 심층_학습(건강보험이해)
 
Market orientation and organizational culture's impact on sme
Market orientation and organizational culture's impact on smeMarket orientation and organizational culture's impact on sme
Market orientation and organizational culture's impact on sme
 
CA Varun Sethi - IFRS trainings - IFRIC 12 - Accounting for service concessi...
CA Varun Sethi - IFRS trainings -  IFRIC 12 - Accounting for service concessi...CA Varun Sethi - IFRS trainings -  IFRIC 12 - Accounting for service concessi...
CA Varun Sethi - IFRS trainings - IFRIC 12 - Accounting for service concessi...
 
Traveling: The Places I've Been
Traveling: The Places I've BeenTraveling: The Places I've Been
Traveling: The Places I've Been
 

Ähnlich wie Invited talk: Second Search Computing workshop

Profiling and optimization
Profiling and optimizationProfiling and optimization
Profiling and optimizationg3_nittala
 
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...Takahiro Katagiri
 
Python for Scientists
Python for ScientistsPython for Scientists
Python for ScientistsAndreas Dewes
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Romain Francois
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Romain Francois
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course PROIDEA
 
CS4200 2019 | Lecture 5 | Transformation by Term Rewriting
CS4200 2019 | Lecture 5 | Transformation by Term RewritingCS4200 2019 | Lecture 5 | Transformation by Term Rewriting
CS4200 2019 | Lecture 5 | Transformation by Term RewritingEelco Visser
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data ManagementAlbert Bifet
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyTravis Oliphant
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014PyData
 
Video Transcoding at Scale for ABC iview (NDC Sydney)
Video Transcoding at Scale for ABC iview (NDC Sydney)Video Transcoding at Scale for ABC iview (NDC Sydney)
Video Transcoding at Scale for ABC iview (NDC Sydney)Daphne Chong
 
SPU Optimizations - Part 2
SPU Optimizations - Part 2SPU Optimizations - Part 2
SPU Optimizations - Part 2Naughty Dog
 
Online test program generator for RISC-V processors
Online test program generator for RISC-V processorsOnline test program generator for RISC-V processors
Online test program generator for RISC-V processorsRISC-V International
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistryguest5929fa7
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistrybaoilleach
 
Python Performance 101
Python Performance 101Python Performance 101
Python Performance 101Ankur Gupta
 
The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196Mahmoud Samir Fayed
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...Data Con LA
 
From NumPy to PyTorch
From NumPy to PyTorchFrom NumPy to PyTorch
From NumPy to PyTorchMike Ruberry
 

Ähnlich wie Invited talk: Second Search Computing workshop (20)

Profiling and optimization
Profiling and optimizationProfiling and optimization
Profiling and optimization
 
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
Towards Auto-tuning Facilities into Supercomputers in Operation - The FIBER a...
 
Python for Scientists
Python for ScientistsPython for Scientists
Python for Scientists
 
Swift for tensorflow
Swift for tensorflowSwift for tensorflow
Swift for tensorflow
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++
 
Rcpp: Seemless R and C++
Rcpp: Seemless R and C++Rcpp: Seemless R and C++
Rcpp: Seemless R and C++
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
 
CS4200 2019 | Lecture 5 | Transformation by Term Rewriting
CS4200 2019 | Lecture 5 | Transformation by Term RewritingCS4200 2019 | Lecture 5 | Transformation by Term Rewriting
CS4200 2019 | Lecture 5 | Transformation by Term Rewriting
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
 
Video Transcoding at Scale for ABC iview (NDC Sydney)
Video Transcoding at Scale for ABC iview (NDC Sydney)Video Transcoding at Scale for ABC iview (NDC Sydney)
Video Transcoding at Scale for ABC iview (NDC Sydney)
 
SPU Optimizations - Part 2
SPU Optimizations - Part 2SPU Optimizations - Part 2
SPU Optimizations - Part 2
 
Online test program generator for RISC-V processors
Online test program generator for RISC-V processorsOnline test program generator for RISC-V processors
Online test program generator for RISC-V processors
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistry
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistry
 
Python Performance 101
Python Performance 101Python Performance 101
Python Performance 101
 
The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
 
From NumPy to PyTorch
From NumPy to PyTorchFrom NumPy to PyTorch
From NumPy to PyTorch
 

Mehr von Paolo Missier

Towards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsTowards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsPaolo Missier
 
Interpretable and robust hospital readmission predictions from Electronic Hea...
Interpretable and robust hospital readmission predictions from Electronic Hea...Interpretable and robust hospital readmission predictions from Electronic Hea...
Interpretable and robust hospital readmission predictions from Electronic Hea...Paolo Missier
 
Data-centric AI and the convergence of data and model engineering: opportunit...
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...Paolo Missier
 
Realising the potential of Health Data Science: opportunities and challenges ...
Realising the potential of Health Data Science:opportunities and challenges ...Realising the potential of Health Data Science:opportunities and challenges ...
Realising the potential of Health Data Science: opportunities and challenges ...Paolo Missier
 
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)Paolo Missier
 
A Data-centric perspective on Data-driven healthcare: a short overview
A Data-centric perspective on Data-driven healthcare: a short overviewA Data-centric perspective on Data-driven healthcare: a short overview
A Data-centric perspective on Data-driven healthcare: a short overviewPaolo Missier
 
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Paolo Missier
 
Tracking trajectories of multiple long-term conditions using dynamic patient...
Tracking trajectories of  multiple long-term conditions using dynamic patient...Tracking trajectories of  multiple long-term conditions using dynamic patient...
Tracking trajectories of multiple long-term conditions using dynamic patient...Paolo Missier
 
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...Paolo Missier
 
Digital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcareDigital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcarePaolo Missier
 
Digital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcareDigital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcarePaolo Missier
 
Data Provenance for Data Science
Data Provenance for Data ScienceData Provenance for Data Science
Data Provenance for Data SciencePaolo Missier
 
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Paolo Missier
 
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...Paolo Missier
 
Data Science for (Health) Science: tales from a challenging front line, and h...
Data Science for (Health) Science:tales from a challenging front line, and h...Data Science for (Health) Science:tales from a challenging front line, and h...
Data Science for (Health) Science: tales from a challenging front line, and h...Paolo Missier
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...Paolo Missier
 
ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...Paolo Missier
 
ReComp, the complete story: an invited talk at Cardiff University
ReComp, the complete story:  an invited talk at Cardiff UniversityReComp, the complete story:  an invited talk at Cardiff University
ReComp, the complete story: an invited talk at Cardiff UniversityPaolo Missier
 
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...Paolo Missier
 
Decentralized, Trust-less Marketplace for Brokered IoT Data Trading using Blo...
Decentralized, Trust-less Marketplacefor Brokered IoT Data Tradingusing Blo...Decentralized, Trust-less Marketplacefor Brokered IoT Data Tradingusing Blo...
Decentralized, Trust-less Marketplace for Brokered IoT Data Trading using Blo...Paolo Missier
 

Mehr von Paolo Missier (20)

Towards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsTowards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance records
 
Interpretable and robust hospital readmission predictions from Electronic Hea...
Interpretable and robust hospital readmission predictions from Electronic Hea...Interpretable and robust hospital readmission predictions from Electronic Hea...
Interpretable and robust hospital readmission predictions from Electronic Hea...
 
Data-centric AI and the convergence of data and model engineering: opportunit...
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...
 
Realising the potential of Health Data Science: opportunities and challenges ...
Realising the potential of Health Data Science:opportunities and challenges ...Realising the potential of Health Data Science:opportunities and challenges ...
Realising the potential of Health Data Science: opportunities and challenges ...
 
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
 
A Data-centric perspective on Data-driven healthcare: a short overview
A Data-centric perspective on Data-driven healthcare: a short overviewA Data-centric perspective on Data-driven healthcare: a short overview
A Data-centric perspective on Data-driven healthcare: a short overview
 
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
 
Tracking trajectories of multiple long-term conditions using dynamic patient...
Tracking trajectories of  multiple long-term conditions using dynamic patient...Tracking trajectories of  multiple long-term conditions using dynamic patient...
Tracking trajectories of multiple long-term conditions using dynamic patient...
 
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
 
Digital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcareDigital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcare
 
Digital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcareDigital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcare
 
Data Provenance for Data Science
Data Provenance for Data ScienceData Provenance for Data Science
Data Provenance for Data Science
 
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
 
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
 
Data Science for (Health) Science: tales from a challenging front line, and h...
Data Science for (Health) Science:tales from a challenging front line, and h...Data Science for (Health) Science:tales from a challenging front line, and h...
Data Science for (Health) Science: tales from a challenging front line, and h...
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...
 
ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...
 
ReComp, the complete story: an invited talk at Cardiff University
ReComp, the complete story:  an invited talk at Cardiff UniversityReComp, the complete story:  an invited talk at Cardiff University
ReComp, the complete story: an invited talk at Cardiff University
 
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
Efficient Re-computation of Big Data Analytics Processes in the Presence of C...
 
Decentralized, Trust-less Marketplace for Brokered IoT Data Trading using Blo...
Decentralized, Trust-less Marketplacefor Brokered IoT Data Tradingusing Blo...Decentralized, Trust-less Marketplacefor Brokered IoT Data Tradingusing Blo...
Decentralized, Trust-less Marketplace for Brokered IoT Data Trading using Blo...
 

KĂźrzlich hochgeladen

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

KĂźrzlich hochgeladen (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Invited talk: Second Search Computing workshop

  • 1. Janus Provenance Workflows for Information Integration in the Life Sciences Part 2: Workflows in context Paolo Missier Information Management Group School of Computer Science, University of Manchester, UK Search Computing workshop Como, Italy, May 28th, 2010
  • 2. Purpose and outline • In the first part we have seen a case study in system biology – showcasing scientific workflows at work • Two key questions: 1.Workflows live within an eco-system of models, tools and technologies. Can any of these benefit / complement the SeCo paradigm? 2.What is the relationship between a dataflow model (Taverna) and a SeCo query plan? • To inform the discussion, we focus of some elements of a workflow’s lifecycle – importing services – benefits of domain-specific service collections – collecting and querying provenance traces 2
  • 3. An eco-system for workflows Process-centric science lifecycle
  • 4. An eco-system for workflows Process-centric science lifecycle Service discovery and import
  • 5. An eco-system for workflows Process-centric science lifecycle Service discovery and import Data Metadata Methods - inputs - provenance - the workflow - parameters - annotations - results
  • 6. An eco-system for workflows Process-centric science lifecycle Service discovery and import Data Metadata Methods - inputs - provenance - the workflow - parameters - annotations - results
  • 7. WSDL to Taverna processors P11 P12 ... + + - ... P1 Op1(p11, p12, p13, ...) O1 ... WSDL ... service + - +... spec Pk1 Pk3 ... Opk(pk1, pk2, pk3, ...) Pk O2 ... 4
  • 8. WSDL to Taverna processors P11 P12 ... + + - ... P1 Op1(p11, p12, p13, ...) O1 ... WSDL ... service + - +... spec Pk1 Pk3 ... Opk(pk1, pk2, pk3, ...) Pk O2 ... 4
  • 9. WSDL to Taverna processors P11 P12 ... + + - ... P1 Op1(p11, p12, p13, ...) O1 ... WSDL ... service + - +... spec Pk1 Pk3 ... Opk(pk1, pk2, pk3, ...) Pk O2 ... 4
  • 10. WSDL to Taverna processors P11 P12 ... + + - ... P1 Op1(p11, p12, p13, ...) O1 ... WSDL ... service + - +... spec Pk1 Pk3 ... Opk(pk1, pk2, pk3, ...) Pk O2 ... 4
  • 11. Service composition requires adapters Example: SBML model optimisation workflow (see part I) -- designed by Peter Li http://www.myexperiment.org/workflows/1201
  • 12. Service composition requires adapters Example: SBML model optimisation workflow (see part I) -- designed by Peter Li http://www.myexperiment.org/workflows/1201
  • 13. Service composition requires adapters Example: SBML model optimisation workflow (see part I) -- designed by Peter Li http://www.myexperiment.org/workflows/1201 String[] lines = inStr.split("n"); StringBuffer sb = new StringBuffer(); for(i = 1; i < lines.length -1; i++) { String str = lines[i]; str = str.replaceAll("<result>", ""); str = str.replaceAll("</result>", ""); sb.append(str.trim() + "n"); } String outStr = sb.toString(); Url -> content (built-in shell script) import java.util.regex.Pattern; import java.util.regex.Matcher; sb = new StringBuffer(); p = "CHEBI:[0-9]+"; Pattern pattern = Pattern.compile(p); Matcher matcher = pattern.matcher(sbrml); while (matcher.find()) { sb.append("urn:miriam:obo.chebi:" + matcher.group() + ","); } String out = sb.toString(); //Clean up if(out.endsWith(",")) out = out.substring(0, out.length()-1); chebiIds = out.split(",");
  • 14. Services in the wild and “in the pen” Example 1: BioMoby / Taverna plugin – provides a platform to • exchange common data representation formats • provide methods for service discovery – offers more than 800 data retrieval and analysis services 6
  • 15. Services in the wild and “in the pen” Example 1: Example 2: BioMoby / Taverna plugin caGrid – provides a platform to – part of caBIG (cancer Biomedical • exchange common data Informatics Grid) representation formats – a US project to carry out • provide methods for service eScience and bioinformatics in discovery cancer research – offers more than 800 data retrieval and analysis services 6
  • 16. Targeting specific service collections Well-behaved service collections exist in specific domains • Cost: may require dedicated access plugins • Benefits: – well-curated ➔ easier to discover – designed to work together ➔ easier to compose
  • 17. Targeting specific service collections Well-behaved service collections exist in specific domains • Cost: may require dedicated access plugins • Benefits: – well-curated ➔ easier to discover – designed to work together ➔ easier to compose ChemTaverna: a blend of generic + chemistry-specific components required components reflect best practices in the specific domain
  • 18. Targeting specific service collections Well-behaved service collections exist in specific domains • Cost: may require dedicated access plugins • Benefits: – well-curated ➔ easier to discover – designed to work together ➔ easier to compose ChemTaverna: a blend of generic + chemistry-specific components required components reflect best practices in the specific domain • Data I/O: ‣e.g. loading input from / writing results to spreadsheets • Data visualisation library • Data manipulation: ‣removal, filtering, splitting, merging, transposition • Analysis and format transformation ‣(PubChem, KEGG, ..., R scripts) • Service composition and seamless data flow: ‣dedicated library of reusable adapters (as opposed to ad hoc)
  • 19. Complementary models: provenance Service discovery and import Data Metadata Methods - inputs - provenance - the workflow - parameters - annotations - results
  • 20. Complementary models: provenance Service discovery and import Data Metadata Methods - inputs - provenance - the workflow - parameters - annotations - results
  • 21. Taverna workflow provenance A detailed trace of workflow execution - data dependencies - order of processor execution - processors’ inputs/outputs - union of these forms a DAG Model realisation: - relational (native) - RDF (based on a provenance ontology) - Open Provenance Model (XML or RDF) 9
  • 22. Taverna workflow provenance A detailed trace of workflow execution - data dependencies - order of processor execution - processors’ inputs/outputs - union of these forms a DAG Model realisation: - relational (native) - RDF (based on a provenance ontology) - Open Provenance Model (XML or RDF) 9
  • 23. Taverna workflow provenance A detailed trace of workflow execution - data dependencies - order of processor execution lister - processors’ inputs/outputs - union of these forms a DAG get pathways by genes1 Model realisation: merge pathways - relational (native) gene_id - RDF (based on a provenance ontology) concat gene pathway ids - Open Provenance Model (XML or RDF) output pathway_genes 9
  • 24. Potential benefits of provenance metadata • To establish quality, relevance, trust • To track information attribution through complex transformations • To describe one’s experiment to others, for understanding / reuse • To provide evidence in support of scientific claims • To enable post hoc process analysis for improvement, re-design More specifically: • Causal relations: - which input values contributed to computing an output value? - which process(es) caused data to be incorrect? - which data caused a process to fail? • Process and data analytics: – analyze variations in output vs an input parameter sweep – how often has my favourite service been executed? on what inputs? – who produced this data? See use cases and requirements from the W3C Incubator on Provenance http://www.w3.org/2005/Incubator/prov/wiki 10
  • 25. Querying Taverna provenance provenance access Provenance API (Java) capture <select> <workflow name="lymphoma"> Query <processor name="fetch"> <outputPort name="value" index="[1,2]" /> </processor> ... </select> <focus> Provenance <workflow name="exprAnalysis"> lineage DB <processor name="p1" /> </workflow> query ... </focus> processor Native query answer [1] (Java objects) RDF query answer [2] (Janus ontology) Tupelo OPM graph (RDF/XML) OPM RDF OPM graph graph [3] manager manager 11
  • 26. Taverna workflows and SeCo queries Is there a possible relationship between a dataflow model (Taverna) and a SeCo query plan? • Could any of the bioinformaitcs / sysmo workflows presented earlier have been expressed as SeCo queries? • More generally: – under what assumptions can a hand-coded workflow be replaced by a query plan? • In other words: – is the SeCo query model interesting from the point of view of system biology(or other bioinf domain) research? – would it let scientists express their service compositions at a higher level? 12
  • 27. Taverna computational model (very briefly) [ [ mmu:26416, ... ], [ mmu:328788, ... ] ] • Collection processing • Simple type system • no record / tuple structure • data driven computation • with optional processor synchronisation • parallel processor activation • greedy (no scheduler) [ path:mmu04010 MAPK signaling, path:mmu04370 VEGF signaling ] [ [ path:mmu04210 Apoptosis, path:mmu04010 MAPK signaling, ...], [ path:mmu04010 MAPK signaling , path:mmu04620 Toll-like receptor, ...] ] EDBT, Lausanne, March 2010
  • 28. Parallelism in Taverna • intra-processor: implicit iteration over collections • inter-processor: pipelining [ a, b, c,...] [ (echo_1 a), (echo_1 b), (echo_1 c)] (echo_2 (echo_1 a)) (echo_2 (echo_1 b)) (echo_2 (echo_1 c)) But: no “chunking” no repeated stateful calls to processors no ranking 14
  • 29. Summary, challenges, and opportunities • Service composition requires adapters • Target well-behaved collections of services – potentially low-hanging fruits • The Taverna workflow enactor comes with a provenance capture and query component • Two key questions: 1.Workflows live within eco-system of models, tools and technologies. Can any of these benefit / complement the SeCo paradigm? 2.what is the relationship between a dataflow model (Taverna) and a SeCo query plan? • how does the SeCo query model deal with data integration? • i.e., the adapters that account for the reality of data heterogeneity (in format and content) • Is there a benefit in enhancing Taverna to support SeCo query plans? • can the SeCo model take advantage of existing provenance models? 15
  • 30. (Shameless self) references (1) P. Missier, N. Paton, and K. Belhajjame, "Fine-grained and efficient lineage querying of collection-based workflow provenance," Procs. EDBT, Lausanne, Switzerland: 2010. (2) P. Missier, S.S. Sahoo, J. Zhao, A. Sheth, and C. Goble, "Janus: from Workflows to Semantic Provenance and Linked Open Data," Procs. IPAW 2010, Troy, NY: 2010. (3) P. Missier and C. Goble, “Workflows to Open Provenance Graphs, round-trip”, Future Generation Computing Systems Journal, Special issue on the Open Provenance Model, submitted. (4) P. Missier, S. Soiland-Reyes, S. Owen, W. Tan, A. Nenadic, I. Dunlop, A. Williams, T. Oinn, and C. Goble, "Taverna, reloaded," Procs. SSDBM 2010, M. Gertz, T. Hey, and B. Ludaescher, Heidelberg, Germany: 2010. 16