



       The open source ISA metadata tracking
  framework: from data curation and management at
        the source, to the linked data universe

     BOSC, Long Beach, July 13-14, 2012

     Philippe Rocca-Serra (Ph. D)
     ISA Team
     twitter: @isatools.org




                                          philippe.rocca-serra@oerc.ox.ac.uk
                                                     http://www.isa-tools.org

Friday, 13 July 2012




                                MAIN THEME:
         It is all about structuring experimental information to make it
          available to computer and software agents to enable mining.

                         But let’s proceed gradually…




                       • Notes in Lab Books (information for humans)
                       • Spreadsheets and Tables (the compromise)
                       • Facts as RDF statements (information for machines)
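
To make the step from a spreadsheet row to machine-readable facts concrete, here is a minimal sketch in Python. It is only an illustration: the namespace `http://example.org/isa/`, the column names and the function `row_to_ntriples` are assumptions made for this example and are not part of the ISA tools. The values (H1, H. Sapiens, 35 Years) are taken from the example records shown later in the talk.

```python
from urllib.parse import quote

# Hypothetical namespace, chosen only for this illustration.
BASE = "http://example.org/isa/"


def row_to_ntriples(row_id, row):
    """Turn one spreadsheet row (column -> value) into N-Triples lines."""
    subject = f"<{BASE}record/{quote(row_id)}>"
    lines = []
    for column, value in row.items():
        # One column becomes one predicate; spaces are not allowed in IRIs.
        predicate = f"<{BASE}property/{quote(column.replace(' ', '_'))}>"
        # Escape backslashes and double quotes so the literal stays valid.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


# Column names are assumed; the slide does not show table headers.
row = {"organism": "H. Sapiens", "age": "35", "age unit": "Years"}
triples = row_to_ntriples("H1", row)
for line in triples:
    print(line)
```

Each cell becomes one subject/predicate/object statement, which is the sense in which a table is a compromise between notes for humans and facts for machines.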








                               Observations
         • Experiments are expensive and often publicly funded, yet
           many never see the light of day.
         • Spreadsheets are the most common vehicle for tracking
           experimental metadata in so-called ‘omics’ (functional
           genomics) work.
         • Technology-centric repositories form de facto silos.
         • Conversions are required to allow deposition to public
           databases.
         • Submitting the same information to a series of
           repositories is inefficient.







                       Case Study






                       Many ontologies, Many Formats, Many
                                Requirements…


                                        Grr…Where are the
                                        tools!?!




                                Credits: http://liverpoolsolfed.wordpress.com/resources/image-bank/demonstration/






                       ISA framework overview




Why ISA format and Tools?

           – Supports data provenance tracking
           – Built on an underlying node/edge (graph) concept
           – Tabular format as a compromise: a presentation layer inspired by
             object models (FuGE, MAGE-OM)
           – A generic representation, applied to:
              • microarray-based experiments (MAGE)
              • sequencing-based experiments (SRA)
              • flow cytometry-based experiments (FuGE-Flow Cyt)
              • mass spectrometry and NMR spectroscopy experiments




Why ISA format and Tools?


  ISA hierarchy (diagram):

     • investigation: high level concept to link related studies
     • study: the central unit, containing information on the subject
       under study, its characteristics and any treatments applied.
       A study has associated assays.
     • assay: test performed either on material taken from the subject
       or on the whole initial subject, which produce qualitative or
       quantitative measurements (data)
     • assay(s): pointers to data file names/location
     • data: external files in native or other formats

  Example records, shown as a table:

     H1   H. Sapiens   35 Years   H1.sample1   Labeling   H1.sample1.labeled   h1-s1.cel
     H1   H. Sapiens   35 Years   H1.sample2                                   h1-s2.cel
     H2   H. Sapiens   33 Years   H2.sample1   Labeling   H2.sample1.labeled   h2-s1.cel

  The same records, shown as a graph:

     H1 (H. Sapiens, 35 Years):  H1.sample1, Labeling, H1.sample1.labeled, h1-s1.cel
                                 H1.sample2, h1-s2.cel
     H2 (H. Sapiens, 33 Years):  H2.sample1, Labeling, H2.sample1.labeled, h2-s1.cel

  ISA metadata specifications:
     • workflow and process orientated
     • compatible with checklist enforcement
     • compatible with external vocabulary resources
     • compatible by design with existing schemas: MAGE-Tab, Pride-xml, SRA-xml

  Currently finalizing conversion to RDF to explore the growing Linked Data
  universe (in collaboration with the W3C HCLSIG and the Toxbank Consortium)
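
The investigation / study / assay hierarchy described on this slide can be sketched as a handful of Python dataclasses. This is a rough illustration under stated assumptions, not the ISA object model: every class, field and function name below (`Source`, `Sample`, `AssayRun`, `files_for_source`, and so on) is hypothetical. Only the record values come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    name: str
    organism: str
    age: str


@dataclass
class Sample:
    name: str
    derives_from: str  # name of the Source the sample was taken from


@dataclass
class AssayRun:
    sample: str
    data_file: str  # a pointer to an external file, not the data itself


@dataclass
class Assay:
    runs: List[AssayRun] = field(default_factory=list)


@dataclass
class Study:
    sources: List[Source] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    studies: List[Study] = field(default_factory=list)


def files_for_source(investigation, source_name):
    """Follow source -> sample -> assay run and collect the data file names."""
    files = []
    for study in investigation.studies:
        samples = {s.name for s in study.samples if s.derives_from == source_name}
        for assay in study.assays:
            files.extend(r.data_file for r in assay.runs if r.sample in samples)
    return files


study = Study(
    sources=[Source("H1", "H. Sapiens", "35 Years"), Source("H2", "H. Sapiens", "33 Years")],
    samples=[Sample("H1.sample1", "H1"), Sample("H1.sample2", "H1"), Sample("H2.sample1", "H2")],
    assays=[Assay(runs=[
        AssayRun("H1.sample1", "h1-s1.cel"),
        AssayRun("H1.sample2", "h1-s2.cel"),
        AssayRun("H2.sample1", "h2-s1.cel"),
    ])],
)
investigation = Investigation(studies=[study])
print(files_for_source(investigation, "H1"))
```

The point of the sketch is that an assay only holds file names, so provenance questions ("which files came from subject H1?") are answered by walking the links, not by opening the data.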




ISA syntax and Table definition

• Material Transformations:
     – Inputs and Outputs of Protocols are Material Nodes (Source Name, Sample Name, Extract Name, Labeled Extract Name).

• Node types shown in the diagram, with the fields attached to each:

     Material Node (input and output of a protocol)
        Characteristics[…]
        Factor Value[…] (independent variables)
        Material Type
        Comment[…]

     Protocol REF
        Parameter Value[…]
        Performer (operator effect)
        Date (day effect)

     Data File Node
        Comment[…]
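
As an illustration of how such a table encodes a graph, the sketch below reads a tab-delimited table and splits its columns into nodes, node attributes and protocol-labelled edges. It is a simplified sketch, not the ISA parser: the rule "a column ending in ` Name` or ` File` is a node" and the header row itself (including `Raw Data File` and `Unit`) are assumptions made for this example, since the slide shows the records without their headers.

```python
import csv
import io

# Assumed convention for this sketch: these column suffixes mark graph nodes.
NODE_SUFFIXES = (" Name", " File")

HEADER = ["Source Name", "Characteristics[organism]", "Characteristics[age]", "Unit",
          "Sample Name", "Protocol REF", "Labeled Extract Name", "Raw Data File"]
ROWS = [
    ["H1", "H. Sapiens", "35", "Years", "H1.sample1", "Labeling", "H1.sample1.labeled", "h1-s1.cel"],
    ["H1", "H. Sapiens", "35", "Years", "H1.sample2", "", "", "h1-s2.cel"],
    ["H2", "H. Sapiens", "33", "Years", "H2.sample1", "Labeling", "H2.sample1.labeled", "h2-s1.cel"],
]
TABLE = "\n".join("\t".join(cells) for cells in [HEADER] + ROWS)


def parse_table(text):
    """Return (nodes, edges) where edges are (from, protocol or None, to)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    nodes, edges = {}, set()
    for row in reader:
        previous, protocol = None, None
        for column, value in zip(header, row):
            if not value:
                continue  # empty cell: this step does not apply to the row
            if column.endswith(NODE_SUFFIXES):
                nodes.setdefault(value, {"type": column})
                if previous is not None:
                    edges.add((previous, protocol, value))
                previous, protocol = value, None
            elif column == "Protocol REF":
                protocol = value  # labels the edge to the next node
            elif previous is not None:
                nodes[previous][column] = value  # attribute of the last node
    return nodes, edges


nodes, edges = parse_table(TABLE)
for edge in sorted(edges, key=str):
    print(edge)
```

Reading left to right, every attribute column attaches to the nearest node on its left, which is why the column order in the table matters.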






                       ISAconfigurator Tables












             How do ISA tools access Ontology servers?




The ISAcreator...




                              isacreator
  Developed to be a user-friendly way to
  enter standards-compliant metadata: it
  has lots of features...


    But these are just some of them... we
    also have a data entry wizard and an
    import utility...







                       Select and Annotate in ISAcreator




Extending ISAcreator
                           The Plugin Architecture




Plugins in ISAcreator

     In ISAcreator, we use the Apache Felix implementation of the OSGi framework... it’s really good.

     • Plugins can be developed for 3 different purposes:
        • Search (adds extra search space for the ontology tool)
        • Custom cell editors (for the spreadsheet)
        • Extra general functionality (which appears in a plugin menu)

     • 2 examples of ISA plugins:
        • Access to local metadata stores: Novartis Plugin to Ontology Widget
        • Annotation of findings: Metabolite Identification Plugin (Metabolights Repository contribution to the ISA project).
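
The "search" kind of plugin can be pictured with a small sketch. ISAcreator itself is a Java application using OSGi, so the Python below is not its plugin API; it is only an analogy. The class `OntologySearch`, the function `local_store_plugin` and the vocabulary are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List, Tuple

# In this sketch a search plugin is just a function: query -> [(term, source)].
SearchPlugin = Callable[[str], List[Tuple[str, str]]]


class OntologySearch:
    """Hypothetical host tool that merges results from registered plugins."""

    def __init__(self) -> None:
        self._plugins: Dict[str, SearchPlugin] = {}

    def register(self, name: str, plugin: SearchPlugin) -> None:
        self._plugins[name] = plugin

    def search(self, query: str) -> List[dict]:
        results = []
        for name, plugin in self._plugins.items():
            for term, source in plugin(query):
                # The host records the term source on behalf of the user.
                results.append({"term": term, "term_source": source, "plugin": name})
        return results


def local_store_plugin(query: str) -> List[Tuple[str, str]]:
    """Stand-in for a plugin that queries a local metadata store."""
    vocabulary = ["liver", "liver lobe", "kidney"]  # made-up local terms
    return [(term, "LOCAL") for term in vocabulary if query.lower() in term]


search = OntologySearch()
search.register("local-store", local_store_plugin)
hits = search.search("liver")
for hit in hits:
    print(hit)
```

Adding a plugin extends the search space without changing the host, which is the property the plugin directory gives ISAcreator users.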




Plugins Example 1 - Novartis Metastore Search




                           Search function on the Novartis
                           Metastore... integrates search results
                           on the metastore in the Ontology
                           search tool.

                           So, with the Novartis plugin in your
                           Plugin directory, you’ll be able to
                           search the Novartis metastore
                           directly within ISAcreator, and it will
                           handle all the tasks involved with
                           recording term source, etc.




Plugins Example 2 - Metabolite Identification plugin




     Credits: Kenneth Haug: Metabolights






                       Potential Issues and known hurdles


         • The problem of conflicting versions
           – especially acute when working with big consortia
           – distributed, decentralized groups of users
         • Lack of version control and history
         • Absence of collaborative features

               – Looking for new solutions while retaining the features!
                  • OntoMaton: bringing Google Docs, NCBO BioPortal and
                    ISA-TAB together!


OntoMaton: Searching




OntoMaton: Tagging




OntoMaton
                       • Public release: http://goo.gl/2OKFV
                       • Can be used in any Google Spreadsheet
                         document

                       • Application:
                        • Annotating data records
                        • Supporting ontology development (see OBI
                           Quick Term Templates)






                             ISA2RDF work in progress
         • Use case on the W3C HCLS scientific discourse list
               – Deciding on the granularity of representation
               – Building on previous experience
               – Evaluating alternative representations
         • Participation in the BioHackathon 2011
               – http://blogs.openaccesscentral.com/blogs/bmcblog/entry/biohackathon_2011_number_1
               – Discussing best practices
                       • PURL URIs and identifiers.org as identifiers
                       • Open PHACTS guidelines (http://www.nanopub.org/guidelines/OpenPHACTS_Nanopublication_Guidlines_v1.8.1.pdf)
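
As a small illustration of the identifier practice mentioned above, the sketch below turns a free-text reference such as "PMID: 22449719" into a resolvable identifiers.org URI. The `prefix:accession` URI layout and the `pubmed` prefix are assumptions about the identifiers.org scheme, and the function name and lookup table are hypothetical.

```python
import re

# Matches "<label>: <accession>", e.g. "PMID: 22449719".
CITATION = re.compile(r"^(?P<label>[A-Za-z]+):\s*(?P<accession>\S+)$")

# Assumed mapping from the label used in text to an identifiers.org prefix.
PREFIXES = {"PMID": "pubmed"}


def to_resolvable_uri(citation: str) -> str:
    """Build an identifiers.org URI from a 'LABEL: accession' string."""
    match = CITATION.match(citation.strip())
    if match is None or match["label"] not in PREFIXES:
        raise ValueError(f"cannot build an identifier from {citation!r}")
    return f"https://identifiers.org/{PREFIXES[match['label']]}:{match['accession']}"


uri = to_resolvable_uri("PMID: 22449719")
print(uri)
```

Keeping the label-to-prefix table separate from the function means new identifier types can be supported by adding one entry.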

Preparing for Linked Open Data
                   ✴   ISA2RDF (Toxbank collaboration): contribution to an
                       ecosystem of software tools supporting the ISA syntax
                   ✴   Reliance on internet-resolvable identifiers
                   ✴   W3C bio/life science Note on Gene Expression RDF
                       (PMID: 22449719)
                   ✴   TODO:
                       ✴   Specify comparator groups, analysis methods, and the
                           resulting measurements and statistical measures






                                    ISA2RDF: work in progress




                       jeliazkova.nina
                       [toxbank project]
ISA2OWL

                       • OWLAPI
                       • ISA Parser (in-memory BII object store objects)

                       • Mapping ISA syntax into a target ontological space
                       • Decoupling the mapping from the conversion engine
                        • avoids being tied to one semantic framework
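
The idea of decoupling the mapping from the conversion engine can be sketched as follows. This is an illustration of the design principle only, not ISA2OWL code (which builds on the Java OWLAPI): the two mapping tables, all `example.org` IRIs and the function `convert` are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Two interchangeable mappings from ISA column names to target classes.
# All class IRIs are placeholders invented for this sketch.
MAPPING_A = {
    "Source Name": "http://example.org/onto-a/MaterialEntity",
    "Raw Data File": "http://example.org/onto-a/InformationEntity",
}
MAPPING_B = {
    "Source Name": "http://example.org/onto-b/Material",
    "Raw Data File": "http://example.org/onto-b/DataItem",
}


def convert(records, mapping, base="http://example.org/data/"):
    """Conversion engine: it only sees the mapping, never a specific ontology."""
    triples = []
    for record in records:
        for column, value in record.items():
            target = mapping.get(column)
            if target is None:
                continue  # column not covered by this mapping: skipped here
            triples.append((base + value, RDF_TYPE, target))
    return triples


records = [{"Source Name": "H1", "Sample Name": "H1.sample1", "Raw Data File": "h1-s1.cel"}]
triples_a = convert(records, MAPPING_A)
triples_b = convert(records, MAPPING_B)
for triple in triples_a:
    print(triple)
```

Swapping `MAPPING_A` for `MAPPING_B` changes the target ontological space without touching `convert`, which is the benefit the slide is after.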

ISA2OWL: mapping in the
                       BFO space as starting point




ISA2OWL: mapping issues

                       • Stability over time
                       • Keeping track of resource versions
                       • Gaps in coverage
                           • Use of local extensions
                           • Direct requests/contributions

ISA2OWL: development

                       • include graph metadata (graph provenance to aid
                         indexing)

                       • extend semantic validation of ISA archive
                       • augment annotation by suggesting additions
                       • facilitate curation work
                       • create new mappings to other frameworks
                         (OPML model, SIO)






        Publication...



                       ISA software suite: supporting standards-compliant
                       experimental annotation and enabling curation at the
                       community level
                       Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn Field; Stephen Harris;
                       Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-Assunta Sansone
                       Bioinformatics 2010 26: 2354-2356








            Acknowledgements

         Groups and individuals participating in:
         MIBBI http://mibbi.org
         ISA-Tab format http://isatab.sf.net
         OBO Foundry http://obofoundry.org
         OBI: http://obi-ontology.org/page/Main_Page

         ISA Infrastructure Team:
         Alejandra Gonzalez-Beltran (Oxford)
         Eamonn Maguire (Oxford)
         Philippe Rocca-Serra (Oxford)

         Collaborators at:
         Cambridge University
         EuNuGO
         Harvard School for Public Health
         FDAs NCTR
         Leibniz Plant Institute
         NERCs NEBC
         SIDR, INIST
         Metabolights, EMBL-EBI

         Funders:
         EU Carcinogenomics Project
         UK BBSRC





                       Groups and individuals participating in:
                       Winston Hide: HSPH
                       Oliver Hofmann: HSPH
                       Shannan Ho Sui : HSPH
                       Brad Chapman: HSPH
                       Christoph Steinbeck: Metabolights
                       Kenneth Haug: Metabolights
                       Paula de Matos: Metabolights
                       Magali Roux: INIST
                       Florian Mazur: INIST
                       Alain Zasadzinki: INIST
                       Marie Christine Jacquemot: INIST
                       Nina Jeliazkova: ToxBank

                       And many more who have to forgive us!






                       Questions:




 
B Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumB Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumJan Aerts
 
J Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJ Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJan Aerts
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloudJan Aerts
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisJan Aerts
 
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...Jan Aerts
 
S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...Jan Aerts
 
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...Jan Aerts
 
A Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsA Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsJan Aerts
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesJan Aerts
 
B Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoB Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoJan Aerts
 
D Baker - Galaxy Update
D Baker - Galaxy UpdateD Baker - Galaxy Update
D Baker - Galaxy UpdateJan Aerts
 

Mehr von Jan Aerts (20)

VIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic VariationVIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic Variation
 
Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?
 
Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?
 
Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013
 
Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)
 
Humanizing Data Analysis
Humanizing Data AnalysisHumanizing Data Analysis
Humanizing Data Analysis
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualization
 
L Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsL Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformatics
 
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
 
B Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumB Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing Consortium
 
J Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJ Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis Framework
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloud
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysis
 
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
 
S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...
 
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
 
A Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsA Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining components
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutes
 
B Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoB Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUno
 
D Baker - Galaxy Update
D Baker - Galaxy UpdateD Baker - Galaxy Update
D Baker - Galaxy Update
 

Kürzlich hochgeladen

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Kürzlich hochgeladen (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

P Rocca-Serra - The open source ISA metadata tracking framework: from data curation and management at the source, to the linked data universe

  • 1. The open source ISA metadata tracking framework: from data curation and management at the source, to the linked data universe. BOSC, Long Beach, July 13-14, 2012. Philippe Rocca-Serra (Ph. D), ISA Team. twitter: @isatools.org, philippe.rocca-serra@oerc.ox.ac.uk, http://www.isa-tools.org. Friday, 13 July 2012
  • 5. MAIN THEME: It is all about structuring experimental information to make it available to computers and software agents to enable mining. But let’s proceed gradually…
      – Notes in Lab Books (information for humans)
      – Spreadsheets and Tables (the compromise)
      – Facts as RDF statements (information for machines)
  • 6. Observations
      • Experiments are expensive and often publicly funded, yet many never see the light of day.
      • Spreadsheets are the most common vehicle for tracking experimental metadata in so-called ‘omics’ (functional genomics).
      • Technology-centric repositories form de facto silos.
      • Conversions are required to allow for deposition to public databases.
      • Submitting common information to a series of repositories is inefficient.
  • 7. Case Study
  • 8. Many ontologies, Many Formats, Many Requirements… Grr… Where are the tools!?! Credits: http://liverpoolsolfed.wordpress.com/resources/image-bank/demonstration/
  • 9. ISA framework overview
  • 10. Why ISA format and Tools?
      – Supporting data provenance tracking
      – Node/Edge underlying concept
      – Tabular as a compromise: a presentation layer inspired by Object model (FuGE, MAGE-OM)
      – A generic representation, applied to:
          • microarray based experiments (MAGE)
          • sequencing based experiments (SRA)
          • flow cytometry based experiments (FuGE-Flow Cyt)
          • mass spectrometry and NMR spectroscopy experiments
  • 11. Why ISA format and Tools?
      • investigation: high-level concept to link related studies
      • study: the central unit, containing information on the subject under study, its characteristics and any treatments applied; a study has associated assays
      • assay: test performed either on material taken from the subject or on the whole initial subject, which produces qualitative or quantitative measurements (data)
      • assay(s): pointers to data file names/location
      • data: external files in native or other formats
      [Figure: example table rows. H1, H. Sapiens, 35 Years: H1.sample1, Labeling, H1.sample1.labeled, h1-s1.cel. H1, H. Sapiens, 35 Years: H1.sample2, h1-s2.cel. H2, H. Sapiens, 33 Years: H2.sample1, Labeling, H2.sample1.labeled, h2-s1.cel.]
      ISA metadata specifications:
      • workflow and process orientated
      • compatible with checklist enforcement
      • compatible with external vocabulary resources
      • compatible by design with existing schemas (MAGE-Tab, Pride-xml, SRA-xml)
      Currently finalizing conversion to RDF to explore the growing Linked Data universe (in collaboration with the W3C HCLSIG and the Toxbank Consortium).
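The investigation, study and assay levels described on this slide can be pictured as nested records. The following is a minimal sketch only, not the ISA tools' own object model: the class names, field names and the `data_files` helper are assumptions made for illustration, and the values are the example rows from the figure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assay:
    """A test performed on material from the subject; points to data files."""
    samples: List[str]
    data_files: List[str]


@dataclass
class Study:
    """The central unit: subjects, their characteristics, and associated assays."""
    sources: Dict[str, Dict[str, str]]
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """High-level concept linking related studies."""
    title: str
    studies: List[Study] = field(default_factory=list)


def data_files(investigation: Investigation) -> List[str]:
    """Walk investigation -> study -> assay and collect data file names."""
    return [f for s in investigation.studies for a in s.assays for f in a.data_files]


# Values taken from the slide's example rows; the structure is hypothetical.
study = Study(
    sources={
        "H1": {"organism": "H. Sapiens", "age": "35 Years"},
        "H2": {"organism": "H. Sapiens", "age": "33 Years"},
    },
    assays=[Assay(
        samples=["H1.sample1", "H1.sample2", "H2.sample1"],
        data_files=["h1-s1.cel", "h1-s2.cel", "h2-s1.cel"],
    )],
)
investigation = Investigation(title="example investigation", studies=[study])
print(data_files(investigation))
```

The point of the nesting is that data files are only ever reached through an assay, and an assay only through a study, which is what keeps the provenance chain intact.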
  • 16. ISA syntax and Table definition
      • Material Transformations:
          – Inputs and Outputs of Protocols are Material Nodes (Source Name, Sample Name, Extract Name, Labeled Extract Name)
      [Figure: Material Node → Protocol REF → Material Node, with a Data File Node]
      • Material Node: Characteristics[…], Factor Value[…] (independent variables), Material Type, Comment[…]
      • Protocol REF: Parameter Value[…], Performer (operator effect), Date (day effect)
      • Data File Node: Comment[…]
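To make the node/edge reading of the table concrete, here is a rough sketch that turns a tab-delimited table into (input node, protocol, output node) edges. It is an illustration under stated assumptions, not the ISA parser: the table content, the `Characteristics[organism]` header and the `read_edges` function are hypothetical, and only the node column names come from the slide.

```python
import csv
import io

# Hypothetical table; node column names follow the slide, values follow its figure.
TABLE = (
    "Source Name\tCharacteristics[organism]\tSample Name\tProtocol REF\tLabeled Extract Name\n"
    "H1\tH. Sapiens\tH1.sample1\tLabeling\tH1.sample1.labeled\n"
    "H2\tH. Sapiens\tH2.sample1\tLabeling\tH2.sample1.labeled\n"
)

NODE_COLUMNS = ("Source Name", "Sample Name", "Extract Name", "Labeled Extract Name")


def read_edges(text):
    """Return (input node, protocol, output node) triples, one per transformation."""
    edges = []
    # csv.reader rather than DictReader: column names may repeat within a row.
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    for row in reader:
        previous, protocol = None, None
        for column, value in zip(header, row):
            if column in NODE_COLUMNS:
                if previous is not None:
                    edges.append((previous, protocol, value))
                previous, protocol = value, None
            elif column == "Protocol REF":
                protocol = value
            # Attribute columns (Characteristics[...], Factor Value[...]) are skipped here.
    return edges


for edge in read_edges(TABLE):
    print(edge)
```

Reading left to right, each material node column closes one edge and opens the next, so the column order of the table carries the workflow.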
  • 17. ISAconfigurator Tables
  • 19. How do ISA tools access Ontology servers?
  • 20. The ISAcreator: developed to be a user-friendly way to enter standards-compliant metadata. It has lots of features... But these are just some of them: we also have a data entry wizard and an import utility.
  • 21. Select and Annotate in ISAcreator
  • 22. Extending ISAcreator: The Plugin Architecture
  • 23. Plugins in ISAcreator
      In ISAcreator, we use the Apache Felix implementation of the OSGi framework... it’s really good.
      • Plugins can be developed for 3 different purposes:
          – Search (adds extra search space for ontology tool)
          – Custom cell editors (for spreadsheet)
          – Extra general functionality (which appears in a plugin menu)
      • 2 examples of ISA plugins:
          – Access to local metadata stores: Novartis Plugin to Ontology Widget
          – Annotation of findings: Metabolite Identification Plugin (Metabolights Repository contribution to ISA project)
  • 24. Plugins, example 1: Novartis Metastore Search. The search function on the Novartis Metastore integrates metastore search results into the Ontology search tool. So, with the Novartis plugin in your Plugin directory, you’ll be able to search the Novartis metastore directly within ISAcreator, and it will handle all the tasks involved with recording term source, etc.
  • 25. Plugins, example 2: Metabolite Identification plugin. Credits: Kenneth Haug, Metabolights
  • 26. Potential issues and known hurdles
      • The problem of conflicting versions
          – especially acute when working with big consortia
          – distributed, decentralized groups of users
      • Lack of version control and history
      • Absence of collaborative features
          – Looking for new solutions while retaining the features!
      • OntoMaton: bringing Google Doc, NCBO Bioportal and ISA-TAB together!
  • 30. OntoMaton
      • Public release: http://goo.gl/2OKFV
      • Can be used in any Google Spreadsheet document
      • Application:
          – Annotating data records
          – Supporting ontology development (see OBI Quick Term Templates)
  • 31. ISA2RDF: work in progress
      • Use case on W3C HCLS scientific discourse list
          – Deciding on the granularity of representation
          – Building on previous experience
          – Evaluating alternative representations
      • Participation in the Biohackathon 2011
          – http://blogs.openaccesscentral.com/blogs/bmcblog/entry/biohackathon_2011_number_1
          – Discussing best practices
      • PURL URIs and identifiers.org as identifiers
      • Openphacts guidelines (http://www.nanopub.org/guidelines/OpenPHACTS_Nanopublication_Guidlines_v1.8.1.pdf)
  • 32. Preparing for Linked Open Data
      ✴ ISA2RDF (Toxbank collaboration): contribution to an ecosystem of software tools supporting the ISA syntax
      ✴ Reliance on internet-resolvable identifiers
      ✴ W3C bio/life science Note on Gene Expression RDF (PMID: 22449719)
      ✴ TODO: specify comparator groups, analysis methods, and the resulting measurements and statistical measures
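As a rough illustration of what moving from table rows to "facts as RDF statements" involves, the sketch below serializes one example row as N-Triples using only the standard library. It is not ISA2RDF output: the `http://example.org/isa/` namespace, the predicate names (`organism`, `age`, `derivesFrom`) and the helper functions are all placeholders chosen for this example.

```python
BASE = "http://example.org/isa/"  # placeholder namespace, not a real ISA vocabulary


def iri(local_name):
    """Wrap a local name in the placeholder namespace as an N-Triples IRI."""
    return "<" + BASE + local_name + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def row_to_triples(row):
    """Turn one table row into subject/predicate/object statements."""
    subject = iri("source/" + row["source"])
    sample = iri("sample/" + row["sample"])
    return [
        (subject, iri("organism"), literal(row["organism"])),
        (subject, iri("age"), literal(row["age"])),
        (sample, iri("derivesFrom"), subject),
    ]


def to_ntriples(triples):
    """Serialize triples, one statement per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Values from the slide's example rows; the dictionary keys are hypothetical.
row = {"source": "H1", "organism": "H. Sapiens", "age": "35 Years", "sample": "H1.sample1"}
print(to_ntriples(row_to_triples(row)))
```

In a real conversion the placeholder IRIs would be replaced by resolvable identifiers, which is the reliance the slide refers to.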
  • 35. ISA2RDF: work in progress. jeliazkova.nina [toxbank project]
  • 37. ISA2OWL
      • OWLAPI
      • ISA Parser (in-memory BII object store objects)
      • Mapping ISA syntax into target Ontological Space
      • Decoupling Mapping from Conversion Engine
      • Avoid being tied to a single semantic framework
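One way to picture the decoupling of mapping from conversion engine is to keep the mapping as plain data and pass it to a generic conversion function. This is a sketch under assumptions, not the ISA2OWL design: the two mapping tables, their `example.org` IRIs and the `convert` function are hypothetical; only the `rdf:type` IRI is a real, standard identifier.

```python
# Mapping tables: ISA syntax element -> class IRI in a target ontology.
# Both tables use placeholder IRIs for illustration only.
MAPPING_A = {
    "Source Name": "http://example.org/onto-a/MaterialEntity",
    "Protocol REF": "http://example.org/onto-a/PlannedProcess",
}
MAPPING_B = {
    "Source Name": "http://example.org/onto-b/Material",
    "Protocol REF": "http://example.org/onto-b/Process",
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def convert(nodes, mapping):
    """Type each (ISA column, identifier) node using whichever mapping is supplied."""
    triples = []
    for column, identifier in nodes:
        target = mapping.get(column)
        if target is None:
            continue  # gap in coverage: no mapping for this element
        triples.append((identifier, RDF_TYPE, target))
    return triples


nodes = [("Source Name", "H1"), ("Protocol REF", "Labeling"), ("Comment[note]", "x")]
print(convert(nodes, MAPPING_A))
print(convert(nodes, MAPPING_B))
```

Because `convert` never names an ontology, switching the target framework means supplying a different table rather than changing the engine.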
  • 38. ISA2OWL: mapping in the BFO space as starting point
  • 40. ISA2OWL: mapping issues
      • Stability over time
      • Keeping track of resource versions
      • Gaps in coverage
      • Use of local extensions
      • Direct requests/contributions
  • 41. ISA2OWL: development
      • include graph metadata (graph provenance to aid indexing)
      • extend semantic validation of ISA archive
      • augment annotation by suggesting additions
      • facilitate curation work
      • create new mappings to other frameworks (OPML model, SIO)
  • 42. Publication: "ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level". Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn Field; Stephen Harris; Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-Assunta Sansone. Bioinformatics 2010 26: 2354-2356
  • 43. Acknowledgements
      Groups and individuals participating in:
      • MIBBI: http://mibbi.org
      • ISA-Tab format: http://isatab.sf.net
      • OBO Foundry: http://obofoundry.org
      • OBI: http://obi-ontology.org/page/Main_Page
      ISA Infrastructure Team: Alejandra Gonzalez-Beltran (Oxford); Eamonn Maguire (Oxford); Philippe Rocca-Serra (Oxford)
      Collaborators at: Cambridge University; EuNuGO; Harvard School for Public Health; FDA's NCTR; Leibniz Plant Institute; NERC's NEBC; SIDR, INIST; Metabolights, EMBL-EBI
      Funders: EU Carcinogenomics Project; UK BBSRC
  • 44. Groups and individuals participating in:
      • HSPH: Winston Hide, Oliver Hofmann, Shannan Ho Sui, Brad Chapman
      • Metabolights: Christoph Steinbeck, Kenneth Haug, Paula de Matos
      • INIST: Magali Roux, Florian Mazur, Alain Zasadzinki, Marie Christine Jacquemot
      • ToxBank: Nina Jeliazkova
      And many more who have to forgive us!
  • 45. Questions