
Open interoperability standards, tools and services at EMBL-EBI

In this webinar Dr Henriette Harmse from EMBL-EBI presents how they are using their ontology services at EMBL-EBI to scale up the annotation of data and deliver added value through ontologies and semantics to their users.

  1. Open Interoperability Standards, Tools and Services at EMBL-EBI
     Henriette Harmse, PhD (Artificial Intelligence), Ontology Tools Lead, Samples, Phenotypes and Ontologies Team, EMBL-EBI (European Bioinformatics Institute)
     14 November 2019
  2. EMBL-EBI: Who are we?
     • European Bioinformatics Institute (EBI).
     • Part of the European Molecular Biology Laboratory.
     • Located at Wellcome Genome Campus, 10 miles south of Cambridge, UK.
     • We are a trusted source for biological and biomolecular data.
     • Our core mission is to enable life science research and its translation to medicine, agriculture, industry and society.
     • We have 780 staff members from 66 nations.
     • EMBL is an international organisation funded by over 20 member states.
     https://www.ebi.ac.uk/about/digital-bookshelf/publications/EMBL-EBI_Scientific_Report-2018.pdf
  3. Data Sources at EMBL-EBI
     • 270+ petabytes of raw data
     • 60 million daily requests
     [Figure labels: Data, Information, Knowledge, Applications]
  4. There's a lot of metadata... [Figure labels: tissues, cell lines, diseases]
  5. Challenges: Different Words refer to the Same Thing. Different ways to say "female".
  6. Challenges: The Same Word refers to Different Things. Tibia used in different contexts.
  7. Semantic Web Technologies
     Ontologies as controlled vocabularies on steroids:
     • Globally unique identifiers for concepts and relations, e.g. URI, IRI, PURL
     • Machine-readable syntax, e.g. XML, JSON-LD
     • Generic data model able to describe arbitrary content: RDF triples
       • <s, p, o> expresses that subject (s) and object (o) are related via predicate (p).
     • Query language for RDF: SPARQL
     • Equipped with formal semantics based on mathematical logic, which enable artificial intelligence reasoning procedures to infer implicit knowledge from explicit knowledge, e.g. RDFS and OWL.
     • JSON-LD, RDF, SPARQL, RDFS and OWL are W3C standards.
     A. Hogan, Linked Data & the Semantic Web Standards, in Linked Data Management (A. Harth, K. Hose, and R. Schenkel, eds.), Chapman and Hall/CRC, 2014, pp. 3–48.
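The triple model and the idea of inferring implicit knowledge from explicit knowledge can be illustrated with a small sketch. This is a toy in plain Python, not an RDF library or a real reasoner: the `ex:` identifiers are made-up placeholders, `match` only mimics a single SPARQL triple pattern, and `infer` applies just two RDFS-style rules (subclass transitivity and type inheritance).

```python
# Explicit knowledge as <subject, predicate, object> triples.
# The "ex:" identifiers are illustrative placeholders, not real ontology IRIs.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("ex:MastCell", SUBCLASS_OF, "ex:Leukocyte"),
    ("ex:Leukocyte", SUBCLASS_OF, "ex:Cell"),
    ("ex:sample42", TYPE, "ex:MastCell"),
}


def match(store, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard (like a variable)."""
    return {
        (ts, tp, to)
        for (ts, tp, to) in store
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    }


def infer(store):
    """Apply two RDFS-style rules repeatedly until no new triples appear."""
    closed = set(store)
    while True:
        new = set()
        for (a, p1, b) in closed:
            for (b2, p2, c) in closed:
                if b != b2 or p2 != SUBCLASS_OF:
                    continue
                if p1 == SUBCLASS_OF:
                    new.add((a, SUBCLASS_OF, c))  # subclass transitivity
                elif p1 == TYPE:
                    new.add((a, TYPE, c))  # instances belong to superclasses too
        if new <= closed:
            return closed
        closed |= new


inferred = infer(triples)
# Which classes does the sample belong to, once implicit facts are included?
sample_types = {o for (_, _, o) in match(inferred, s="ex:sample42", p=TYPE)}
```

Only the `ex:MastCell` type is stated explicitly for the sample; membership of `ex:Leukocyte` and `ex:Cell` is derived, which is the kind of reasoning RDFS and OWL formalise.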
  8. Open Biological and Biomedical Ontology (OBO) Foundry
     • Provides over 100 free ontologies, adhering to the principles of
       • open use,
       • collaborative development,
       • non-overlapping and strictly scoped content,
       • using a common syntax
       • and common relations.
     • There are many biological and biomedical terminology standards that reside outside of the OBO Foundry.
     • 239 ontologies are hosted on OLS, of which about half come from the OBO Foundry.
     http://www.obofoundry.org/
  9. EMBL-EBI Ontology Services Team: What we do
     • We build services to make ontologies accessible by humans (biological curators) and machines (pipelines).
     • We ensure that a consistent set of interoperable ontologies is used across public datasets to maximize interoperability.
     • We need ways to scale this up so that ontology terms can be assigned to metadata at scale.
     • Once data is aligned with the ontologies, we work with software developers to help them utilize ontologies.
  10. The Result: Integrated Data with Semantic Search
  11. Aligning your data to ontologies
      Organism: Homo sapiens; cell type: Mast cell; Disease: Type II diabetes mellitus; Organism part: pancreas
      Cell type ontology
      Where do you start?
  12. Typical questions
      • How do I access ontologies?
      • How do I annotate data with ontologies?
      • Which ontologies should I use?
      • What about data that doesn't map easily?
      • How can I translate from one ontology to another?
      • How do I build "ontology aware" applications?
  13. The Ontology Toolkit
      Open Source Software: https://github.com/EBISPOT
      http://www.ebi.ac.uk/spot/ontology
  14. Ontology Lookup Service (OLS)
      https://www.ebi.ac.uk/ols
      GitHub: https://github.com/EBISPOT/OLS
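For machine access, a term search against OLS over REST and JSON might look roughly like the sketch below. The `/api/search` path, the `q`, `ontology` and `rows` parameters, and the `response.docs` layout with `obo_id`, `label` and `ontology_name` fields are assumptions based on the OLS address given on the slide; the current OLS documentation should be checked before relying on any of them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed search endpoint under the OLS address from the slide.
OLS_SEARCH_URL = "https://www.ebi.ac.uk/ols/api/search"


def build_search_url(query, ontology=None, rows=5):
    """Build a search URL; parameter names are assumptions, see the lead-in."""
    params = {"q": query, "rows": rows}
    if ontology:
        params["ontology"] = ontology
    return OLS_SEARCH_URL + "?" + urlencode(params)


def extract_terms(payload):
    """Pull (obo_id, label, ontology) tuples from an assumed Solr-style body."""
    docs = payload.get("response", {}).get("docs", [])
    return [(d.get("obo_id"), d.get("label"), d.get("ontology_name")) for d in docs]


def search_ols(query, ontology=None, rows=5):
    """Perform the HTTP request and return the extracted terms."""
    with urlopen(build_search_url(query, ontology, rows), timeout=30) as resp:
        return extract_terms(json.load(resp))
```

A call such as `search_ols("mast cell", ontology="cl")` would then return candidate terms for a curator or a pipeline to choose from, provided the service responds in the assumed shape.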
  15. Ontology Lookup Service (OLS): Query Expansion
      • Internally we use Solr and Neo4j.
      • Solr indexes concept descriptions and synonyms of concepts.
      • The Neo4j graph encodes subclass relations and arbitrary relations that exist between concepts.
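The idea behind query expansion can be sketched without either backend: a synonym table stands in for the text index and a parent-to-children map stands in for the graph. The labels and hierarchy below are illustrative only, and this is not how OLS itself is implemented.

```python
from collections import deque

# Toy stand-ins for the two indexes; contents are illustrative only.
SYNONYMS = {
    "leukocyte": {"white blood cell"},
    "mast cell": {"mastocyte"},
}
SUBCLASSES = {  # parent label -> direct child labels
    "leukocyte": {"mast cell", "lymphocyte"},
    "lymphocyte": {"t cell"},
}


def expand_query(term):
    """Return the term, all of its descendants, and the known synonyms of each."""
    expanded, queue = set(), deque([term])
    while queue:
        current = queue.popleft()
        if current in expanded:
            continue
        expanded.add(current)
        queue.extend(SUBCLASSES.get(current, ()))
    # Add synonyms only after the hierarchy walk, so they are never traversed.
    for label in list(expanded):
        expanded |= SYNONYMS.get(label, set())
    return expanded
```

Searching a dataset with `expand_query("leukocyte")` instead of the bare string would also retrieve records annotated with "mast cell", "t cell" or "white blood cell".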
  16. The problem with just an ontology lookup: knowing what you're looking for
  17. Data annotation services
      • Supporting data curation to map to the "right" terms
      • Based on what other databases are doing
      • Collect mappings from 10 databases at EBI and use them as a training set to predict how new, unseen data should map to ontologies
      http://www.ebi.ac.uk/spot/zooma
      GitHub: https://github.com/EBISPOT/zooma
      [Figure: "mast cell" → CL:0000097 + Context (where, when?)]
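The principle of predicting mappings from what curators have already done can be sketched as a vote over past annotations. This is a simplified illustration, not Zooma's actual algorithm: the records, the source names and the `EX:` term identifiers are hypothetical, and the confidence score is just the share of votes.

```python
from collections import Counter, defaultdict

# Hypothetical curated annotations: (source, property type, property value, term ID).
CURATED = [
    ("db_a", "cell type", "mast cell", "EX:0000001"),
    ("db_b", "cell type", "Mast Cell", "EX:0000001"),
    ("db_c", "cell type", "mast cell", "EX:0000002"),
    ("db_a", "organism part", "pancreas", "EX:0000003"),
]


def build_index(annotations):
    """Count how often each (property type, value) pair was mapped to each term."""
    index = defaultdict(Counter)
    for _source, prop_type, value, term in annotations:
        index[(prop_type.lower(), value.strip().lower())][term] += 1
    return index


def predict(index, prop_type, value):
    """Return (term, confidence) for the best-supported mapping, or None if unseen."""
    votes = index.get((prop_type.lower(), value.strip().lower()))
    if not votes:
        return None
    term, count = votes.most_common(1)[0]
    return term, count / sum(votes.values())


index = build_index(CURATED)
best = predict(index, "Cell Type", " mast cell ")
```

Keying on the property type as well as the value is a minimal form of the context mentioned on the slide: "pancreas" is only mapped when it appears as an organism part, not as a cell type.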
  18. Zooma: using previously curated data sources
      https://www.ebi.ac.uk/spot/zooma/
  19. Zooma: using only ontologies
      • Curators review the output and feed back into Zooma
      [Figure label: Reviewers]
      https://www.ebi.ac.uk/spot/zooma/
  20. Ontology Mapping Service (OxO)
      • We are increasingly seeing data that is described using ontologies
      • But we don't always agree on the ontologies to use
      [Figure labels: Datasource 1, Datasource 2, ?, EFO Mappings]
      http://www.ebi.ac.uk/spot/oxo
      GitHub: https://github.com/EBISPOT/OXO
  21. The Ontology X-ref Service
      • Database of x-refs from public ontologies and databases
      • Not a mapping prediction service!
      • Access to existing mappings using a distance controller
      • Default = asserted mappings
      https://www.ebi.ac.uk/spot/oxo/
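One way to read the distance controller is as a hop limit on a graph of asserted cross-references: distance 1 returns only directly asserted mappings, and larger distances follow chains of x-refs. The sketch below illustrates that reading with a breadth-first search. The term identifiers are hypothetical, and treating x-refs as undirected edges is an assumption made for the example, not a statement about OxO's internals.

```python
from collections import deque

# Hypothetical asserted cross-references between terms in different vocabularies.
XREFS = [
    ("A:1", "B:1"),
    ("B:1", "C:1"),
    ("C:1", "D:1"),
]


def build_graph(xrefs):
    """Build an adjacency map, treating each x-ref as an undirected edge."""
    graph = {}
    for left, right in xrefs:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def mappings_within(graph, start, max_distance=1):
    """Breadth-first search returning {term: hops} for terms within max_distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_distance:
            continue  # do not walk past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]
    return distances


graph = build_graph(XREFS)
direct = mappings_within(graph, "A:1")
two_hops = mappings_within(graph, "A:1", max_distance=2)
```

No mapping is predicted here: every result is reachable through x-refs that someone has already asserted, and the hop count tells the user how indirect the link is.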
  22. The Ontology X-ref Service https://www.ebi.ac.uk/spot/oxo/
  23. The Ontology X-ref Service https://www.ebi.ac.uk/spot/oxo/
  24. The Ontology X-ref Service https://www.ebi.ac.uk/spot/oxo/
  25. Publishing the data
      • EBI RDF platform contains 7 EBI databases connected by shared ontologies
      • SPARQL access to a subset of EBI data
      • But maintenance is hard as it's not the source of truth for the data
      http://rdf.ebi.ac.uk
      GitHub: https://github.com/EBISPOT/RDF-platform
  26. RDF Platform schema
  27. What we've learnt along the way
      • The data we see is getting better as the ontologies have matured and consensus has grown around which ontologies should be used.
      • Crowdsourcing through tools like Zooma and OxO has good economies of scale with respect to data curation.
      • Retrofitting the semantics in this way has limits: there is still a long tail of data that we miss.
      • OWL semantics are essential for building and maintaining our ontologies, but we've had to devise custom ways to utilize the ontologies when building applications and populating databases.
      • Developers want more conventional access to semantics (i.e. REST+JSON).
  28. References
      1. https://www.ebi.ac.uk
      2. https://www.ebi.ac.uk/about/digital-bookshelf/publications/EMBL-EBI_Scientific_Report-2018.pdf
      3. OLS: https://www.ebi.ac.uk/ols
      4. https://github.com/EBISPOT/OLS
      5. Zooma: https://www.ebi.ac.uk/spot/zooma
      6. https://github.com/EBISPOT/zooma
      7. OxO: https://www.ebi.ac.uk/spot/oxo
      8. https://github.com/EBISPOT/OXO
      9. RDF Platform: http://rdf.ebi.ac.uk
      10. https://github.com/EBISPOT/RDF-platform
      11. https://www.obofoundry.org
      12. A. Hogan, Linked Data & the Semantic Web Standards, in Linked Data Management (A. Harth, K. Hose, and R. Schenkel, eds.), Chapman and Hall/CRC, 2014, pp. 3–48.
      13. GWAS: https://www.ebi.ac.uk/gwas/
      14. Expression Atlas: https://www.ebi.ac.uk/gxa/home
      15. Open Targets: https://www.opentargets.org