An Introduction to Ontology as a
 Strategy for Data Integration

          Barry Smith




The problem

• legacy idiosyncrasies in handling data
complicated progressively by
• changes in available hardware and software
• turnover of personnel and of collaborations
• explosion of data
• need to get funding (inhibits reuse)




The result: balkanization
•   systems are poorly integrated
•   deliver redundant capabilities
•   foster error and waste
•   prevent comparison and aggregation
•   prevent secondary use of data
•   lower ROI on software


The proposed solution
• vocabulary and meanings change more
  slowly than hardware and software (and
  scientific theory*)
• semantic interoperability has high initial
  cost (governance, commitment) but
  considerable long-term value
*atom, electron, cell, bacteria, organism …




How to do it right?
• how to create an incremental, evolutionary
  process, where what is good survives, and
  what is bad fails
• create a scenario in which people will find it
  profitable to reuse ontologies, terminologies
  and coding systems which have been tried and
  tested


Uses of ‘ontology’ in PubMed abstracts




By far the most successful: GO (Gene Ontology)




GO provides a controlled vocabulary of terms
 for use in annotating (describing, tagging) data

• multi-species, multi-disciplinary, open source
• contributing to the cumulativity of scientific
  results obtained by distinct research
  communities
• compare use of kilograms, meters, seconds in
  formulating experimental results
• natural language and logical definitions for all
  terms to support consistent human application
  and computational exploitation
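
The benefit of a shared controlled vocabulary can be sketched in a few lines of Python. This is only an illustration, not GO tooling: the source names, gene product names and record layout are invented, and the term identifiers merely follow the `GO:nnnnnnn` pattern and should be treated as placeholders. Because both hypothetical databases tag their data with the same term identifiers, their results can be pooled without any mapping step.

```python
from collections import defaultdict

# Hypothetical annotation records from two independent databases.
# Source names, gene product names and the record layout are invented.
ANNOTATIONS = [
    {"source": "db_a", "gene_product": "protA", "go_term": "GO:0006915"},
    {"source": "db_b", "gene_product": "protX", "go_term": "GO:0006915"},
    {"source": "db_a", "gene_product": "protB", "go_term": "GO:0005634"},
]


def group_by_term(annotations):
    """Index gene products by the controlled-vocabulary term used to annotate them."""
    index = defaultdict(list)
    for record in annotations:
        index[record["go_term"]].append((record["source"], record["gene_product"]))
    # Sort entries so the result does not depend on input order.
    return {term: sorted(entries) for term, entries in index.items()}


if __name__ == "__main__":
    for term, entries in sorted(group_by_term(ANNOTATIONS).items()):
        print(term, entries)
```

Had each database used its own free-text labels instead, the grouping key would differ between sources and the aggregation would need a hand-built mapping.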
What is the key to GO’s success?
• multi-species, multi-disciplinary, open source
• clear rules for ontology development and
  maintenance
• over 11 million annotations relating gene
  products described in the UniProt, Ensembl and
  other databases to terms in the GO




Extending GO’s success to other fields
   Open Biological and Biomedical Ontologies
                 (OBO) Foundry

   •   Best practice principles
   •   Governance
   •   Review process
   •   Two-tier membership


                http://obofoundry.org
http://ontology.buffalo.edu/smith




RELATION TO TIME      CONTINUANT                                                          OCCURRENT
GRANULARITY           INDEPENDENT                          DEPENDENT

ORGAN AND ORGANISM    Organism (NCBI Taxonomy)             Organ Function (FMP, CPRO)     Biological Process (GO)
                      Anatomical Entity (FMA, CARO)        Phenotypic Quality (PaTO)

CELL AND CELLULAR     Cell (CL)                            Cellular Function (GO)
COMPONENT             Cellular Component (FMA, GO)

MOLECULE              Molecule (ChEBI, SO, RnaO, PrO)      Molecular Function (GO)        Molecular Process (GO)

(On the slide, Phenotypic Quality and Biological Process are placed between the organ and cell rows.)

OBO (Open Biomedical Ontology) Foundry proposal
(Gene Ontology in yellow)
RELATION TO TIME      CONTINUANT                                                          OCCURRENT
GRANULARITY           INDEPENDENT                          DEPENDENT

COMPLEX OF ORGANISMS  Family, Community, Deme, Population  Population Phenotype           Population Process

ORGAN AND ORGANISM    Organism (NCBI Taxonomy)             Organ Function (FMP, CPRO)     Biological Process (GO)
                      Anatomical Entity (FMA, CARO)        Phenotypic Quality (PaTO)

CELL AND CELLULAR     Cell (CL)                            Cellular Function (GO)
COMPONENT             Cellular Component (FMA, GO)

MOLECULE              Molecule (ChEBI, SO, RnaO, PrO)      Molecular Function (GO)        Molecular Process (GO)

(On the slide, Phenotypic Quality and Biological Process are placed between the organ and cell rows.)

Population-level ontologies
RELATION TO TIME      CONTINUANT                                                          OCCURRENT
GRANULARITY           INDEPENDENT                          DEPENDENT

ORGAN AND ORGANISM    Organism (NCBI Taxonomy)             Organ Function (FMP, CPRO)     Biological Process (GO)
                      Anatomical Entity (FMA, CARO)        Phenotypic Quality (PaTO)

CELL AND CELLULAR     Cell (CL)                            Cellular Function (GO)
COMPONENT             Cellular Component (FMA, GO)

MOLECULE              Molecule (ChEBI, SO, RnaO, PrO)      Molecular Function (GO)        Molecular Process (GO)

(On the slide, a band labeled "environments" is added between the INDEPENDENT and DEPENDENT columns; Phenotypic Quality and Biological Process are placed between the organ and cell rows.)

Environment Ontology
The Environment Ontology

         Barry Smith
http://ontology.buffalo.edu/smith
The Spatial-Structural Niche
A Hole Story
ENVO: The Environment Ontology (Presentation at the Genomics Standards Consortium Meeting), September 2011
Places are holes




DIGESTIVE
SYSTEM




the interior of your gut: an
environment for more
than 10^13 microorganisms
Positive and negative parts

negative part, or hole (not made of matter)

positive part (made of matter)

A site
intuitively: a spatial entity that can contain
a material entity




A spatial environment
 is a site that
 1. contains a medium (air, water)
 2. can contain an organism or a
 population of organisms

Some sites are supported and demarcated
 by some solid object

Stationary Sites


      1            2            3            4



1: your office when the door is closed; a closed
mouth
2: a rabbit hole; an open mouth
3: the surface of a leaf
4: the Klingon Empire
Mobile Sites



  1          2         3          4


1: a womb; a spaceship
2: a snail’s shell; a
3: the home range of a migrating herd of buffalo
4: the niche around a flying buzzard
At any given instant
a site is coincident with some spatial region

But because there are mobile sites

 not: site ≡ spatial region

For stationary sites we can associate
 latitude/longitude specifications
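
The distinction between a site and the spatial region it occupies can be sketched as follows. This is a toy model under stated assumptions, not an EnvO or BFO implementation: a "region" is reduced to a single (latitude, longitude) pair, time is a plain number, and the coordinates and the `Site`, `stationary` and `is_stationary` names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

# Assumption: a spatial region is simplified to one (latitude, longitude) pair.
Region = Tuple[float, float]


@dataclass(frozen=True)
class Site:
    """A site is not a region; it is coincident with some region at each instant."""
    name: str
    region_at: Callable[[float], Region]


def stationary(name: str, region: Region) -> Site:
    """Build a site that coincides with the same region at every instant."""
    return Site(name, lambda t: region)


def is_stationary(site: Site, times: Iterable[float]) -> bool:
    """True if the site occupies a single region at all sampled instants."""
    return len({site.region_at(t) for t in times}) == 1


# Invented coordinates, for illustration only.
office = stationary("office", (42.9, -78.8))
herd_range = Site("home range of a migrating herd", lambda t: (40.0 + t, -100.0))

if __name__ == "__main__":
    print(is_stationary(office, [0, 1, 2]))
    print(is_stationary(herd_range, [0, 1, 2]))
```

Only for sites that pass the `is_stationary` check does it make sense to store one fixed latitude/longitude value; a mobile site needs the time-indexed lookup.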
Double hole structure of a
  Spatial Environment

                Retainer
                (a boundary of some
                surrounding structure)

                Medium
                (filling the environing hole)


                Tenant
                (occupying the central hole)
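
As a rough data-structure sketch of the double hole structure, the three roles can be held in one record. The class and field names are assumptions chosen to mirror the slide's labels; the retainer is optional because only some sites are demarcated by a solid object, and the tenant is optional because an environment only needs to be able to contain an organism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpatialEnvironment:
    """Hypothetical record for the retainer / medium / tenant structure."""
    medium: str                     # fills the environing hole (e.g. air, water)
    retainer: Optional[str] = None  # boundary of some surrounding structure, if any
    tenant: Optional[str] = None    # occupant of the central hole, if any

    @property
    def is_occupied(self) -> bool:
        return self.tenant is not None

    def describe(self) -> str:
        parts = [f"medium: {self.medium}"]
        if self.retainer is not None:
            parts.append(f"retainer: {self.retainer}")
        if self.tenant is not None:
            parts.append(f"tenant: {self.tenant}")
        return "; ".join(parts)


if __name__ == "__main__":
    burrow = SpatialEnvironment(medium="air", retainer="walls of a rabbit hole", tenant="rabbit")
    print(burrow.describe())
```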




…
…
… (soil, cheese …)
top level:     Basic Formal Ontology (BFO)

mid-level:     Information Artifact Ontology (IAO)
               Ontology for Biomedical Investigations (OBI)
               Spatial Ontology (BSPO)

domain level:  Anatomy Ontology (FMA*, CARO)
               Cell Ontology (CL)
               Cellular Component Ontology (FMA*, GO*)
               Environment Ontology (EnvO)
               Infectious Disease Ontology (IDO*)
               Phenotypic Quality Ontology (PaTO)
               Biological Process Ontology (GO*)
               Subcellular Anatomy Ontology (SAO)
               Sequence Ontology (SO*)
               Molecular Function (GO*)
               Protein Ontology (PRO*)

Extension Strategy + Modular Organization
How to fit EnvO under BFO
• http://www.ifomis.org/bfo/
Populating downwards from BFO

Continuant
    Independent Continuant
    Dependent Continuant
Occurrent (Process, Event)
Basic Formal Ontology

Continuant
    Independent Continuant
        organism
    Dependent Continuant
Occurrent (Process, Event)
RELATION TO TIME      CONTINUANT                                                          OCCURRENT
GRANULARITY           INDEPENDENT                          DEPENDENT

ORGAN AND ORGANISM    Organism (NCBI Taxonomy)             Organ Function (FMP, CPRO)     Organism-Level Process (GO)
                      Anatomical Entity (FMA, CARO)        Phenotypic Quality (PaTO)

CELL AND CELLULAR     Cell (CL)                            Cellular Function (GO)         Cellular Process (GO)
COMPONENT             Cellular Component (FMA, GO)

MOLECULE              Molecule (ChEBI, SO, RnaO, PrO)      Molecular Function (GO)        Molecular Process (GO)

(On the slide, Phenotypic Quality is placed between the organ and cell rows.)

obofoundry.org
Hydraulic System
CIRCULATORY
SYSTEM
(Principal Organs)
Genus-species definitions

System =def. an independent continuant
 which is composed of interacting material
 entities forming an integrated whole

Ecosystem =def. a system which includes
 organisms and the site in which they live
 as components

Biome =def. An ecosystem which contains
  populations adapted to the environmental
  conditions conserved over its spatial
  extent.
Microbiome =def. A biome which contains
  the totality of microscopic organisms, their
  genetic elements, and interactions in a
  given environment.
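
The genus-species pattern used in these definitions (each term is defined as a kind of its parent, the genus, plus a distinguishing condition, the differentia) can be sketched as a small lookup table. The dictionary layout and the `genus_chain` and `full_definition` helper names are assumptions made for this illustration; the definition texts are taken from the slides.

```python
# term -> (genus, differentia), following the "A =def. a B which C" pattern.
DEFINITIONS = {
    "system": (
        "independent continuant",
        "is composed of interacting material entities forming an integrated whole",
    ),
    "ecosystem": (
        "system",
        "includes organisms and the site in which they live as components",
    ),
    "biome": (
        "ecosystem",
        "contains populations adapted to the environmental conditions "
        "conserved over its spatial extent",
    ),
    "microbiome": (
        "biome",
        "contains the totality of microscopic organisms, their genetic "
        "elements, and interactions in a given environment",
    ),
}


def genus_chain(term):
    """Return the successive genera of a term, nearest parent first."""
    chain = []
    while term in DEFINITIONS:
        term = DEFINITIONS[term][0]
        chain.append(term)
    return chain


def full_definition(term):
    """Collect every differentia a term inherits along its genus chain."""
    conditions = []
    while term in DEFINITIONS:
        genus, differentia = DEFINITIONS[term]
        conditions.append(differentia)
        term = genus
    return conditions


if __name__ == "__main__":
    print(genus_chain("microbiome"))
```

Because each definition names only its immediate genus, a microbiome inherits everything asserted of biomes, ecosystems and systems without restating it.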




Aligning EnvO to the Basic Formal Ontology
habitat

Habitat =def. An ecosystem which can
support the life of a given organism,
population, or community

Realized niche =def. An ecosystem which
is that part of a habitat which supports the
life of a given organism, population or
community
Aligning EnvO to the Basic Formal Ontology
Hutchinsonian niche
(niche as volume in a functionally
defined hyperspace)

=def. an n-dimensional hyper-volume
whose dimensions correspond to resource
gradients over which species are
distributed
– degree of slope, exposure to sunlight, soil
 fertility, foliage density, salinity...
G.E. Hutchinson (1957, 1965)
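
A minimal sketch of the hyper-volume idea, under a strong simplifying assumption: the niche is modeled as an axis-aligned box, with one tolerated (low, high) interval per resource gradient. Real niches need not be box-shaped, and the dimension names, units and numbers below are invented for illustration.

```python
from typing import Dict, Tuple

# Assumption: one tolerated (low, high) interval per resource gradient.
Niche = Dict[str, Tuple[float, float]]


def in_niche(niche: Niche, conditions: Dict[str, float]) -> bool:
    """True if the conditions fall inside the niche along every dimension."""
    return all(
        dim in conditions and low <= conditions[dim] <= high
        for dim, (low, high) in niche.items()
    )


def hypervolume(niche: Niche) -> float:
    """Volume of the n-dimensional box: the product of the interval lengths."""
    volume = 1.0
    for low, high in niche.values():
        volume *= high - low
    return volume


# Invented example values.
example_niche: Niche = {
    "slope_degrees": (0.0, 30.0),
    "salinity_ppt": (0.0, 5.0),
}

if __name__ == "__main__":
    print(in_niche(example_niche, {"slope_degrees": 12.0, "salinity_ppt": 1.5}))
    print(hypervolume(example_niche))
```

Adding a further gradient, such as exposure to sunlight, adds one more key to the dictionary and one more dimension to the volume.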
Aligning EnvO to the Basic Formal Ontology



                                 part_of
GAZ. An open source gazetteer
based on ontological principles


 http://gensc.org/gc_wiki/index.php/GAZ_Project




Applications of EnvO in biology





ENVO: The Environment Ontology (Presentation at the Genomics Standards Consortium Meeting), September 2011

  • 1. An Introduction to Ontology as a Strategy for Data Integration Barry Smith 1
  • 2. The problem • legacy idiosyncracies in handling data complicated progressively by • changes in available hardware and software • turnover of personnel and of collaborations • explosion of data • need to get funding (inhibits reuse) 2
  • 3. The result: balkanization • systems are poorly integrated • deliver redundant capabilities • foster error and waste • prevent comparison and aggregation • prevent secondary use of data • lowers ROI on software 3
  • 4. The proposed solution • vocabulary and meanings change more slowly than hardware and software (and scientific theory*) • semantic interoperability has high initial cost (governance, commitment) but considerable long-term value *atom, electron, cell, bacteria, organism … 4/24
  • 5. How to do it right? • how create an incremental, evolutionary process, where what is good survives, and what is bad fails • create a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested 6
  • 6. Uses of ‘ontology’ in PubMed abstracts
  • 7. By far the most successful: GO (Gene Ontology)
  • 8. GO provides a controlled vocabulary of terms for use in annotating (describing, tagging) data
      • multi-species, multi-disciplinary, open source
      • contributing to the cumulativity of scientific results obtained by distinct research communities
      • compare use of kilograms, meters, seconds in formulating experimental results
      • natural language and logical definitions for all terms to support consistent human application and computational exploitation
  • 9. What is the key to GO’s success?
      • multi-species, multi-disciplinary, open source
      • clear rules for ontology development and maintenance
      • over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO
  • 10. Extending GO’s success to other fields: Open Biological and Biomedical Ontologies (OBO) Foundry
      • Best practice principles
      • Governance
      • Review process
      • Two-tier membership
      • http://obofoundry.org
  • 12. OBO (Open Biomedical Ontology) Foundry proposal (Gene Ontology in yellow). Ontologies arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity:
      • Organ and organism. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Biological Process (GO).
      • Cell and cellular component. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO).
      • Molecule. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).
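  • 13. Population-level ontologies. Ontologies arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity:
      • Complex of organisms. Independent continuant: Family, Community, Deme, Population. Dependent continuant: Population Phenotype. Occurrent: Population Process.
      • Organ and organism. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Biological Process (GO).
      • Cell and cellular component. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO).
      • Molecule. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).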
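  • 14. Environment Ontology. Ontologies arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity, with the label "environments" overlaid at the organ and organism level:
      • Organ and organism. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Biological Process (GO).
      • Cell and cellular component. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO).
      • Molecule. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).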
  • 15. The Environment Ontology. Barry Smith, http://ontology.buffalo.edu/smith
  • 23. DIGESTIVE SYSTEM: the interior of your gut, an environment for more than 10^13 microorganisms
  • 24. Positive and negative parts
      • negative part or hole (not made of matter)
      • positive part (made of matter)
  • 25. A site, intuitively: a spatial entity that can contain a material entity
  • 26. A spatial environment is a site that
      1. contains a medium (air, water)
      2. can contain an organism or a population of organisms
    Some sites are supported and demarcated by some solid object
  • 27. Stationary Sites
      1: your office when the door is closed; a closed mouth
      2: a rabbit hole; an open mouth
      3: the surface of a leaf
      4: the Klingon Empire
  • 28. Mobile Sites
      1: a womb; a spaceship
      2: a snail’s shell; a
      3: the home range of a migrating herd of buffalo;
      4: the niche around a flying buzzard
  • 29. At any given instant a site is coincident with some spatial region. But because there are mobile sites, it is not the case that site ≡ spatial region. For stationary sites we can associate latitude/longitude specifications.
  • 30. Double hole structure of a Spatial Environment
      • Retainer (a boundary of some surrounding structure)
      • Medium (filling the environing hole)
      • Tenant (occupying the central hole)
  • 32. Extension Strategy + Modular Organization
      • top level: Basic Formal Ontology (BFO)
      • mid-level: Information Artifact Ontology (IAO), Ontology for Biomedical Investigations (OBI), Spatial Ontology (BSPO)
      • domain level: Anatomy Ontology (FMA*, CARO), Environment Ontology (EnvO), Infectious Disease Ontology (IDO*), Cellular Component Ontology (FMA*, GO*), Cell Ontology (CL), Phenotypic Quality Ontology (PaTO), Biological Process Ontology (GO*), Subcellular Anatomy Ontology (SAO), Sequence Ontology (SO*), Molecular Function (GO*), Protein Ontology (PRO*)
  • 33. How to fit EnvO under BFO • http://www.ifomis.org/bfo/
  • 34. Populating downwards from BFO
      • Continuant: Independent Continuant, Dependent Continuant
      • Occurrent (Process, Event)
  • 35. Basic Formal Ontology
      • Continuant: Independent Continuant (organism), Dependent Continuant
      • Occurrent (Process, Event)
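  • 36. Ontologies arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity (obofoundry.org):
      • Organ and organism. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Organism-Level Process (GO).
      • Cell and cellular component. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO). Occurrent: Cellular Process (GO).
      • Molecule. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).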
  • 40. Genus-species definitions
      • System =def. an independent continuant which is composed of interacting material entities forming an integrated whole
      • Ecosystem =def. a system which includes organisms and the site in which they live as components
  • 41. Further definitions
      • Biome =def. An ecosystem which contains populations adapted to the environmental conditions conserved over its spatial extent.
      • Microbiome =def. A biome which contains the totality of microscopic organisms, their genetic elements, and interactions in a given environment.
  • 42. Aligning EnvO to the Basic Formal Ontology
  • 43. Habitat
      • Habitat =def. An ecosystem which can support the life of a given organism, population, or community
      • Realized niche =def. An ecosystem which is that part of a habitat which supports the life of a given organism, population or community
  • 44. Aligning EnvO to the Basic Formal Ontology
  • 45. Hutchinsonian niche (niche as volume in a functionally defined hyperspace) =def. an n-dimensional hyper-volume whose dimensions correspond to resource gradients over which species are distributed, such as degree of slope, exposure to sunlight, soil fertility, foliage density, salinity...
  • 48. Aligning EnvO to the Basic Formal Ontology part_of
  • 51. GAZ: an open source gazetteer based on ontological principles. http://gensc.org/gc_wiki/index.php/GAZ_Project
  • 52. Applications of EnvO in biology
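
The genus-species definitions on slides 40, 41 and 43 each name a parent term (the genus) and add a distinguishing condition (the differentia). The following is a minimal Python sketch of that pattern, not the representation used by EnvO or BFO. The `Term` class, the `TERMS` table and the helper functions are hypothetical names introduced for illustration; only the definition texts come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Term:
    """One genus-species definition: a genus (parent term) plus a differentia."""
    name: str
    genus: Optional[str]  # None marks the root of this small illustration
    differentia: str


# Definition texts follow slides 40, 41 and 43; the encoding is an assumption.
TERMS: Dict[str, Term] = {
    t.name: t
    for t in [
        Term("independent continuant", None, ""),
        Term("system", "independent continuant",
             "is composed of interacting material entities forming an integrated whole"),
        Term("ecosystem", "system",
             "includes organisms and the site in which they live as components"),
        Term("biome", "ecosystem",
             "contains populations adapted to the environmental conditions "
             "conserved over its spatial extent"),
        Term("microbiome", "biome",
             "contains the totality of microscopic organisms, their genetic "
             "elements, and interactions in a given environment"),
        Term("habitat", "ecosystem",
             "can support the life of a given organism, population, or community"),
        Term("realized niche", "ecosystem",
             "is that part of a habitat which supports the life of a given "
             "organism, population or community"),
    ]
}


def ancestors(name: str) -> List[str]:
    """Return the chain of genera from the nearest parent up to the root."""
    chain: List[str] = []
    genus = TERMS[name].genus
    while genus is not None:
        chain.append(genus)
        genus = TERMS[genus].genus
    return chain


def is_a(child: str, parent: str) -> bool:
    """True if `child` is `parent` or inherits from it through its genera."""
    return child == parent or parent in ancestors(child)


def render(name: str) -> str:
    """Rebuild the '=def.' sentence from genus and differentia."""
    term = TERMS[name]
    if term.genus is None:
        return f"{term.name} (root term, left undefined in this sketch)"
    article = "an" if term.genus[0] in "aeiou" else "a"
    return f"{term.name} =def. {article} {term.genus} which {term.differentia}"


if __name__ == "__main__":
    print(render("microbiome"))
    print(ancestors("microbiome"))
    print(is_a("habitat", "system"))
```

Because every definition points to exactly one genus, the terms form a tree, and a query such as `is_a("habitat", "system")` follows the same chain a reader would follow when unpacking the definitions by hand.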

Editor's notes

  1. Ivan Herman
  2. http://www.w3.org/People/Ivan/CorePresentations/HighLevelIntro/
  3. http://www.pnas.org/misc/archive011904.html Roberto Casati and Achille Varzi, Holes and Other Superficialities, MIT Press, 1994
  4. Inner Gorge, Colorado River, Granite Rapids, looking west from Tonto Trail just west of Salt Creek http://www.kaibab.org/gc/images/img0072.jpg
  5. http://www.sacsplash.org/cimages/Solitarybee.jpg
  6. http://www.tandus.com/content/product-solutions/powerbond/healthy-environment
  7. http://static.freepik.com/free-photo/open-mouth_19-96488.jpg
  8. * = dedicated NIH funding
  9. http://www.sedris.org/stc/2004/tu/edcs/sld024.htm
  10. http://www.geobabble.org/~hnw/esri99/
  11. http://www.geobabble.org/~hnw/esri99/
  12. http://www.geobabble.org/~hnw/esri99/
  13. http://www.stankievech.net/projectsFrame.html