Reasoning over multiple open bio-ontologies to make machines and humans happy
Chris Mungall
cjmungall@lbl.gov
@chrismungall
http://bit.ly/mungall-us2ts-2019
Biological data management is hard.
There are many named things.
Drugs: 10k
Chemicals: 1-50m?
Species: ~9 million
Diseases and Phenotypes: 10-50k/species
Cells: 1000s+ types (per species)
Experiments
Raw data
Genes: 20k/species
Genetic variants: 3m (human)
There are many ways to categorize the things
Genes: 20k/species
Gene Ontology: 45k functional descriptor classes
Knowledge Graph Edges: ~7m
There are many ontologies
to categorize the things
762 ontologies
How do we manage this?
MODULARITY
● OBO
● Rector Normalization
● Design Patterns
● Relation Ontology
● ROBOT
REASONING
● EL (Elk, Whelk)
● DL (HermiT, FaCT++)
Open Biological Ontologies (OBO)
http://obofoundry.org
1. Well-integrated modular ontologies (SUBSET of BioPortal)
2. Provide technical and sociotechnological framework for cooperation
3. Provide tools, best practices and infrastructure for forging new ontologies
4. Allow us to curate all of the things
@obofoundry
Assembling
the Jigsaw
RECTOR
NORMALIZATION
Rector 2003: Modularisation of domain ontologies implemented in description logics and related formalisms including OWL.
http://www.cs.man.ac.uk/~rector/papers/rector-modularisation-kcap-2003-distrib.pdf
Minimal Constructs Needed for Rector Normalization
Some Values From
Intersection Of
EquivalentTo
SubClassOf
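A minimal sketch of why these four constructs are enough, assuming a toy structural check rather than a real reasoner such as Elk. Primitive classes sit in simple asserted trees; defined classes are written as "genus and (relation some filler)"; the extra subclass links fall out of the definitions. All class names, the `has_output` relation and the helper names are illustrative assumptions, not taken from GO, ChEBI or the slides.

```python
from dataclasses import dataclass

# Asserted is-a links: each primitive class has at most one parent (simple trees).
# All class names are illustrative placeholders, not real GO or ChEBI identifiers.
ASSERTED_PARENT = {
    "biosynthetic process": "metabolic process",
    "cysteine": "amino acid",
    "amino acid": "chemical entity",
}


def ancestors(cls):
    """Return cls followed by every asserted superclass, nearest first."""
    chain = [cls]
    while cls in ASSERTED_PARENT:
        cls = ASSERTED_PARENT[cls]
        chain.append(cls)
    return chain


@dataclass(frozen=True)
class Definition:
    """Defined class EquivalentTo: genus and (relation some filler)."""
    genus: str
    relation: str
    filler: str


DEFINITIONS = {
    "amino acid biosynthetic process": Definition(
        "biosynthetic process", "has_output", "amino acid"),
    "cysteine biosynthetic process": Definition(
        "biosynthetic process", "has_output", "cysteine"),
}


def inferred_subclass(sub, sup):
    """Structural check: does the definition of sub entail SubClassOf sup?"""
    a, b = DEFINITIONS[sub], DEFINITIONS[sup]
    return (
        b.genus in ancestors(a.genus)
        and a.relation == b.relation
        and b.filler in ancestors(a.filler)
    )


print(inferred_subclass("cysteine biosynthetic process",
                        "amino acid biosynthetic process"))
```

Nobody asserted that the cysteine process is a kind of amino acid process; the link is derived from the chemical hierarchy. This check is sound for the toy case but far from complete, so it is only an illustration of the idea, not a substitute for an EL reasoner.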
OBO Relation Ontology: glue
within and between ontologies
http://obofoundry.org/ontology/ro
Spatial Reasoning OWL design
patterns
nucleus
> spatially_disjoint_with.yaml
axiom:
  Text: (part_of some %s) DisjointWith (part_of some %s)
  Vars:
    - component1
    - component2
Ontology:
  (part_of some nucleus) DisjointWith (part_of some cytosol)
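A rough sketch of how a template like the one above might be filled in, assuming plain string substitution over pairs of components. The function name `instantiate` is hypothetical, and real design-pattern tooling does considerably more (ID resolution, OWL serialization); this only shows the template-plus-variables idea.

```python
from itertools import combinations

# Axiom text with two %s slots, mirroring the pattern on the slide.
PATTERN_TEXT = "(part_of some %s) DisjointWith (part_of some %s)"


def instantiate(components):
    """Fill the pattern once for every unordered pair of components."""
    return [PATTERN_TEXT % (a, b) for a, b in combinations(components, 2)]


for axiom in instantiate(["nucleus", "cytosol"]):
    print(axiom)
```

Keeping the axiom shape in one template means a curator only supplies the variable values, and every generated axiom has the same, reviewable structure.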
http://robot.obolibrary.org
Managing ontology release workflows with ODK and ROBOT
● Configure ontology repo with YAML
● Reasoning + QC checks via Travis-CI
https://github.com/INCATools/ontology-development-kit
Reasoning detects annotation
errors
Genes are often assigned
functions automatically based on
homology. This is error-prone.
Previous errors include:
• Genes in slime mold
responsible for dorsal fin
development
• Genes in chicken responsible
for lactation
Dorsal Fin SubClassOf Fin
Fin SubClassOf part-of some Vertebrate
(part-of some Animal) DisjointWith (part-of some Slime Mold)
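The three axioms above can be sketched as a small consistency check. This is a toy under stated assumptions: it adds "Vertebrate SubClassOf Animal" and a "Zebrafish" taxon, neither of which is on the slide, and it treats the annotated taxon as the whole that the structure must be part of. The function names are hypothetical.

```python
# Subclass links. "Vertebrate -> Animal" and "Zebrafish" are assumptions
# added for illustration; only the dorsal fin link comes from the slide.
IS_A = {
    "dorsal fin": "fin",
    "Vertebrate": "Animal",
    "Zebrafish": "Vertebrate",
}
# X SubClassOf part-of some Y
PART_OF_SOME = {"fin": "Vertebrate"}
# Pairs whose parts can never overlap
DISJOINT = {frozenset({"Animal", "Slime Mold"})}


def superclasses(cls):
    """Return cls plus everything it is asserted to be a subclass of."""
    chain = [cls]
    while cls in IS_A:
        cls = IS_A[cls]
        chain.append(cls)
    return chain


def required_wholes(structure):
    """Every class the structure must be part of, via inherited part-of axioms."""
    wholes = set()
    for cls in superclasses(structure):
        if cls in PART_OF_SOME:
            wholes.update(superclasses(PART_OF_SOME[cls]))
    return wholes


def is_inconsistent(structure, taxon):
    """True if annotating a gene from this taxon to this structure hits a disjointness."""
    return any(
        frozenset({whole, t}) in DISJOINT
        for whole in required_wholes(structure)
        for t in superclasses(taxon)
    )


print(is_inconsistent("dorsal fin", "Slime Mold"))
print(is_inconsistent("dorsal fin", "Zebrafish"))
```

The slime mold annotation is flagged because dorsal fin inherits "part-of some Vertebrate" from fin, Vertebrate is an Animal, and parts of animals are declared disjoint from parts of slime molds.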
Exomiser + OwlSim
OWL reasoning used
in clinical applications
to diagnose patients
Challenges
Machine Reasoning: SOLVED
Human Reasoning about Machine Reasoning: STILL VERY HARD
Pop quiz: what OWL profile is this?
'DNA extent' EquivalentTo
'sequence molecular entity extent' and
('has part' only
('deoxyribonucleotide residue' or
(('chemical entity' or
'biological sequence entity') and
(not ('biological sequence unit')))))
Combining transitive properties and universal
restrictions can take you strange places
'DNA extent' EquivalentTo
'sequence molecular entity extent' and
('has part' only
('deoxyribonucleotide residue' or
(('chemical entity' or
'biological sequence entity') and
(not ('biological sequence unit')))))
Avoid going mad with complex nested boolean expressions
KEEP IT SIMPLE, SAPIENS
Disjoint Classes
Some Values From
Intersection Of
Use with caution:
1. Only
2. Not
3. Cardinality
4. Levels of nesting requiring parentheses
Generally not needed for bio-ontology T-Box reasoning:
1. Data Properties
2. Keys
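One way to apply the "use with caution" list is a small lint over Manchester-style class expressions. The sketch below is an assumption-laden toy: it works on plain strings, strips quoted labels with a regular expression, and flags the keywords `only`, `not`, `min`, `max` and `exactly` plus the parenthesis depth. The function name `review` and the keyword set are my own choices, not part of ROBOT or any OWL library.

```python
import re

# Keywords for the constructs the slide says to use with caution.
CAUTION = {"only", "not", "min", "max", "exactly"}


def review(expression):
    """Return (sorted flagged keywords, maximum parenthesis nesting depth)."""
    # Drop quoted labels so words inside class names are not mistaken for keywords.
    bare = re.sub(r"'[^']*'", " ", expression)
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", bare)}
    flagged = sorted(CAUTION & words)

    depth = max_depth = 0
    for ch in bare:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return flagged, max_depth


dna_extent = (
    "'DNA extent' EquivalentTo 'sequence molecular entity extent' and "
    "('has part' only ('deoxyribonucleotide residue' or "
    "(('chemical entity' or 'biological sequence entity') and "
    "(not ('biological sequence unit')))))"
)
print(review(dna_extent))
```

A check like this could run in continuous integration to surface expressions that deserve a second look; it says nothing about whether they are logically wrong.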
BIG BUCKET OF
MIXED AXIOMS
I've giv'n her all
she's got captain, an'
I canna give her no
more!
1
WEE BUCKET
OF HARD
AXIOMS
BIG BUCKET OF
EASY AXIOMS
Let me just shoogle these
axioms aroond a wee bit
2
HARD: Erythrocyte SubClassOf has_part exactly 0 nucleus
⇒
HARD: Anucleate EquivalentTo has_part exactly 0 nucleus
EASY: Erythrocyte SubClassOf Anucleate
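The rewrite above can be sketched as a bucket-splitting step. This is only an illustration: axioms are plain tuples, "hard" is detected by keyword, and the human-chosen class names come from a lookup table. The names `shoogle`, `is_hard`, the Platelet class and the `part_of some blood` axiom are assumptions added for the example.

```python
HARD_KEYWORDS = ("exactly", "only", "not", "min", "max")


def is_hard(expression):
    """Treat an expression as hard if it uses cardinality, universals or negation."""
    return any(token in HARD_KEYWORDS for token in expression.split())


def shoogle(axioms, names):
    """Split (subclass, expression) axioms into a hard bucket and an easy bucket.

    Each hard expression is defined once as a named class (hard bucket);
    every class that used it gets a simple SubClassOf link (easy bucket).
    """
    hard, easy = [], []
    for sub, expression in axioms:
        if is_hard(expression):
            name = names[expression]  # the name is chosen by a curator
            definition = (name, "EquivalentTo", expression)
            if definition not in hard:
                hard.append(definition)
            easy.append((sub, "SubClassOf", name))
        else:
            easy.append((sub, "SubClassOf", expression))
    return hard, easy


mixed = [
    ("Erythrocyte", "has_part exactly 0 nucleus"),
    ("Platelet", "has_part exactly 0 nucleus"),  # hypothetical extra axiom
    ("Erythrocyte", "part_of some blood"),       # hypothetical extra axiom
]
hard, easy = shoogle(mixed, {"has_part exactly 0 nucleus": "Anucleate"})
print(hard)
print(easy)
```

The hard bucket stays small because each difficult expression appears once, however many classes rely on it.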
Och aye that’s
just aboot right
3
Now I’ll hand these over to ma
pal the Elk, he’s pure dead fast
4
I’m traveling at the speed of light that’s
why they call me Mr Fahrenheit
5
THE
END
What happens when the pieces
don’t fit together?
Making the pieces fit together: GO
and CHEBI
GO CHEBI
• Some relationships didn’t make
sense
• E.g. nucleotide isa
carbohydrate
• Acids ⬄ conjugate
bases
Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and
chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513.
GO CHEBI
• Fixed many is-as
• E.g. nucleotide isa
carbohydrate
• Acids ⬄ conjugate
bases
+ OWL reasoning
Harold Drabkin
David Hill
Jane Lomax
Tanya Berardini
Janna Hastings
GO CHEBI
+ Design
Patterns
https://douroucouli.wordpress.com
Conclusions
● Maintaining > ~100 classes benefits from reasoning
● Maintaining > ~10000 classes: you will be in maintenance hell without
reasoning
● Reasoning is dead easy for computers
● Reasoning can be hard for humans
○ Keep it simple
○ Use Design Patterns / Templates
○ Use software engineering paradigms
○ Avoid unnecessary complexity
● Sociotechnological aspects of reasoning are hardest
○ “I don’t like the entailments I get when I use your ontology”
http://bit.ly/mungall-us2ts-2019

More Related Content

Similar to US2TS: Reasoning over multiple open bio-ontologies to make machines and humans happy

Essential Biology 04.1 Chromosomes, Genes, Alleles, Mutations
Essential Biology 04.1   Chromosomes, Genes, Alleles, MutationsEssential Biology 04.1   Chromosomes, Genes, Alleles, Mutations
Essential Biology 04.1 Chromosomes, Genes, Alleles, MutationsStephen Taylor
 
Genetics Year 11
Genetics Year 11Genetics Year 11
Genetics Year 11ngibellini
 
Drug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasonersDrug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasonersSamuel Croset
 
Semantics of and for the diversity of life:
 Opportunities and perils of tryi...
Semantics of and for the diversity of life:
 Opportunities and perils of tryi...Semantics of and for the diversity of life:
 Opportunities and perils of tryi...
Semantics of and for the diversity of life:
 Opportunities and perils of tryi...Hilmar Lapp
 
Biology Review
Biology ReviewBiology Review
Biology ReviewErin Mucci
 
Biology review
Biology reviewBiology review
Biology reviewErin Mucci
 
In-class introduction to basic Punnett square set-up and problem s.docx
In-class introduction to basic Punnett square set-up and problem s.docxIn-class introduction to basic Punnett square set-up and problem s.docx
In-class introduction to basic Punnett square set-up and problem s.docxbradburgess22840
 
100 Science Genetics
100 Science Genetics100 Science Genetics
100 Science Geneticsngibellini
 
Plant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In SequencesPlant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In SequencesLeighton Pritchard
 
Origins oflifestations day1and2.ppt
Origins oflifestations day1and2.pptOrigins oflifestations day1and2.ppt
Origins oflifestations day1and2.pptjsanchez17
 
Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017Chris Mungall
 
Building an efficient infrastructure, standards and data flow for metabolomics
Building an efficient infrastructure, standards and data flow for metabolomicsBuilding an efficient infrastructure, standards and data flow for metabolomics
Building an efficient infrastructure, standards and data flow for metabolomicsChristoph Steinbeck
 
Why Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About ItWhy Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About ItAnita de Waard
 
Why life is so complicated
Why life is so complicatedWhy life is so complicated
Why life is so complicatedAnita de Waard
 
Basic Formal Ontology (BFO) and Disease
 Basic Formal Ontology (BFO) and Disease Basic Formal Ontology (BFO) and Disease
Basic Formal Ontology (BFO) and DiseaseBarry Smith
 

Similar to US2TS: Reasoning over multiple open bio-ontologies to make machines and humans happy (20)

Essential Biology 04.1 Chromosomes, Genes, Alleles, Mutations
Essential Biology 04.1   Chromosomes, Genes, Alleles, MutationsEssential Biology 04.1   Chromosomes, Genes, Alleles, Mutations
Essential Biology 04.1 Chromosomes, Genes, Alleles, Mutations
 
Genetics Year 11
Genetics Year 11Genetics Year 11
Genetics Year 11
 
Drug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasonersDrug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasoners
 
Semantics of and for the diversity of life:
 Opportunities and perils of tryi...
Semantics of and for the diversity of life:
 Opportunities and perils of tryi...Semantics of and for the diversity of life:
 Opportunities and perils of tryi...
Semantics of and for the diversity of life:
 Opportunities and perils of tryi...
 
Bioreview2011 s
Bioreview2011 sBioreview2011 s
Bioreview2011 s
 
Biology Review
Biology ReviewBiology Review
Biology Review
 
Biology homework help
Biology homework helpBiology homework help
Biology homework help
 
Biology review
Biology reviewBiology review
Biology review
 
In-class introduction to basic Punnett square set-up and problem s.docx
In-class introduction to basic Punnett square set-up and problem s.docxIn-class introduction to basic Punnett square set-up and problem s.docx
In-class introduction to basic Punnett square set-up and problem s.docx
 
100 Science Genetics
100 Science Genetics100 Science Genetics
100 Science Genetics
 
TAKS Objective 2
TAKS Objective 2TAKS Objective 2
TAKS Objective 2
 
CE-Symm jLBR talk
CE-Symm jLBR talkCE-Symm jLBR talk
CE-Symm jLBR talk
 
Test14slides
Test14slidesTest14slides
Test14slides
 
Plant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In SequencesPlant Pathogen Genome Data: My Life In Sequences
Plant Pathogen Genome Data: My Life In Sequences
 
Origins oflifestations day1and2.ppt
Origins oflifestations day1and2.pptOrigins oflifestations day1and2.ppt
Origins oflifestations day1and2.ppt
 
Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017Mungall keynote-biocurator-2017
Mungall keynote-biocurator-2017
 
Building an efficient infrastructure, standards and data flow for metabolomics
Building an efficient infrastructure, standards and data flow for metabolomicsBuilding an efficient infrastructure, standards and data flow for metabolomics
Building an efficient infrastructure, standards and data flow for metabolomics
 
Why Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About ItWhy Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About It
 
Why life is so complicated
Why life is so complicatedWhy life is so complicated
Why life is so complicated
 
Basic Formal Ontology (BFO) and Disease
 Basic Formal Ontology (BFO) and Disease Basic Formal Ontology (BFO) and Disease
Basic Formal Ontology (BFO) and Disease
 

More from Chris Mungall

MADICES Mungall 2022.pptx
MADICES Mungall 2022.pptxMADICES Mungall 2022.pptx
MADICES Mungall 2022.pptxChris Mungall
 
Scaling up semantics; lessons learned across the life sciences
Scaling up semantics; lessons learned across the life sciencesScaling up semantics; lessons learned across the life sciences
Scaling up semantics; lessons learned across the life sciencesChris Mungall
 
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOLinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOChris Mungall
 
Ontology Access Kit_ Workshop Intro Slides.pptx
Ontology Access Kit_ Workshop Intro Slides.pptxOntology Access Kit_ Workshop Intro Slides.pptx
Ontology Access Kit_ Workshop Intro Slides.pptxChris Mungall
 
LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)Chris Mungall
 
LinkML presentation to Yosemite Group
LinkML presentation to Yosemite GroupLinkML presentation to Yosemite Group
LinkML presentation to Yosemite GroupChris Mungall
 
Experiences in the biosciences with the open biological ontologies foundry an...
Experiences in the biosciences with the open biological ontologies foundry an...Experiences in the biosciences with the open biological ontologies foundry an...
Experiences in the biosciences with the open biological ontologies foundry an...Chris Mungall
 
All together now: piecing together the knowledge graph of life
All together now: piecing together the knowledge graph of lifeAll together now: piecing together the knowledge graph of life
All together now: piecing together the knowledge graph of lifeChris Mungall
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeChris Mungall
 
Representation of kidney structures in Uberon
Representation of kidney structures in UberonRepresentation of kidney structures in Uberon
Representation of kidney structures in UberonChris Mungall
 
SparqlProg (BioHackathon 2019)
SparqlProg (BioHackathon 2019)SparqlProg (BioHackathon 2019)
SparqlProg (BioHackathon 2019)Chris Mungall
 
Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Chris Mungall
 
Uberon: opening up to community contributions
Uberon: opening up to community contributionsUberon: opening up to community contributions
Uberon: opening up to community contributionsChris Mungall
 
Modeling exposure events and adverse outcome pathways using ontologies
Modeling exposure events and adverse outcome pathways using ontologiesModeling exposure events and adverse outcome pathways using ontologies
Modeling exposure events and adverse outcome pathways using ontologiesChris Mungall
 
Causal reasoning using the Relation Ontology
Causal reasoning using the Relation OntologyCausal reasoning using the Relation Ontology
Causal reasoning using the Relation OntologyChris Mungall
 
US2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyUS2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyChris Mungall
 
Introduction to the BioLink datamodel
Introduction to the BioLink datamodelIntroduction to the BioLink datamodel
Introduction to the BioLink datamodelChris Mungall
 
Computing on Phenotypes AMP 2015
Computing on Phenotypes AMP 2015Computing on Phenotypes AMP 2015
Computing on Phenotypes AMP 2015Chris Mungall
 

More from Chris Mungall (20)

MADICES Mungall 2022.pptx
MADICES Mungall 2022.pptxMADICES Mungall 2022.pptx
MADICES Mungall 2022.pptx
 
Scaling up semantics; lessons learned across the life sciences
Scaling up semantics; lessons learned across the life sciencesScaling up semantics; lessons learned across the life sciences
Scaling up semantics; lessons learned across the life sciences
 
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOLinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
 
Ontology Access Kit_ Workshop Intro Slides.pptx
Ontology Access Kit_ Workshop Intro Slides.pptxOntology Access Kit_ Workshop Intro Slides.pptx
Ontology Access Kit_ Workshop Intro Slides.pptx
 
LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)
 
LinkML presentation to Yosemite Group
LinkML presentation to Yosemite GroupLinkML presentation to Yosemite Group
LinkML presentation to Yosemite Group
 
Experiences in the biosciences with the open biological ontologies foundry an...
Experiences in the biosciences with the open biological ontologies foundry an...Experiences in the biosciences with the open biological ontologies foundry an...
Experiences in the biosciences with the open biological ontologies foundry an...
 
All together now: piecing together the knowledge graph of life
All together now: piecing together the knowledge graph of lifeAll together now: piecing together the knowledge graph of life
All together now: piecing together the knowledge graph of life
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of Life
 
Representation of kidney structures in Uberon
Representation of kidney structures in UberonRepresentation of kidney structures in Uberon
Representation of kidney structures in Uberon
 
SparqlProg (BioHackathon 2019)
SparqlProg (BioHackathon 2019)SparqlProg (BioHackathon 2019)
SparqlProg (BioHackathon 2019)
 
Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019
 
Uberon: opening up to community contributions
Uberon: opening up to community contributionsUberon: opening up to community contributions
Uberon: opening up to community contributions
 
Modeling exposure events and adverse outcome pathways using ontologies
Modeling exposure events and adverse outcome pathways using ontologiesModeling exposure events and adverse outcome pathways using ontologies
Modeling exposure events and adverse outcome pathways using ontologies
 
Causal reasoning using the Relation Ontology
Causal reasoning using the Relation OntologyCausal reasoning using the Relation Ontology
Causal reasoning using the Relation Ontology
 
US2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyUS2TS presentation on Gene Ontology
US2TS presentation on Gene Ontology
 
Introduction to the BioLink datamodel
Introduction to the BioLink datamodelIntroduction to the BioLink datamodel
Introduction to the BioLink datamodel
 
Computing on Phenotypes AMP 2015
Computing on Phenotypes AMP 2015Computing on Phenotypes AMP 2015
Computing on Phenotypes AMP 2015
 
ENVO GSC 2015
ENVO GSC 2015ENVO GSC 2015
ENVO GSC 2015
 
Kboom phenoday-2016
Kboom phenoday-2016Kboom phenoday-2016
Kboom phenoday-2016
 

Recently uploaded

Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 

Recently uploaded (20)

Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 

US2TS: Reasoning over multiple open bio-ontologies to make machines and humans happy

  • 1. Reasoning over multiple open bio- ontologies to make machines and humans happy Chris Mungall cjmungall@lbl.gov @chrismungall http://bit.ly/mungall-us2ts-2019
  • 2. Biological data management is hard. There are many named things. Drugs 10k Chemicals 1-50m? Species ~9 million Diseases and Phenotypes 10-50k/species Cells 1000s+ types per species) Experiments Raw data Genes 20k/species Genetic variants 3m (human)
  • 3. There are many ways to categorize the things Genes 20k/species Gene Ontology 45k functional descriptor classes Knowledge Graph Edges ~7m
  • 4. There are many ontologies to categorize the things 762 ontologies
  • 5. How do we manage this? MODULARITY REASONING
  • 6. How do we manage this? MODULARITY REASONING EL (Elk, Whelk) DL (Hermit, FACT++) ● OBO ● Rector Normalization ● Design Patterns ● Relation Ontology ● ROBOT
  • 7. Open Biological Ontologies (OBO) http://obofoundry.org 1. Well-integrated Modular ontologies (SUBSET of bioportal) 2. Provide technical and sociotechnological framework for cooperation 4. Allow us to curate all of the things 3. Provide tools, best practices and infrastructure for forging new ontologies @obofoundry
  • 9. RECTOR NORMALIZATION Rector 2003 Modularisation of domain ontologies implemented in description logics and related formalisms including owl. + = http://www.cs.man.ac.uk/~rector/papers/rector-modularisation-kcap-2003-distrib.pdf
  • 10. Minimal Constructs Needed for Reactor Normalization Some Values From Intersection Of EquivalentTo SubClassOf
  • 11. OBO Relation Ontology: glue within and between ontologies http://obofoundry.org/ontology/ro
  • 12. Spatial Reasoning OWL design patterns nucleus > spatially_disjoint_with.yaml axiom: Text: (part-of some %s) DisjointWith (part_of some %s) Vars: - component1 - component2 Ontology: (part-of some nucleus) DisjointWith (part-of some cytosol)
  • 13. http://robot.obolibrary.org Managing ontology release Workflows with ODK and ROBOT ● Configure ontology repo with yaml ● Reasoning + QC checks via Travis-CI https://github.com/INCATools/ontology-development-kit
  • 14. Reasoning detects annotation errors Genes are often assigned functions automatically based on homology. This is error-prone. Previous errors include: • Genes in slime mold responsible for dorsal fin development • Genes in chicken responsible for lactation
  • 15. Reasoning detects annotation errors Genes are often assigned functions automatically based on homology. This is error-prone. Previous errors include: • Genes in chicken responsible for lactation • Genes in slime mold responsible for dorsal fin development Dorsal Fin SubClassOf Fin Fin SubClassOf part-of some Vertebrate (Part-of some Animal) DisjointWith (part-of some Slime Mold)
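The slime mold example on slide 15 can be sketched as a tiny hand-rolled check. This is not how an OWL reasoner works internally; it only follows the three axioms on the slide, plus one added assumption (Vertebrate SubClassOf Animal) that the slide leaves implicit. It also assumes that annotating a gene to a structure implies the structure is part of that gene's organism.

```python
# Named-class hierarchy. "Vertebrate" -> "Animal" is an assumption, not on the slide.
SUBCLASS_OF = {"Dorsal Fin": "Fin", "Vertebrate": "Animal"}
# Fin SubClassOf part-of some Vertebrate
PART_OF_SOME = {"Fin": "Vertebrate"}
# (part-of some Animal) DisjointWith (part-of some Slime Mold)
DISJOINT_HOSTS = {frozenset({"Animal", "Slime Mold"})}


def ancestors(cls):
    """Return cls followed by all of its named superclasses."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def required_hosts(structure):
    """Collect every class the structure must be part of, per the axioms."""
    hosts = set()
    for cls in ancestors(structure):
        if cls in PART_OF_SOME:
            hosts.update(ancestors(PART_OF_SOME[cls]))
    return hosts


def is_inconsistent(structure, organism):
    """True if the structure cannot be part of the given organism."""
    organism_classes = ancestors(organism)
    return any(
        frozenset({host, org}) in DISJOINT_HOSTS
        for host in required_hosts(structure)
        for org in organism_classes
    )


print(is_inconsistent("Dorsal Fin", "Slime Mold"))
print(is_inconsistent("Dorsal Fin", "Vertebrate"))
```

A real pipeline would load the ontologies and annotations into an OWL reasoner such as Elk and look for unsatisfiable classes instead of walking dictionaries.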
  • 16. Exomiser + OwlSim OWL reasoning used in clinical applications to diagnose patients
  • 17. Challenges SOLVED STILL VERY HARD Machine Reasoning Human Reasoning about Machine Reasoning
  • 18. Pop quiz: what OWL profile is this? 'DNA extent' EquivalentTo 'sequence molecular entity extent' and ('has part' only ('deoxyribonucleotide residue' or (('chemical entity' or 'biological sequence entity') and (not ('biological sequence unit')))))
  • 19. Combining transitive properties and universal restrictions can take you strange places 'DNA extent' EquivalentTo 'sequence molecular entity extent' and ('has part' only ('deoxyribonucleotide residue' or (('chemical entity' or 'biological sequence entity') and (not ('biological sequence unit')) ) ))
  • 20. Avoid going mad with complex nested boolean expressions. KEEP IT SIMPLE, SAPIENS: Disjoint Classes, Some Values From, Intersection Of. Use with caution: 1. Only 2. Not 3. Cardinality 4. Levels of nesting requiring parentheses. Generally not needed for bio-ontology T-Box reasoning: 1. Data Properties 2. Keys
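The "use with caution" list on slide 20 can be turned into a rough lint for Manchester-syntax class expressions. The sketch below is a hypothetical helper, not part of ROBOT or any OBO tool. It assumes labels are single-quoted with no embedded apostrophes, and it maps "Cardinality" to the Manchester keywords `min`, `max` and `exactly`.

```python
import re

# Keywords corresponding to the slide's "use with caution" items.
CAUTION_KEYWORDS = {"only", "not", "min", "max", "exactly"}


def complexity_report(expression: str) -> dict:
    """Report cautioned keywords and the deepest parenthesis nesting."""
    # Replace quoted labels so words inside names are not read as keywords.
    stripped = re.sub(r"'[^']*'", "X", expression)
    tokens = re.findall(r"[A-Za-z]+|[()]", stripped)
    depth = max_depth = 0
    cautions = set()
    for tok in tokens:
        if tok == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif tok == ")":
            depth -= 1
        elif tok.lower() in CAUTION_KEYWORDS:
            cautions.add(tok.lower())
    return {"cautions": sorted(cautions), "max_nesting": max_depth}


# The 'DNA extent' definition from the pop-quiz slide.
DNA_EXTENT = (
    "'sequence molecular entity extent' and ('has part' only "
    "('deoxyribonucleotide residue' or (('chemical entity' or "
    "'biological sequence entity') and (not ('biological sequence unit')))))"
)

report = complexity_report(DNA_EXTENT)
print(report)
```

A check like this only flags expressions for human review; deciding which OWL profile an ontology falls into is a job for a proper profile checker.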
  • 21. BIG BUCKET OF MIXED AXIOMS I've giv'n her all she's got captain, an' I canna give her no more! 1
  • 22. BIG BUCKET OF MIXED AXIOMS I've giv'n her all she's got captain, an' I canna give her no more! WEE BUCKET OF HARD AXIOMS BIG BUCKET OF EASY AXIOMS Let me just shoogle these axioms aroond a wee bit 1 2 HARD: Erythrocyte SubClassOf has_part exactly 0 nucleus ⇒ HARD: Anucleate EquivalentTo has_part exactly 0 nucleus EASY: Erythrocyte SubClassOf Anucleate
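The "shoogle" step on slide 22 can be read as: give each hard class expression a name, keep the definition of that name in the wee bucket, and point the original class at the name in the big bucket. A minimal sketch under that reading follows; the `Axiom` type, the keyword test for "hard", and the second example axiom (`part_of some blood`) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axiom:
    subject: str
    kind: str        # "SubClassOf" or "EquivalentTo"
    expression: str  # a class name or a Manchester-style expression


# Assumed markers of constructs that are outside the easy (EL-style) bucket.
HARD_KEYWORDS = (" exactly ", " only ", " not ", " min ", " max ")


def is_hard(expression: str) -> bool:
    padded = f" {expression} "
    return any(keyword in padded for keyword in HARD_KEYWORDS)


def split_buckets(axioms, names):
    """Split axioms into (easy, hard), naming hard expressions where possible.

    `names` maps a hard expression to the class name that should stand for it.
    """
    easy, hard, defined = [], [], set()
    for ax in axioms:
        if ax.kind == "SubClassOf" and is_hard(ax.expression) and ax.expression in names:
            name = names[ax.expression]
            if name not in defined:
                # HARD: name EquivalentTo <hard expression>
                hard.append(Axiom(name, "EquivalentTo", ax.expression))
                defined.add(name)
            # EASY: subject SubClassOf name
            easy.append(Axiom(ax.subject, "SubClassOf", name))
        elif is_hard(ax.expression):
            hard.append(ax)
        else:
            easy.append(ax)
    return easy, hard


mixed = [
    Axiom("Erythrocyte", "SubClassOf", "has_part exactly 0 nucleus"),
    Axiom("Erythrocyte", "SubClassOf", "part_of some blood"),  # hypothetical easy axiom
]
easy, hard = split_buckets(mixed, {"has_part exactly 0 nucleus": "Anucleate"})
print(easy)
print(hard)
```

The big easy bucket can then go to a fast EL reasoner, while the wee hard bucket is small enough to handle separately.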
  • 23. BIG BUCKET OF MIXED AXIOMS I've giv'n her all she's got captain, an' I canna give her no more! WEE BUCKET OF HARD AXIOMS BIG BUCKET OF EASY AXIOMS Let me just shoogle these axioms aroond a wee bit Och aye that’s just aboot right 1 2 3
  • 24. BIG BUCKET OF MIXED AXIOMS I've giv'n her all she's got captain, an' I canna give her no more! WEE BUCKET OF HARD AXIOMS BIG BUCKET OF EASY AXIOMS Let me just shoogle these axioms aroond a wee bit Och aye that’s just aboot right 1 2 3 Now I’ll hand these over to ma pal the Elk, he’s pure dead fast 4 I’m traveling at the speed of light that’s why they call me Mr Fahrenheit 5 THE END
  • 25. What happens when the pieces don’t fit together?
  • 26. Making the pieces fit together: GO and CHEBI GO CHEBI • Some relationships didn’t make sense • E.g. nucleotide isa carbohydrate • Acids ⬄ conjugate bases
  • 27. Making the pieces fit together: GO and CHEBI Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513. GO CHEBI • Fixed many is-as • E.g. nucleotide isa carbohydrate • Acids ⬄ conjugate bases + OWL reasoning Harold Drabkin David Hill Jane Lomax Tanya Berardini Janna Hastings GO CHEBI + Design Patterns
  • 29. Conclusions ● Maintaining > ~100 classes benefits from reasoning ● Maintaining > ~10000 classes: you will be in maintenance hell without reasoning ● Reasoning is dead easy for computers ● Reasoning can be hard for humans ○ Keep it simple ○ Use Design Patterns / Templates ○ Use software engineering paradigms ○ Avoid unnecessary complexity ● Sociotechnological aspects of reasoning are hardest ○ “I don’t like the entailments I get when I use your ontology” http://bit.ly/mungall-us2ts-2019