Reasoning with the RNA Ontology
                    Chris Mungall


        Lawrence Berkeley National Laboratory




                                                Reasoning with the RNA Ontology – p.1/28
What is a reasoner?
A reasoner implements a generalized decision procedure
which takes a collection of logical axioms and finds the
entailments of these axioms and whether or not the axioms
are satisfiable

   An ontology can be considered as a collection of
   axioms (in contrast to a terminology)
   1. Relationships: is a (SubClass), partOf, ...
   2. Definitions
   3. Constraints
   We can also treat data as collections of axioms




Examples of Ontology Axioms
GNRATetraloop is a Tetraloop
Tetraloop is a RNAStructure
A translation to first-order predicate logic:

      GNRATetraloop(x) →         Tetraloop(x)
          Tetraloop(x) →        RNAStructure(x)

Set theoretic:
GNRATetraloop ⊆ Tetraloop ⊆ RNAStructure
Entailment: GNRATetraloop is a RNAStructure
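
The entailment above follows from the transitivity of "is a". A minimal sketch of that single inference, assuming the axioms are stored as plain (subclass, superclass) pairs; the names ASSERTED and entailed_subclass_axioms are hypothetical, and a real reasoner does far more than this:

```python
# Asserted "is a" axioms from the slide, as (subclass, superclass) pairs.
# ASSERTED and entailed_subclass_axioms are illustrative names only.
ASSERTED = {
    ("GNRATetraloop", "Tetraloop"),
    ("Tetraloop", "RNAStructure"),
}


def entailed_subclass_axioms(asserted):
    """Return the asserted pairs plus every pair implied by transitivity."""
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # a is a b, and b is a d, therefore a is a d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Axioms that are entailed but were never stated
inferred = entailed_subclass_axioms(ASSERTED) - ASSERTED
print(inferred)
```

Here inferred holds exactly one new pair, GNRATetraloop is a RNAStructure.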




The reasoning square

           Classifying                 Validation

Ontology   Inference of unstated       Finding inconsistent
           relationships in the        axioms in the ontology
           ontology

Data       Inference of unstated       Determining if a
           facts in data               dataset is valid


The reasoning square

[Figure: the reasoning square illustrated with RNA examples.
Ontology row: Tetraloop and GNRA Tetraloop (classifying); Purine
disjoint with Pyrimidine, GNRA Tetraloop (validation).
Data row: a T therm 23SRNA region.]


Ontology Languages
First Order Logic (Common Logic ISO standard)
   Highly Expressive
   Undecidable : No tractable decision procedures
OWL and Description Logics
  Restricted subset of FOL with highly convenient
  constructs for describing classes
  Reasoners are heavily tested on existing ontologies
OBO
  Initially an ad-hoc format for the Gene Ontology
  Now an alternate syntax for Common Logic
  Reasoners based on rule application


Common Logic
Common Logic is an ISO specification for First Order Logic
(FOL)
   Syntaxes
     CLIF - Lisp-like (derived from KIF)
     XCL - XML
     CG - Conceptual Graphs
   A CL text consists of CL sentences (axioms)
   Sentences can be atomic, boolean or logically
   quantified
     Atomic sentence: a predicate followed by zero or
     more arguments
     Boolean sentence: and, or, if ( → ), iff ( ↔ )
     Quantified sentence: forall (∀), exists(∃)
Common Logic Examples

Textbook syntax:
   ∀x : GNRATetraloop(x) → Tetraloop(x)
CLIF:
   (forall (x)
     (if (GNRATetraloop x)
         (Tetraloop x)))

Textbook syntax:
   ∀x : Purine(x) → ¬Pyrimidine(x)
CLIF:
   (forall (x)
     (if (Purine x)
         (not (Pyrimidine x))))

Textbook syntax:
   ∀x : Intron(x) → ∃y Exon(y) ∧ adjacent_to(x, y)
CLIF:
   (forall (x)
     (if (Intron x)
         (exists (y)
           (and (Exon y)
                (adjacentTo x y)))))

Reasoning with FOL
Undecidable.
FOL Theorem provers are not guaranteed to terminate
The Horn logic subset has desirable computational
properties
   Head ← Body
   Logic Programming
   SWRL
   Datalog
   Relational Model, Relational Algebra
   non-monotonic and probabilistic extensions




OWL-DL
OWL belongs to a family of logic known as Description
Logics, circumscribed subsets of FOL that are guaranteed
to be decidable
   Variety of notations (syntaxes):
      RDF-XML - Default, but it’s a mess
      OWL-XML - Easier to manipulate computationally
      Manchester Syntax - Easy on the eye
   Constructs
     Property (relation) unary predicates: Functional,
     Transitive, Symmetric, ...
     Class Axioms: SubClass, EquivalentClass,
     DisjointWith, ...
     Descriptions
   OWL 2 has many tools and reasoners to choose from
Descriptions in OWL
A Description is a (possibly recursive) tree structure that
formally identifies membership criteria for a class.
    Can be combined using logical connectives: AND, OR,
    NOT
       AND : intersectionOf
       OR : unionOf
       NOT : complementOf
    Restrictions
    Restrict class membership based on some property
       ONLY : example (paired_with_CWW ONLY Guanine)
       SOME :
       Quantified cardinality restrictions

Example: CWWAGBasePair = hasPart only (A and pair-




OWL Reasoners
Decision Procedure based on tableau calculus
Refutation-based, repeated applications of De Morgan’s
laws
Widely used and tested on ontologies
Many reasoners can now classify the larger biological
ontologies in acceptable time
Less widely used on data
RDF triplestores are commonly used but these lack key
OWL constructs.
OWLGRES is a promising technology here.




No Unique Name Assumption
Classes and instances are potentially equivalent unless
declared otherwise. Given ontology axiom:
Functional(fivePrimeTo)
And instance axioms:
A(b1)
A(b2)
A(b3)
b1 fivePrimeTo b2
b1 fivePrimeTo b3
A reasoner will not say this is inconsistent. It will infer that
b2=b3. To get a reasoner to detect the inconsistency we
must explicitly declare all base instances to be distinct:
b1 differentFrom b2
b1 differentFrom b3
b2 differentFrom b3
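
The behaviour described above can be imitated in a few lines. This is only a sketch of the one inference on the slide, not how an OWL reasoner works internally; check_functional and its arguments are hypothetical names, and equalities are not propagated transitively:

```python
def check_functional(edges, different_from=frozenset()):
    """Apply a functional-property axiom without a unique name assumption.

    edges: (subject, object) pairs asserted for one functional property.
    different_from: pairs of individuals explicitly declared distinct.
    Returns (inferred_equalities, is_consistent).
    """
    first_target = {}
    equalities = set()
    for subject, obj in edges:
        known = first_target.setdefault(subject, obj)
        if known != obj:
            # Two targets for one subject: infer that they are the same
            equalities.add(frozenset((known, obj)))
    distinct = {frozenset(pair) for pair in different_from}
    # Inconsistent only if an inferred equality was declared impossible
    return equalities, not (equalities & distinct)


five_prime_to = [("b1", "b2"), ("b1", "b3")]

# Without differentFrom axioms: b2 = b3 is inferred, no inconsistency
same, ok = check_functional(five_prime_to)

# With all bases declared distinct: the inconsistency is detected
all_distinct = [("b1", "b2"), ("b1", "b3"), ("b2", "b3")]
same_strict, ok_strict = check_functional(five_prime_to, all_distinct)
```

In the first call ok is True and same contains the pair {b2, b3}; in the second call ok_strict is False.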
The Open World Assumption
Unstated facts are not assumed to be false. Given ontology
axioms
A SubClassOf Base
UnpairedBase equivalentTo
   (Base that pairedWith exactly 0 Base)
And instance axioms:
A(b1)
A(b2)
A(b3)
b1 fivePrimeTo b2
b2 fivePrimeTo b3
A reasoner will not infer b1, b2 or b3 to be UnpairedBases.
We need to explicitly declare this:
UnpairedBase(b1)
UnpairedBase(b2)
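
A small sketch of the difference between open-world and closed-world answers, assuming a three-valued result where None means "unknown". The function is_unpaired and its closed_world flag are hypothetical and exist only for contrast:

```python
def is_unpaired(base, paired_with, declared_unpaired, closed_world=False):
    """Answer True, False, or None (unknown) for 'is this base unpaired?'.

    paired_with: set of (base, base) pairs asserted in the data.
    declared_unpaired: bases explicitly asserted to be UnpairedBase.
    closed_world: if True, treat the absence of a pairing as proof.
    """
    if base in declared_unpaired:
        return True
    if any(base in pair for pair in paired_with):
        return False
    # No pairing is stated. Under the open world assumption this proves
    # nothing; only a closed-world reading concludes "unpaired".
    return True if closed_world else None


paired_with = set()          # the instance data states no pairings
declared = {"b1", "b2"}      # the explicit declarations from the slide

open_answer = is_unpaired("b3", paired_with, declared)
closed_answer = is_unpaired("b3", paired_with, declared, closed_world=True)
```

For b3, which has no explicit declaration, open_answer is None while closed_answer is True.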
OBO
Initially an ad-hoc format for the Gene Ontology
    Graph-centric
    Terminological features
Formal Semantics
   Initially lacked formal semantics. Formal definition
   written in natural language in Relations Ontology.
   Translation to OWL-DL (Horrocks et al)
      With OBO 1.3, every OBO document is a
      Common Logic Text
      OBO-Core consists only of atomic sentences
      OBO-CL allows arbitrary logical formulae
      OBO-H: OBO-Core plus Horn rules


Reasoning over OBO ontologies
Strategies
   convert to OWL and use an OWL reasoner
   convert to CL and use a FOL theorem prover
   Use a rule-based reasoner
     Java implementation: OBO-Edit
     Prolog implementation: Easy to extend
     SQL implementation: slow but scales over massive
     ontologies and datasets
     Limitations: limited support for negation




Are Description Logics enough?
Some things that cannot be done in OWL-2:
   Define relations using arithmetic:
   Define relations using intersection, union and negation
   Declare relations with > 2 arguments
   Makes reasoning about change harder
   Model cyclic structures
   Any structure with a cyclic path through some
   combination of relations (carbon rings, RNA molecules)




Arithmetic in relations
We cannot express this in OWL:

             upstreamOf (x, y) ← end(x) < start(y)

In OWL we must:
    explicitly name all the bases, and declare a 5’ to 3’ connection
    relation between them
    declare < as the transitive version of the 5’ to 3’ relation

This is feasible with RNA, but not DNA
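
Outside OWL the rule is a one-line comparison. A minimal sketch, assuming each sequence region carries integer start and end coordinates; the Region class and the example coordinates are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A sequence region with integer coordinates (illustrative only)."""
    name: str
    start: int
    end: int


def upstream_of(x, y):
    """upstreamOf(x, y) <- end(x) < start(y)"""
    return x.end < y.start


exon1 = Region("exon1", 1, 10)
exon2 = Region("exon2", 20, 30)

forward = upstream_of(exon1, exon2)
backward = upstream_of(exon2, exon1)
```

With these coordinates forward is True and backward is False, and no base has to be named individually.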




Relation Boolean Constructs
We cannot express this in OWL:

    overlaps = ends.after.startOf ∩ starts.before.endOf

                 disconnected = ¬overlaps
This severely limits OWL when applied to instance data
involving intervals
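
The two definitions above are easy to state over explicit coordinates. A sketch assuming intervals are (start, end) tuples and that the comparisons are strict; both are assumptions, since the slide does not say how endpoints that touch should be treated:

```python
def ends_after_start_of(x, y):
    """x ends after y starts (intervals are (start, end) tuples)."""
    return x[1] > y[0]


def starts_before_end_of(x, y):
    """x starts before y ends."""
    return x[0] < y[1]


def overlaps(x, y):
    """Intersection of the two relations above."""
    return ends_after_start_of(x, y) and starts_before_end_of(x, y)


def disconnected(x, y):
    """Negation of overlaps."""
    return not overlaps(x, y)


a, b, c = (1, 10), (5, 15), (20, 30)
a_overlaps_b = overlaps(a, b)
a_disconnected_c = disconnected(a, c)
```

Both a_overlaps_b and a_disconnected_c evaluate to True for these example intervals.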




N-ary relations and time
In OWL, all relations must be binary. N-ary relations are
useful for reasoning about change.
   As the RNA molecule folds, unpaired bases become
   paired:
                ¬paired_with_CWW(b1, b5, t0)
                 paired_with_CWW(b1, b5, t1)

                 instance_of(b1, UnpairedBase, t0)
                 instance_of(b1, PairedBase, t1)

   There are a variety of (awkward) techniques for
   translating N-ary relations to binary

                ¬paired_with_CWW(b1@t0, b5@t0)
                 paired_with_CWW(b1@t1, b5@t1)
Cyclic descriptions
OWL Descriptions are tree-like. Cyclic descriptions are
required for RNA Structures. Proposed def of GNRA
Tetraloop:
GNRA TetraloopMotif =
  hasPart some
     ( Nucleobase and
       fivePrimeTo some
       (G and fivePrimeTo some
            (Nucleobase and fivePrimeTo some
                 (Purine and fivePrimeTo some
                      (A and fivePrimeTo some
                           (Nucleobase and pairsWithCWW som
                       and pairsWithTHS some G)))
            and pairsWithTSH some A)
        and pairsWithCWW some Nucleobase)
Tree-like classification structure

[Figure: the GNRA TetraloopMotif description from the previous
slide drawn as a tree of base nodes (N, G, A, N, R, A, G, N),
shown alongside the base sequences C G A A A G and G A A A G C.]


Labeled sub-descriptions
We would like to do something like this, if it were possible in
OWL:
GNRATetraloopMotif =
  hasPart some
   (Nucleobase[1] and fivePrimeTo some
       (G[2] and fivePrimeTo some
            (Nucleobase[3] and fivePrimeTo some
                 (Nucleobase[4] and fivePrimeTo some
                       (A[5] and fivePrimeTo some
                             (Nucleobase[6] and pairsWithCW
                             and pairsWithTHS some G[2])))
              and pairsWithTSH some A[5])
        and pairsWithCWW some Nucleobase[6])



Rules
SWRL (Semantic Web Rule Language) extends OWL with
rules. We can add this to the ontology:
nucleotide(?b0),
g(?b1),
nucleotide(?b2),
purine(?b3),
a(?b4),
nucleotide(?b5),
followedBy(?b0, ?b1),
followedBy(?b1, ?b2),
followedBy(?b2, ?b3),
followedBy(?b3, ?b4),
followedBy(?b4, ?b5),
pairedWithTHS(?b4, ?b1),
pairedWithCWW(?b5, ?b0)
--> partOfGNRATetraloop(?b0)
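
To show what the rule body checks, here is a hand-written matcher over instance data. It is a sketch in plain Python, not SWRL and not any rule engine's API; the function names, the dictionary encoding, and the six example bases are all hypothetical, and each base is assumed to have at most one followedBy successor:

```python
PURINES = {"A", "G"}


def matches(base_type, required):
    """N matches any nucleotide, R any purine, otherwise exact letter."""
    if required == "N":
        return True
    if required == "R":
        return base_type in PURINES
    return base_type == required


def part_of_gnra_tetraloop(types, followed_by, paired_ths, paired_cww):
    """Return every ?b0 binding for which the rule body holds."""
    pattern = ["N", "G", "N", "R", "A", "N"]  # types of ?b0 .. ?b5
    hits = set()
    for b0 in types:
        # Follow the followedBy chain to bind ?b1 .. ?b5
        chain = [b0]
        while len(chain) < len(pattern) and chain[-1] in followed_by:
            chain.append(followed_by[chain[-1]])
        if len(chain) < len(pattern):
            continue
        if not all(matches(types[b], req) for b, req in zip(chain, pattern)):
            continue
        b1, b4, b5 = chain[1], chain[4], chain[5]
        if (b4, b1) in paired_ths and (b5, b0) in paired_cww:
            hits.add(b0)
    return hits


# Made-up instance data: six consecutive bases C G A A A G
types = {"b0": "C", "b1": "G", "b2": "A", "b3": "A", "b4": "A", "b5": "G"}
followed_by = {"b0": "b1", "b1": "b2", "b2": "b3", "b3": "b4", "b4": "b5"}
result = part_of_gnra_tetraloop(
    types, followed_by, paired_ths={("b4", "b1")}, paired_cww={("b5", "b0")}
)
```

Here result is the set containing only b0. As with the SWRL rule, the matcher classifies an existing base; it does not create a tetraloop motif instance.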




Is SWRL the answer?
Bonus: Can be extended with arithmetic operators (to
define upstreamOf)
Negative: only binary relations
Negative: only instance classification
We cannot use the previous definition for ontology
classification
Negative: we cannot infer the existence of undeclared
entities
We can tell a base is part of a tetraloop motif, but we
can’t infer the tetraloop motif instance




Description Graphs
An extension of OWL to allow representation of cyclic
structures[?].
   Possibly part of OWL3?
   Implemented in HermiT reasoner
   Largely new and untested




OBO Graphs
Cyclic structures can be described in OBO; the graph is
translated to simple rules. These rules can be executed
using LP or even SQL.




Conclusions
There is no one single ideal subset of FOL for reasoning
   All subsets have limitations.
   DLs cannot express a lot of what we need for
   primary and secondary sequence structures
The RNA Ontology should employ as expressive a logic
as it needs
   An incorrect formally specified definition is worse
   than a correct informally specified definition
   Hybrid reasoning approaches are feasible
   The basic instance classification problem is just not
   that hard (compared to RNA bioinformatics as a
   whole)
   Special purpose algorithms will probably beat
   general purpose reasoners
But first the RNAO must exist
   Perhaps it's too early to worry too much about
   reasoning
   Priority: simple term lists, basic is a hierarchy, with
   definitions written for humans, plus motif definitions
   in some compact notation




                                             Reasoning with the RNA Ontology – p.28/28

Weitere ähnliche Inhalte

Andere mochten auch

Uberon lausanne-2012
Uberon lausanne-2012Uberon lausanne-2012
Uberon lausanne-2012
Chris Mungall
 

Andere mochten auch (9)

Uberon lausanne-2012
Uberon lausanne-2012Uberon lausanne-2012
Uberon lausanne-2012
 
Neufeld ISME14
Neufeld ISME14Neufeld ISME14
Neufeld ISME14
 
Ontologies and Continuous Integration
Ontologies and Continuous IntegrationOntologies and Continuous Integration
Ontologies and Continuous Integration
 
An application of Basic Formal Ontology to the Ontology of Services and Commo...
An application of Basic Formal Ontology to the Ontology of Services and Commo...An application of Basic Formal Ontology to the Ontology of Services and Commo...
An application of Basic Formal Ontology to the Ontology of Services and Commo...
 
Introduction to 16S rRNA gene multivariate analysis
Introduction to 16S rRNA gene multivariate analysisIntroduction to 16S rRNA gene multivariate analysis
Introduction to 16S rRNA gene multivariate analysis
 
Towards an Ontology of Philosophy
Towards an Ontology of PhilosophyTowards an Ontology of Philosophy
Towards an Ontology of Philosophy
 
How to give a good scientific oral presentation
How to give a good scientific oral presentationHow to give a good scientific oral presentation
How to give a good scientific oral presentation
 
So you want to be an academic?
So you want to be an academic?So you want to be an academic?
So you want to be an academic?
 
Tesi sesb
Tesi sesbTesi sesb
Tesi sesb
 

Ähnlich wie Reasoning with RNA

Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdfKernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
grssieee
 
A formal ontology of sequences
A formal ontology of sequencesA formal ontology of sequences
A formal ontology of sequences
Robert Hoehndorf
 
Practicum Pressentation PDF
Practicum Pressentation PDFPracticum Pressentation PDF
Practicum Pressentation PDF
Gui Chen
 
Exploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New ApproachExploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New Approach
IDES Editor
 

Ähnlich wie Reasoning with RNA (20)

Knowledge Extraction
Knowledge ExtractionKnowledge Extraction
Knowledge Extraction
 
PAGOdA poster
PAGOdA posterPAGOdA poster
PAGOdA poster
 
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdfKernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
 
A formal ontology of sequences
A formal ontology of sequencesA formal ontology of sequences
A formal ontology of sequences
 
從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論從 VAE 走向深度學習新理論
從 VAE 走向深度學習新理論
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and Systems
 
RuleML 2015
RuleML 2015RuleML 2015
RuleML 2015
 
Master Thesis on the Mathematial Analysis of Neural Networks
Master Thesis on the Mathematial Analysis of Neural NetworksMaster Thesis on the Mathematial Analysis of Neural Networks
Master Thesis on the Mathematial Analysis of Neural Networks
 
L03 ai - knowledge representation using logic
L03 ai - knowledge representation using logicL03 ai - knowledge representation using logic
L03 ai - knowledge representation using logic
 
Practicum Pressentation PDF
Practicum Pressentation PDFPracticum Pressentation PDF
Practicum Pressentation PDF
 
Sara el hassad
Sara el hassadSara el hassad
Sara el hassad
 
#8 formal methods – pro logic
#8 formal methods – pro logic#8 formal methods – pro logic
#8 formal methods – pro logic
 
Franz et. al. 2012. Reconciling Succeeding Classifications, ESA 2012
Franz et. al. 2012. Reconciling Succeeding Classifications, ESA 2012Franz et. al. 2012. Reconciling Succeeding Classifications, ESA 2012
Franz et. al. 2012. Reconciling Succeeding Classifications, ESA 2012
 
Slides4
Slides4Slides4
Slides4
 
ESSLLI2016 DTS Lecture Day 5-2: Proof-theoretic Turn
ESSLLI2016 DTS Lecture Day 5-2: Proof-theoretic TurnESSLLI2016 DTS Lecture Day 5-2: Proof-theoretic Turn
ESSLLI2016 DTS Lecture Day 5-2: Proof-theoretic Turn
 
Framester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data HubFramester: A Wide Coverage Linguistic Linked Data Hub
Framester: A Wide Coverage Linguistic Linked Data Hub
 
Data exploration and graphics with R
Data exploration and graphics with RData exploration and graphics with R
Data exploration and graphics with R
 
Apollo : A workshop for the Manakin Research Coordination Network
Apollo: A workshop for the Manakin Research Coordination NetworkApollo: A workshop for the Manakin Research Coordination Network
Apollo : A workshop for the Manakin Research Coordination Network
 
First order predicate logic(fopl)
First order predicate logic(fopl)First order predicate logic(fopl)
First order predicate logic(fopl)
 
Exploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New ApproachExploring the Solution Space of Sorting by Reversals: A New Approach
Exploring the Solution Space of Sorting by Reversals: A New Approach
 

Mehr von Chris Mungall

Mehr von Chris Mungall (20)

MADICES Mungall 2022.pptx
MADICES Mungall 2022.pptxMADICES Mungall 2022.pptx
MADICES Mungall 2022.pptx
 
Scaling up semantics; lessons learned across the life sciences
Scaling up semantics; lessons learned across the life sciencesScaling up semantics; lessons learned across the life sciences
Scaling up semantics; lessons learned across the life sciences
 
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOLinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
 
Ontology Access Kit_ Workshop Intro Slides.pptx
Ontology Access Kit_ Workshop Intro Slides.pptxOntology Access Kit_ Workshop Intro Slides.pptx
Ontology Access Kit_ Workshop Intro Slides.pptx
 
LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)
 
LinkML presentation to Yosemite Group
LinkML presentation to Yosemite GroupLinkML presentation to Yosemite Group
LinkML presentation to Yosemite Group
 
Experiences in the biosciences with the open biological ontologies foundry an...
Experiences in the biosciences with the open biological ontologies foundry an...Experiences in the biosciences with the open biological ontologies foundry an...
Experiences in the biosciences with the open biological ontologies foundry an...
 
All together now: piecing together the knowledge graph of life
All together now: piecing together the knowledge graph of lifeAll together now: piecing together the knowledge graph of life
All together now: piecing together the knowledge graph of life
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of Life
 
Representation of kidney structures in Uberon
Representation of kidney structures in UberonRepresentation of kidney structures in Uberon
Representation of kidney structures in Uberon
 
SparqlProg (BioHackathon 2019)
SparqlProg (BioHackathon 2019)SparqlProg (BioHackathon 2019)
SparqlProg (BioHackathon 2019)
 
Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019Ontology Development Kit: Bio-Ontologies 2019
Ontology Development Kit: Bio-Ontologies 2019
 
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
 
Uberon: opening up to community contributions
Uberon: opening up to community contributionsUberon: opening up to community contributions
Uberon: opening up to community contributions
 
Modeling exposure events and adverse outcome pathways using ontologies
Modeling exposure events and adverse outcome pathways using ontologiesModeling exposure events and adverse outcome pathways using ontologies
Modeling exposure events and adverse outcome pathways using ontologies
 
Causal reasoning using the Relation Ontology
Causal reasoning using the Relation OntologyCausal reasoning using the Relation Ontology
Causal reasoning using the Relation Ontology
 
US2TS presentation on Gene Ontology
US2TS presentation on Gene OntologyUS2TS presentation on Gene Ontology
US2TS presentation on Gene Ontology
 
Introduction to the BioLink datamodel
Introduction to the BioLink datamodelIntroduction to the BioLink datamodel
Introduction to the BioLink datamodel
 
Computing on Phenotypes AMP 2015
Computing on Phenotypes AMP 2015Computing on Phenotypes AMP 2015
Computing on Phenotypes AMP 2015
 
ENVO GSC 2015
ENVO GSC 2015ENVO GSC 2015
ENVO GSC 2015
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Reasoning with RNA

  • 1. Reasoning with the RNA Ontology Chris Mungall Lawrence Berkeley National Laboratory Reasoning with the RNA Ontology – p.1/28
  • 2. What is a reasoner?
      A reasoner implements a generalized decision procedure that takes a collection of logical axioms, finds the entailments of those axioms, and determines whether or not the axioms are satisfiable.
      - An ontology can be considered as a collection of axioms (in contrast to a terminology):
        1. Relationships: is a (SubClass), partOf, ...
        2. Definitions
        3. Constraints
      - We can also treat data as collections of axioms
  • 6. Examples of Ontology Axioms
      GNRATetraloop is a Tetraloop
      Tetraloop is a RNAStructure
      A translation to first-order predicate logic:
        GNRATetraloop(x) → Tetraloop(x)
        Tetraloop(x) → RNAStructure(x)
      Set theoretic: GNRATetraloop ⊆ Tetraloop ⊆ RNAStructure
      Entailment: GNRATetraloop is a RNAStructure
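The entailment on this slide can be reproduced mechanically. The following is a minimal sketch, not the algorithm of any particular reasoner: it assumes the is a axioms are held in a plain dictionary, and the names `IS_A` and `entailed_superclasses` are illustrative only.

```python
def entailed_superclasses(is_a, cls):
    """Return every class reachable from `cls` by following is_a links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# The two stated axioms from the slide.
IS_A = {
    "GNRATetraloop": ["Tetraloop"],
    "Tetraloop": ["RNAStructure"],
}

supers = entailed_superclasses(IS_A, "GNRATetraloop")
```

Here `supers` contains both the stated parent (Tetraloop) and the entailed one (RNAStructure), which is the set-theoretic chain written above.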
  • 7. The reasoning square
                 | Classifying                                          | Validation
      Ontology   | Inference of unstated relationships in the ontology  | Finding inconsistent axioms in the ontology
      Data       | Inference of unstated facts in data                  | Determining if a dataset is valid
  • 8. The reasoning square
      [Figure: the same square (Classifying and Validation columns, Ontology and Data rows) filled in with examples. Labels recoverable from the slide: Tetraloop, GNRA Tetraloop, Purine and Pyrimidine marked disjoint, and a base sequence from a "T therm 23SRNA region".]
  • 9. Ontology Languages
      - First Order Logic (Common Logic ISO standard)
        - Highly expressive
        - Undecidable: no decision procedure is guaranteed to terminate
      - OWL and Description Logics
        - Restricted subset of FOL with highly convenient constructs for describing classes
        - Reasoners are heavily tested on existing ontologies
      - OBO
        - Initially an ad-hoc format for the Gene Ontology
        - Now an alternate syntax for Common Logic
        - Reasoners based on rule application
  • 10. Common Logic
      Common Logic is an ISO specification for First Order Logic (FOL).
      - Syntaxes
        - CLIF: Lisp-like (derived from KIF)
        - XCL: XML
        - CG: Conceptual Graphs
      - A CL text consists of CL sentences (axioms)
      - Sentences can be atomic, boolean or logically quantified
        - Atomic sentence: a predicate followed by zero or more arguments
        - Boolean sentence: and, or, if (→), iff (↔)
        - Quantified sentence: forall (∀), exists (∃)
  • 11. Common Logic Examples
      Textbook syntax: ∀x : GNRATetraloop(x) → Tetraloop(x)
      CLIF:            (forall (x) (if (GNRATetraloop x) (Tetraloop x)))
      Textbook syntax: ∀x : Purine(x) → ¬Pyrimidine(x)
      CLIF:            (forall (x) (if (Purine x) (not (Pyrimidine x))))
      Textbook syntax: ∀x : Intron(x) → ∃y Exon(y) ∧ adjacent_to(x, y)
      CLIF:            (forall (x) (if (Intron x) (exists (y) (and (Exon y) (adjacent_to x y)))))
  • 12. Reasoning with FOL
      - Undecidable: FOL theorem provers are not guaranteed to terminate
      - The Horn logic subset has desirable computational properties
        - Head ← Body
        - Logic Programming
        - SWRL
        - Datalog
        - Relational Model, Relational Algebra
        - Non-monotonic and probabilistic extensions
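Rules of the form Head ← Body can be evaluated by naive forward chaining: apply every rule to the known facts until nothing new appears. The sketch below is a toy illustration in the Datalog style, not an implementation from the talk. The `before` predicate and all function names are assumptions introduced for the example; facts are ground triples, so the fixpoint is finite and the loop terminates.

```python
def forward_chain(facts, rules):
    """Apply Horn rules until no new facts appear (a naive fixpoint)."""
    derived = set(facts)
    while True:
        new_facts = set()
        for rule in rules:
            new_facts |= rule(derived)
        if new_facts <= derived:
            return derived
        derived |= new_facts


def base_rule(facts):
    # before(x, y) <- fivePrimeTo(x, y)
    return {("before", x, y) for (pred, x, y) in facts if pred == "fivePrimeTo"}


def step_rule(facts):
    # before(x, z) <- fivePrimeTo(x, y), before(y, z)
    return {
        ("before", x, z)
        for (p1, x, y) in facts if p1 == "fivePrimeTo"
        for (p2, y2, z) in facts if p2 == "before" and y2 == y
    }


FACTS = {
    ("fivePrimeTo", "b1", "b2"),
    ("fivePrimeTo", "b2", "b3"),
    ("fivePrimeTo", "b3", "b4"),
}

closure = forward_chain(FACTS, [base_rule, step_rule])
before_facts = {fact for fact in closure if fact[0] == "before"}
```

With four bases in a chain, `before_facts` holds the six ordered pairs of the transitive closure, including the unstated `("before", "b1", "b4")`.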
  • 13. OWL-DL
      OWL belongs to a family of logics known as Description Logics: circumscribed subsets of FOL that are guaranteed to be decidable.
      - Variety of notations (syntaxes):
        - RDF-XML: the default, but it’s a mess
        - OWL-XML: easier to manipulate computationally
        - Manchester Syntax: easy on the eye
      - Constructs
        - Property (relation) unary predicates: Functional, Transitive, Symmetric, ...
        - Class axioms: SubClass, EquivalentClass, DisjointWith, ...
        - Descriptions
      - OWL 2 has many tools and reasoners to choose from
  • 14. Descriptions in OWL
      A Description is a (possibly recursive) tree structure that formally identifies membership criteria for a class.
      - Can be combined using logical connectives: AND, OR, NOT
        - AND: intersectionOf
        - OR: unionOf
        - NOT: complementOf
      - Restrictions: restrict class membership based on some property
        - ONLY: example (paired with CWW ONLY Guanine)
        - SOME: Quantified cardinality restrictions
      - Example: CWWAGBasePair = hasPart only (A and pair-
  • 17. OWL Reasoners
      - Decision procedure based on tableau calculus
        - Refutation-based, repeated applications of De Morgan’s laws
      - Widely used and tested on ontologies
        - Many reasoners can now classify the larger biological ontologies in acceptable time
      - Less widely used on data
        - RDF triplestores are commonly used, but these lack key OWL constructs. OWLGRES is a promising technology here.
  • 18. No Unique Name Assumption
      Classes and instances are potentially equivalent unless declared otherwise.
      Given the ontology axiom:
        Functional(fivePrimeTo)
      and the instance axioms:
        A(b1) A(b2) A(b3)
        b1 fivePrimeTo b2
        b1 fivePrimeTo b3
      a reasoner will not say this is inconsistent. It will infer that b2 = b3.
      To get a reasoner to detect the inconsistency we must explicitly declare all base instances to be distinct:
        b1 differentFrom b2
        b1 differentFrom b3
        b2 differentFrom b3
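The behaviour described on this slide can be imitated with a small union-find: two objects of a functional property that share a subject are merged, and an inconsistency is reported only if a merged pair was declared different. This is a hedged sketch of the idea, not how a tableau reasoner works internally; `apply_functional` and its arguments are hypothetical names.

```python
def apply_functional(assertions, different_from=()):
    """Merge individuals forced equal by one functional property.

    assertions: (subject, object) pairs of the functional property.
    different_from: pairs explicitly declared distinct.
    Returns (representative, consistent).
    """
    parent = {}

    def find(x):
        # Follow parent links to the representative of x's equivalence class.
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    changed = True
    while changed:
        changed = False
        target = {}  # subject class -> one object already seen for it
        for subj, obj in assertions:
            s = find(subj)
            if s in target and find(target[s]) != find(obj):
                # Functional property: both objects must denote the same thing.
                parent[find(target[s])] = find(obj)
                changed = True
            target.setdefault(s, obj)

    consistent = all(find(a) != find(b) for a, b in different_from)
    representative = {x: find(x) for x in list(parent)}
    return representative, consistent


FIVE_PRIME_TO = [("b1", "b2"), ("b1", "b3")]

# Without differentFrom axioms: b2 and b3 are silently merged.
merged, ok = apply_functional(FIVE_PRIME_TO)

# With all bases declared distinct: the same data is inconsistent.
ALL_DIFFERENT = [("b1", "b2"), ("b1", "b3"), ("b2", "b3")]
_, ok_distinct = apply_functional(FIVE_PRIME_TO, ALL_DIFFERENT)
```

In the first call `merged["b2"]` and `merged["b3"]` share a representative and `ok` is true; in the second call `ok_distinct` is false, mirroring the need for explicit differentFrom declarations.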
  • 19. The Open World Assumption
      Unstated facts are not assumed to be false.
      Given the ontology axioms:
        A SubClassOf Base
        UnpairedBase equivalentTo some (Base that pairedWith 0 Base)
      and the instance axioms:
        A(b1) A(b2) A(b3)
        b1 fivePrimeTo b2
        b2 fivePrimeTo b3
      a reasoner will not infer b1, b2 or b3 to be UnpairedBases. We need to declare this explicitly:
        UnpairedBase(b1) UnpairedBase(b2)
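The difference between open-world and closed-world answers can be shown with a three-valued toy query. This is only an illustration of the assumption, under the simplification that pairing facts are stored as a set of pairs; `is_unpaired` and its parameters are hypothetical, and `None` stands for "unknown".

```python
def is_unpaired(base, paired_with, declared_unpaired, closed_world=False):
    """Answer whether `base` is unpaired: True, False, or None (unknown)."""
    if base in declared_unpaired:
        return True                      # explicitly asserted
    if any(base in pair for pair in paired_with):
        return False                     # a pairing fact is known
    # No information either way: only a closed world treats absence as falsity.
    return True if closed_world else None


PAIRED_WITH = set()          # no pairedWith facts were stated
DECLARED = {"b1", "b2"}      # UnpairedBase(b1), UnpairedBase(b2)

open_answer = is_unpaired("b3", PAIRED_WITH, set())
closed_answer = is_unpaired("b3", PAIRED_WITH, set(), closed_world=True)
declared_answer = is_unpaired("b1", PAIRED_WITH, DECLARED)
```

Under the open-world reading the query for an undeclared base returns `None` rather than `True`, which is why the explicit UnpairedBase declarations are needed.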
  • 20. OBO
      - Initially an ad-hoc format for the Gene Ontology
        - Graph-centric
        - Terminological features
      - Formal semantics
        - Initially lacked formal semantics
        - Formal definition written in natural language in Relations Ontology
        - Translation to OWL-DL (Horrocks et al)
      - With OBO 1.3, every OBO document is a Common Logic text
        - OBO-Core: consists only of atomic sentences
        - OBO-CL: allows arbitrary logical formulae
        - OBO-H: OBO-Core plus Horn rules
  • 21. Reasoning over OBO ontologies
      Strategies:
      - Convert to OWL and use an OWL reasoner
      - Convert to CL and use a FOL theorem prover
      - Use a rule-based reasoner
        - Java implementation: OBO-Edit
        - Prolog implementation: easy to extend
        - SQL implementation: slow, but scales over massive ontologies and datasets
        - Limitations: limited support for negation
  • 22. Are Description Logics enough?
      Some things that cannot be done in OWL 2:
      - Define relations using arithmetic
      - Define relations using intersection, union and negation
      - Declare relations with > 2 arguments
        - Makes reasoning about change harder
      - Model cyclic structures
        - Any structure with a cyclic path through some combination of relations (carbon rings, RNA molecules)
  • 23. Arithmetic in relations
      We cannot express this in OWL:
        upstreamOf(x, y) ← end(x) < start(y)
      In OWL we must:
      - explicitly name all the bases, and declare a 5’ to 3’ connection relation between them
      - declare < as the transitive version of the 5’ to 3’ relation
      This is feasible with RNA, but not DNA.
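Outside OWL the rule is a one-line comparison once regions carry coordinates. A minimal sketch, assuming integer start and end positions on a single strand; the `Region` class and the example regions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # position of the first base (assumed 5' end)
    end: int    # position of the last base (assumed 3' end)


def upstream_of(x: Region, y: Region) -> bool:
    # upstreamOf(x, y) <- end(x) < start(y)
    return x.end < y.start


r1 = Region("r1", start=1, end=40)
r2 = Region("r2", start=55, end=90)

r1_upstream = upstream_of(r1, r2)
r2_upstream = upstream_of(r2, r1)
```

No bases are enumerated and no transitive relation is declared; the ordering comes directly from the arithmetic comparison.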
  • 24. Relation Boolean Constructs
      We cannot express this in OWL:
        overlaps = ends.after.startOf ∩ starts.before.endOf
        disconnected = ¬overlaps
      This severely limits OWL when applied to instance data involving intervals.
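The two definitions read naturally as an intersection of two tests and its negation. The sketch below assumes intervals are `(start, end)` tuples and that the comparisons are strict; both choices are illustrative assumptions, not something the slide specifies.

```python
def ends_after_start_of(x, y):
    return x[1] > y[0]


def starts_before_end_of(x, y):
    return x[0] < y[1]


def overlaps(x, y):
    # overlaps = ends.after.startOf ∩ starts.before.endOf
    return ends_after_start_of(x, y) and starts_before_end_of(x, y)


def disconnected(x, y):
    # disconnected = ¬overlaps
    return not overlaps(x, y)


helix = (1, 5)
loop = (4, 9)
tail = (12, 20)

helix_loop = overlaps(helix, loop)
helix_tail = disconnected(helix, tail)
```

Relation intersection becomes `and` and relation negation becomes `not`, which is exactly the pair of constructs the slide says OWL lacks for properties.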
  • 25. N-ary relations and time
      In OWL, all relations must be binary. N-ary relations are useful for reasoning about change.
      As the RNA molecule folds, unpaired bases become paired:
        ¬paired_with_CWW(b1, b5, t0)
        paired_with_CWW(b1, b5, t1)
        instance_of(b1, UnpairedBase, t0)
        instance_of(b1, PairedBase, t1)
      There are a variety of (awkward) techniques for translating N-ary relations to binary:
        ¬paired_with_CWW(b1@t0, b5@t0)
        paired_with_CWW(b1@t1, b5@t1)
  • 26. Cyclic descriptions
      OWL Descriptions are tree-like. Cyclic descriptions are required for RNA Structures.
      Proposed def of GNRA Tetraloop:
        GNRA TetraloopMotif = hasPart some (
          Nucleobase and fivePrimeTo some
          (G and fivePrimeTo some
          (Nucleobase and fivePrimeTo some
          (Purine and fivePrimeTo some
          (A and fivePrimeTo some
          (Nucleobase and pairsWithCWW som
          and pairsWithTHS some G)))
          and pairsWithTSH some A)
          and pairsWithCWW some Nucleobase)
  • 29. Tree-like classification structure
      [Figure: the GNRA TetraloopMotif definition from the previous slide (clipped at the slide edge) shown beside a diagram of the motif positions N, G, N, R, A, N and example base sequences (C G A G A A; C G A A A G G A A A G C).]
  • 30. Labeled sub-descriptions
      We would like to do something like this, if it were possible in OWL:
        GNRATetraloopMotif = hasPart some (Nucleobase[1] and fivePrimeTo some
          (G[2] and fivePrimeTo some
          (Nucleobase[3] and fivePrimeTo some
          (Nucleobase[4] and fivePrimeTo some
          (A[5] and fivePrimeTo some
          (Nucleobase[6] and pairsWithCW
          and pairsWithTHS some G[2])))
          and pairsWithTSH some A[5])
          and pairsWithCWW some Nucleobase[6])
  • 31. Rules
      SWRL (Semantic Web Rule Language) extends OWL with rules. We can add this to the ontology:
        nucleotide(?b0), g(?b1), nucleotide(?b2), purine(?b3), a(?b4), nucleotide(?b5),
        followedBy(?b0, ?b1), followedBy(?b1, ?b2), followedBy(?b2, ?b3),
        followedBy(?b3, ?b4), followedBy(?b4, ?b5),
        pairedWithTHS(?b4, ?b1), pairedWithCWW(?b5, ?b0)
        --> partOfGNRATetraloop(?b0)
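To make the rule concrete, here is a hedged sketch of how its body could be matched against instance data in ordinary code. It is not a SWRL engine: it assumes `followedBy` is stored as a dictionary (so each base has at most one successor), that base letters are known for every instance, and that the pairing relations are sets of ordered pairs. All names and the example data are hypothetical.

```python
PURINES = {"A", "G"}


def gnra_tetraloop_starts(letters, followed_by, paired_ths, paired_cww):
    """Return every b0 for which the body of the rule is satisfied."""
    matches = []
    for b0 in letters:
        # Follow followedBy five times to bind b1..b5.
        chain = [b0]
        while len(chain) < 6 and chain[-1] in followed_by:
            chain.append(followed_by[chain[-1]])
        if len(chain) < 6:
            continue
        _, b1, _, b3, b4, b5 = chain
        if (
            letters[b1] == "G"                # g(?b1)
            and letters[b3] in PURINES        # purine(?b3)
            and letters[b4] == "A"            # a(?b4)
            and (b4, b1) in paired_ths        # pairedWithTHS(?b4, ?b1)
            and (b5, b0) in paired_cww        # pairedWithCWW(?b5, ?b0)
        ):
            matches.append(b0)                # partOfGNRATetraloop(?b0)
    return matches


# Hypothetical instance data: the sequence C G A A A G closed by a C-G pair.
LETTERS = {"n1": "C", "n2": "G", "n3": "A", "n4": "A", "n5": "A", "n6": "G"}
FOLLOWED_BY = {"n1": "n2", "n2": "n3", "n3": "n4", "n4": "n5", "n5": "n6"}
PAIRED_THS = {("n5", "n2")}
PAIRED_CWW = {("n6", "n1")}

starts = gnra_tetraloop_starts(LETTERS, FOLLOWED_BY, PAIRED_THS, PAIRED_CWW)
```

As with the rule itself, the result only labels an existing base (`n1` here); it does not create a new tetraloop motif instance.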
  • 32. Is SWRL the answer?
      - Bonus: can be extended with arithmetic operators (to define upstreamOf)
      - Negative: only binary relations
      - Negative: only instance classification
        - We cannot use the previous definition for ontology classification
      - Negative: we cannot infer the existence of undeclared entities
        - We can tell a base is part of a tetraloop motif, but we can’t infer the tetraloop motif instance
  • 33. Description Graphs
      An extension of OWL to allow representation of cyclic structures.
      - Possibly part of OWL3?
      - Implemented in HermiT reasoner
      - Largely new and untested
  • 34. OBO Graphs
      Cyclic structures can be described in OBO; the graph is translated to simple rules. These rules can be executed using LP or even SQL.
  • 35. OBO Graphs
      [Figure-only slide; the image is not preserved in the transcript.]
  • 36. Conclusions
      - There is no one single ideal subset of FOL for reasoning
      - The RNA Ontology should employ as expressive a logic as it needs
      - But first the RNAO must exist
  • 37. Conclusions
      - There is no one single ideal subset of FOL for reasoning
        - All subsets have limitations. DLs cannot express a lot of what we need for primary and secondary sequence structures
      - The RNA Ontology should employ as expressive a logic as it needs
      - But first the RNAO must exist
  • 38. Conclusions
      - There is no one single ideal subset of FOL for reasoning
      - The RNA Ontology should employ as expressive a logic as it needs
        - An incorrect formally specified definition is worse than a correct informally specified definition
        - Hybrid reasoning approaches are feasible
        - The basic instance classification problem is just not that hard (compared to RNA bioinformatics as a whole)
        - Special purpose algorithms will probably beat general purpose reasoners
      - But first the RNAO must exist
  • 39. Conclusions
      - There is no one single ideal subset of FOL for reasoning
      - The RNA Ontology should employ as expressive a logic as it needs
      - But first the RNAO must exist
        - Perhaps it’s too early to worry too much about reasoning
        - Priority: simple term lists, basic is a hierarchy, with definitions written for humans, plus motif definitions in some compact notation