Formality and Pragmatics in
Authoring Ontologies
Robert Stevens
ODLS 2016
School of Computer Science
The University of Manchester
Manchester
United Kingdom
M13 9PL
Robert.stevens@manchester.ac.uk
Acknowledgements
• On-going work with Phil Lord on normalising
the Gene Ontology
• The Gene Ontology folk for making GO
• Nico Matentzoglu for my slides
• Mercedes Casteleiro for numbers
Formality and Pragmatics
• Formality: Acting strictly according to
procedure or rules
– Ontological formality
– Representational formality
• Pragmatics: Behaviour driven by practical
consequences rather than dogma
• There’s a tension between the two
Gene Ontology Molecular Function
• D-alanyl carrier activity
• acetylcholine receptor regulator activity
• antioxidant activity
• binding
• calcium channel regulator activity
• catalytic activity
• channel regulator activity
• chemoattractant activity
• chemorepellent activity
• core DNA-dependent RNA polymerase binding
promoter specificity activity
• electron carrier activity
• enzyme regulator activity
• guanyl-nucleotide exchange factor activity
• metallochaperone activity
• mitochondrial RNA polymerase binding
promoter specificity activity
• molecular function regulator
• molecular transducer activity
• morphogen activity
• negative regulation of molecular function
• neurotransmitter receptor regulator activity
• nucleic acid binding transcription factor activity
• nutrient reservoir activity
• positive regulation of molecular function
• protein tag
• receptor regulator activity
• regulation of molecular function
• signal transducer activity
• structural molecule activity
• transcription factor activity, core RNA
polymerase binding
• transcription factor activity, protein binding
• transcription factor activity, transcription factor
binding
• translation regulator activity
• transporter activity
NUMBER OF TERMS: ~10k
http://geneontology.org/
What is Molecular Function in GO?
• Describes “function”…?
GO:0003674
molecular_function
Elemental activities, such as catalysis or
binding, describing the actions of a
gene product at the molecular level. A
given gene product may exhibit one
or more molecular functions.
Motivation
• Is GO’s molecular function ontology really
function, “little” processes or both?
• Documented as a function
• Sometimes looks like a process
• Sometimes treated like a process
• Confusion between the thing that has a function and the function itself
• This can make modelling harder than it need be
A Couple of Observations
• Pragmatically, we commit to GO – it’s the only
show in town and it works
• There are a lot of chemicals around in GO MF
• We are biochemistry….!
• Probably few functions – strip out all the “non-function” stuff and see what’s left
• Then we can look at the ontological nature of GO
MF
• Also, re-create in a more sustainable form
It’s all work in progress
A “tangled” ontology of amino acids
There are several dimensions of
classification here
• The amino acids themselves – a chemical dimension
• The size of the amino acids side chain
• The charge on the side chain
• The polarity of the side chain
• The hydrophobicity of the side chain
• We can normalise these into separate hierarchies then put
them back together again
• Our goal is to put entities into separate trees all formed on
the same basis
• Size only talks about size; amino acid only talks about
chemical composition (based on an alpha-carbon with an
amino and carboxylic acid group); and so on
The dimensions separated
• Amino Acids: Alanine, Arginine, Asparagine, Cysteine, Glutamate, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine
• Charge: Negative, Neutral, Positive
• Size: Tiny, Small, Medium, Large
• Polarity: Polar, Nonpolar
• Hydrophobicity: Hydrophobic, Hydrophilic
The process
• Hand-crafted ontologies with a polyhierarchy
are “tangled”
• Usually axiomatically lean
• We classify along one axis and use
“restrictions” to other modules to capture
other axes
• Then re-build the polyhierarchy using the
axiomatically rich ontology
“Pulling out” dimensions
• Each separate tree must be the same kind of
thing
• We don’t mix continuants, processes,
qualities, etc
• We don’t mix our classification by, for
instance, structure and then charge
• We do that compositionally via defined classes
and automated reasoners
The amino acid pattern
Class: AminoAcid
    SubClassOf:
        hasSize some Size,
        hasPolarity some Polarity,
        hasCharge some Charge,
        hasHydrophobicity some Hydrophobicity
An amino acid
Class: Lysine
    SubClassOf:
        AminoAcid,
        hasSize some Large,
        hasCharge some Positive,
        hasPolarity some Polar,
        hasHydrophobicity some Hydrophilic
Rebuilding the hierarchy
• Class: LargeAminoAcid
– EquivalentTo: AminoAcid and hasSize some Large
• Class: PositiveAminoAcid
– EquivalentTo: AminoAcid and hasCharge some Positive
• Class: LargePositiveAminoAcid
– EquivalentTo: LargeAminoAcid and PositiveAminoAcid
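The rebuilding step can be imitated in a few lines of Python. This is only a toy stand-in for an OWL reasoner, not the Tawny-OWL code used in the work: each amino acid is a record of values along the separated dimensions, each defined class is a set of required values, and membership and subsumption fall out of simple containment checks. The Lysine values come from the slide; the Glycine values and all function names are illustrative assumptions.

```python
# Toy illustration of "normalise, then rebuild": NOT a real OWL reasoner.
# Lysine's values are taken from the slide; Glycine's are assumed for the example.
AMINO_ACIDS = {
    "Lysine": {"size": "Large", "charge": "Positive",
               "polarity": "Polar", "hydrophobicity": "Hydrophilic"},
    "Glycine": {"size": "Tiny", "charge": "Neutral",
                "polarity": "Nonpolar", "hydrophobicity": "Hydrophobic"},
}

# Defined classes, written as the conditions an amino acid must satisfy.
# LargePositiveAminoAcid is expanded into its two underlying conditions.
DEFINED_CLASSES = {
    "LargeAminoAcid": {"size": "Large"},
    "PositiveAminoAcid": {"charge": "Positive"},
    "LargePositiveAminoAcid": {"size": "Large", "charge": "Positive"},
}


def satisfies(features, definition):
    """True if every condition in the definition holds for the features."""
    return all(features.get(dim) == value for dim, value in definition.items())


def classify(amino_acids, defined_classes):
    """Map each amino acid to the defined classes it falls under."""
    return {
        name: sorted(cls for cls, definition in defined_classes.items()
                     if satisfies(features, definition))
        for name, features in amino_acids.items()
    }


def superclasses(defined_classes):
    """Map each defined class to the defined classes whose conditions it includes."""
    return {
        cls: sorted(other for other, other_def in defined_classes.items()
                    if other != cls and other_def.items() <= definition.items())
        for cls, definition in defined_classes.items()
    }
```

Calling `classify(AMINO_ACIDS, DEFINED_CLASSES)` places Lysine under all three defined classes, and `superclasses(DEFINED_CLASSES)` recovers the polyhierarchy edge from LargePositiveAminoAcid to both of its parents without anyone asserting it by hand.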
A “tangled” ontology of amino acids
Other Ontology Topics as Factors in GO MF
[Figure: molecular function shown alongside other ontology topics: chemical, chemical role, reaction, biological process, cellular component, cell, protein, sequence]
• 40-60% of terms mention chemicals
Some GO Terms
[Figure: GO MF surrounded by example term labels]
• glucose import
• cytosolic calcium ion transport
• hydrolase activity
• tyrosine binding
• retroviral strand transfer activity
• electron carrier activity
Binding
• ~2k terms in the binding bit of GO MF
• Remove the chemicals
• Leaves “binding”
• There is a function “to bind”
• There is a process of “binding”
• Linguistically – an infinitive and a
gerund/nominalised verb
More “to bind” Functions?
• “to bind” is the basic function
• Specialise to “to bind covalently”, “to bind via hydrogen”, “to bind electrostatically”
• But these are built compositionally with reference to other ontologies
Chemorepellent - chemoattractant activity
GO:0042056
chemoattractant activity
Providing the environmental signal that
initiates the directed movement of a
motile cell or organism towards a
higher concentration of that signal.
GO:0045499
chemorepellent activity
Providing the environmental signal that
initiates the directed movement of a
motile cell or organism towards a
lower concentration of that signal.
To diffuse
GO realisable entities
RealizableEntity
ToCatalyse
ToBind
ToMark
ToStore
ToDiffuse
ToTransport
ToMaintainIntegrity
ToProtect
ToModulate
ToRegulate
ToTransduce
Angels on the head of a pin
Distinctions with no (practical)
difference
• “Distinction without a difference” – making a
distinction where none exists
• Distinctions may exist, but does one need to
make them?
• Does a distinction make a practical difference
to the use case in hand?
• Make no distinction unless it makes a
difference
• Beware of consistency…
New function hierarchy
• RealizableEntity
  – ToCatalyse
  – ToBind
    • ToMark
  – ToStore
  – ToDiffuse
  – ToTransport
  – ToMaintainIntegrity
  – ToProtect
  – ToModulate
    • ToRegulate
  – ToTransduce
Standard pattern – some and only
• Gene product -> [has realizable entity] -> Realisable entity -> [is realized in] -> Biological process
RO candidate: capable_of = shortcut
• Gene product -> [is capable of] -> Biological process
Some patterns
• hasRealisableEntity some (to_bind and
realisedIn only (binding and hasInput some
chemical))
• Add “playsrole some role” for a chemical role
like drug
• hasRealisableEntity some (to_catalyse and
realisedIn only (catalysis and hasInput some
chemical and hasOutput some chemical))
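Because both patterns share one shape, they lend themselves to being generated rather than typed by hand. The sketch below is a plain Python string template, not Tawny-OWL and not the authors' code; the function name `realisable_pattern` and its parameters are hypothetical, and it only produces Manchester-syntax text for the two patterns shown above.

```python
def realisable_pattern(realisable, process, inputs=(), outputs=()):
    """Build the 'some and only' pattern as Manchester-syntax text.

    realisable: name of the realisable entity class, e.g. "to_bind"
    process:    name of the process class it is realised in, e.g. "binding"
    inputs/outputs: class names used as hasInput / hasOutput fillers
    """
    conditions = [process]
    conditions += [f"hasInput some {cls}" for cls in inputs]
    conditions += [f"hasOutput some {cls}" for cls in outputs]
    process_expr = " and ".join(conditions)
    return (f"hasRealisableEntity some "
            f"({realisable} and realisedIn only ({process_expr}))")


binding = realisable_pattern("to_bind", "binding", inputs=["chemical"])
catalysis = realisable_pattern("to_catalyse", "catalysis",
                               inputs=["chemical"], outputs=["chemical"])
```

Swapping `"chemical"` for a more specific class is then a one-argument change, which is the practical appeal of building the defined classes programmatically.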
Actually doing it
• Programmatically using Tawny-OWL
• Asserted tree of molecular realisables and
molecular processes
• Defined classes for the actual terms
• May have to restrict to OWL EL for practical
reasons
• We shall see…
Strategies for Defined Classes
• Total post co-ordination
• Total pre co-ordination
• Pre co-ordinate those classes that have been
used in annotation
How many GO MF terms are used?
Annotation file: Homo sapiens, canonical accessions from UniProt (goa_human.gaf.gz)
• Total number of GO-UniProt annotations: 354 515 (~354K)
• Unique UniProt IDs: 19 055 (~19K)
• Unique active Molecular Function classes: 3 947 (~4K)
Annotation file: Unfiltered GOA UniProt gene association file (goa_uniprot_all.gaf.gz)
• Total number of GO-UniProt annotations: 294 208 149 (~294M)
• Unique UniProt IDs: 45 968 890 (~46M)
• Unique active Molecular Function classes: 7 521 (~7K)
Unique active Molecular Function classes used more than 5 times: 1 313 (~1K)
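The slides do not say how these counts were produced. The following is a minimal sketch of one way to derive comparable figures, assuming the standard tab-separated GAF 2.x layout (comment lines start with "!", column 2 is the DB object ID, column 5 the GO ID, column 9 the aspect, with "F" for molecular function). The names `count_mf_usage`, `count_mf_usage_in_file` and `min_uses` are illustrative. It does not check whether a term is obsolete, so "active" would need a further filter against the ontology itself.

```python
import gzip
from collections import Counter
from typing import Dict, Iterable


def count_mf_usage(lines: Iterable[str], min_uses: int = 5) -> Dict[str, int]:
    """Summarise annotation counts from GAF-formatted lines."""
    annotations = 0
    products = set()
    mf_counts: Counter = Counter()
    for line in lines:
        # Skip blank lines and GAF header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # malformed row: too few columns to read the aspect
        annotations += 1
        products.add(cols[1])        # column 2: DB object ID
        if cols[8] == "F":           # column 9: aspect, F = molecular function
            mf_counts[cols[4]] += 1  # column 5: GO ID
    return {
        "annotations": annotations,
        "unique_products": len(products),
        "unique_mf_terms": len(mf_counts),
        "mf_terms_used_more_than_min": sum(
            1 for n in mf_counts.values() if n > min_uses
        ),
    }


def count_mf_usage_in_file(path: str, min_uses: int = 5) -> Dict[str, int]:
    """Convenience wrapper for a gzipped GAF file such as goa_human.gaf.gz."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_mf_usage(handle, min_uses=min_uses)
```

Streaming the file line by line matters for the unfiltered file, which holds hundreds of millions of annotations; only the set of IDs and the per-term counter are kept in memory.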
What have we found?
• Very few functions
• … and some look dispositional
• It looks like physics
• Most functions involve binding – makes sense
• We separate realisables and processes
• We live with a bit of “replication”
• With molecular processes, do we need molecular
function?
• We change the upper reaches of GO MF, but…
• Does it make any practical difference?
Formality
• Ontological formality
– Making the right distinctions drives consistent
use of relationships
– Facilitates the kind of analysis we’ve done
– Can also be a barrier to progress
• Representational formality
– Knowing what is being said is useful
– Allows clean interpretation
– Enables useful reasoning
Pragmatic Decisions
• Commit enough to achieve goals
• If re-using take on the commitments of that ontology
– If using OBO commit to OBO
– If what you’re using uses something with which you
disagree – get over it
• Axiom pragmatics
• Don’t represent that which isn’t needed
• Truth and beauty
• A counsel of perfection is a counsel of despair
• I’d make “gene product” explicit

Weitere ähnliche Inhalte

Ähnlich wie The Pragmatics and Formality of Authoring OntologiesOdsl 2016

Biochemistry - Ch3 Amino Acids , Peptides , Protein
Biochemistry - Ch3 Amino Acids , Peptides , ProteinBiochemistry - Ch3 Amino Acids , Peptides , Protein
Biochemistry - Ch3 Amino Acids , Peptides , ProteinAreej Abu Hanieh
 
Crash course of biochemistry
Crash  course of biochemistryCrash  course of biochemistry
Crash course of biochemistryGaurav Kr
 
Enzymes Part~1
Enzymes Part~1Enzymes Part~1
Enzymes Part~1Alok Kumar
 
Enzyme~clinical enzymology
Enzyme~clinical enzymologyEnzyme~clinical enzymology
Enzyme~clinical enzymologyAlok Kumar
 
Enzyme and coenzyme
Enzyme and coenzymeEnzyme and coenzyme
Enzyme and coenzymeHeru Pramono
 
UNIT-3 BACTERIAL ETABOLISM.pptx
UNIT-3 BACTERIAL ETABOLISM.pptxUNIT-3 BACTERIAL ETABOLISM.pptx
UNIT-3 BACTERIAL ETABOLISM.pptxAyushiSharma843565
 
Lehninger_Ch1_Introduction.ppt
Lehninger_Ch1_Introduction.pptLehninger_Ch1_Introduction.ppt
Lehninger_Ch1_Introduction.pptssuser796efb
 
2017-2018محاضرات الانزيمات
 2017-2018محاضرات الانزيمات 2017-2018محاضرات الانزيمات
2017-2018محاضرات الانزيماتMustafa Taha mohammed
 
The Chemistry of Monoclonal Antibodies
The Chemistry of Monoclonal AntibodiesThe Chemistry of Monoclonal Antibodies
The Chemistry of Monoclonal AntibodiesPharmaxo
 
Lecture-1_Introduction.pdf
Lecture-1_Introduction.pdfLecture-1_Introduction.pdf
Lecture-1_Introduction.pdfAhmadMateen10
 
motifs and PPI databases.pptx
motifs and PPI databases.pptxmotifs and PPI databases.pptx
motifs and PPI databases.pptxNighatRbb
 
5. Biochemistry of enzymes edited 2024.pptx
5. Biochemistry of enzymes edited 2024.pptx5. Biochemistry of enzymes edited 2024.pptx
5. Biochemistry of enzymes edited 2024.pptxmohammed959032
 
Ap bio ch 3 Functional Groups & Macromolecules
Ap bio ch 3 Functional Groups & MacromoleculesAp bio ch 3 Functional Groups & Macromolecules
Ap bio ch 3 Functional Groups & Macromoleculeszernwoman
 
Enzyme. defination ,classification and application
Enzyme. defination ,classification and applicationEnzyme. defination ,classification and application
Enzyme. defination ,classification and applicationnileemamodhave1
 
Gr meeting august 14, 2003
Gr meeting august 14, 2003Gr meeting august 14, 2003
Gr meeting august 14, 2003Samares Biswas
 
Protein Function - General Biology 2 Lesson
Protein Function - General Biology 2 LessonProtein Function - General Biology 2 Lesson
Protein Function - General Biology 2 LessonCyrusEsguerra6
 
Enzymes by Dr. Aritri Bir
Enzymes by Dr. Aritri BirEnzymes by Dr. Aritri Bir
Enzymes by Dr. Aritri BirAritriBir
 

Ähnlich wie The Pragmatics and Formality of Authoring OntologiesOdsl 2016 (20)

Biochemistry - Ch3 Amino Acids , Peptides , Protein
Biochemistry - Ch3 Amino Acids , Peptides , ProteinBiochemistry - Ch3 Amino Acids , Peptides , Protein
Biochemistry - Ch3 Amino Acids , Peptides , Protein
 
Ch 3 نفسه
Ch 3 نفسهCh 3 نفسه
Ch 3 نفسه
 
Crash course of biochemistry
Crash  course of biochemistryCrash  course of biochemistry
Crash course of biochemistry
 
Enzymes Part~1
Enzymes Part~1Enzymes Part~1
Enzymes Part~1
 
Enzyme~clinical enzymology
Enzyme~clinical enzymologyEnzyme~clinical enzymology
Enzyme~clinical enzymology
 
Enzyme and coenzyme
Enzyme and coenzymeEnzyme and coenzyme
Enzyme and coenzyme
 
UNIT-3 BACTERIAL ETABOLISM.pptx
UNIT-3 BACTERIAL ETABOLISM.pptxUNIT-3 BACTERIAL ETABOLISM.pptx
UNIT-3 BACTERIAL ETABOLISM.pptx
 
Lehninger_Ch1_Introduction.ppt
Lehninger_Ch1_Introduction.pptLehninger_Ch1_Introduction.ppt
Lehninger_Ch1_Introduction.ppt
 
2017-2018محاضرات الانزيمات
 2017-2018محاضرات الانزيمات 2017-2018محاضرات الانزيمات
2017-2018محاضرات الانزيمات
 
The Chemistry of Monoclonal Antibodies
The Chemistry of Monoclonal AntibodiesThe Chemistry of Monoclonal Antibodies
The Chemistry of Monoclonal Antibodies
 
Lecture-1_Introduction.pdf
Lecture-1_Introduction.pdfLecture-1_Introduction.pdf
Lecture-1_Introduction.pdf
 
motifs and PPI databases.pptx
motifs and PPI databases.pptxmotifs and PPI databases.pptx
motifs and PPI databases.pptx
 
5. Biochemistry of enzymes edited 2024.pptx
5. Biochemistry of enzymes edited 2024.pptx5. Biochemistry of enzymes edited 2024.pptx
5. Biochemistry of enzymes edited 2024.pptx
 
Ap bio ch 3 Functional Groups & Macromolecules
Ap bio ch 3 Functional Groups & MacromoleculesAp bio ch 3 Functional Groups & Macromolecules
Ap bio ch 3 Functional Groups & Macromolecules
 
Bmm480 Enzymology lecture-1
Bmm480 Enzymology lecture-1Bmm480 Enzymology lecture-1
Bmm480 Enzymology lecture-1
 
Enzymes
EnzymesEnzymes
Enzymes
 
Enzyme. defination ,classification and application
Enzyme. defination ,classification and applicationEnzyme. defination ,classification and application
Enzyme. defination ,classification and application
 
Gr meeting august 14, 2003
Gr meeting august 14, 2003Gr meeting august 14, 2003
Gr meeting august 14, 2003
 
Protein Function - General Biology 2 Lesson
Protein Function - General Biology 2 LessonProtein Function - General Biology 2 Lesson
Protein Function - General Biology 2 Lesson
 
Enzymes by Dr. Aritri Bir
Enzymes by Dr. Aritri BirEnzymes by Dr. Aritri Bir
Enzymes by Dr. Aritri Bir
 

Mehr von robertstevens65

Ontologies: Necessary, but not sufficient
Ontologies: Necessary, but not sufficientOntologies: Necessary, but not sufficient
Ontologies: Necessary, but not sufficientrobertstevens65
 
Choosing and Building Knowledge Artefacts
Choosing and Building Knowledge ArtefactsChoosing and Building Knowledge Artefacts
Choosing and Building Knowledge Artefactsrobertstevens65
 
Populous: A tool for Populating OWL Ontologies from Templates
Populous: A tool for Populating OWL Ontologies from TemplatesPopulous: A tool for Populating OWL Ontologies from Templates
Populous: A tool for Populating OWL Ontologies from Templatesrobertstevens65
 
Keeping ontology development Agile
Keeping ontology development AgileKeeping ontology development Agile
Keeping ontology development Agilerobertstevens65
 
Lessons from teaching non-computer scientists OWL and ontologies
Lessons from teaching non-computer scientists OWL and ontologiesLessons from teaching non-computer scientists OWL and ontologies
Lessons from teaching non-computer scientists OWL and ontologiesrobertstevens65
 
Kidney and Urinary Pathways Knowledge Base (part of e-LICO)
Kidney and Urinary Pathways Knowledge Base (part of e-LICO)Kidney and Urinary Pathways Knowledge Base (part of e-LICO)
Kidney and Urinary Pathways Knowledge Base (part of e-LICO)robertstevens65
 
A Rose by Any Other Name is Still a Rose
A Rose by Any Other Name is Still a RoseA Rose by Any Other Name is Still a Rose
A Rose by Any Other Name is Still a Roserobertstevens65
 
Working with big biomedical ontologies
Working with big biomedical ontologiesWorking with big biomedical ontologies
Working with big biomedical ontologiesrobertstevens65
 
The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...
The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...
The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...robertstevens65
 
Ontology learning from text
Ontology learning from textOntology learning from text
Ontology learning from textrobertstevens65
 
Knowledge Management in a Knowledge Based Discipline
Knowledge Management in a Knowledge Based DisciplineKnowledge Management in a Knowledge Based Discipline
Knowledge Management in a Knowledge Based Disciplinerobertstevens65
 
A family History Knowledge Base in OWL 2
A family History Knowledge Base in OWL 2A family History Knowledge Base in OWL 2
A family History Knowledge Base in OWL 2robertstevens65
 
RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4
RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4 RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4
RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4 robertstevens65
 
Communities building ontologies: Tensions and Reality
Communities building ontologies: Tensions and RealityCommunities building ontologies: Tensions and Reality
Communities building ontologies: Tensions and Realityrobertstevens65
 
Issues in Learning an Ontology from Text
Issues in Learning an Ontology from Text Issues in Learning an Ontology from Text
Issues in Learning an Ontology from Text robertstevens65
 
Making Semantics do Some Work
Making Semantics do Some WorkMaking Semantics do Some Work
Making Semantics do Some Workrobertstevens65
 
Can there be such a thing as Ontology Engineering?
Can there be such a thing as Ontology Engineering?Can there be such a thing as Ontology Engineering?
Can there be such a thing as Ontology Engineering?robertstevens65
 
The Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in BiologyThe Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in Biologyrobertstevens65
 

Mehr von robertstevens65 (20)

Ontologies: Necessary, but not sufficient
Ontologies: Necessary, but not sufficientOntologies: Necessary, but not sufficient
Ontologies: Necessary, but not sufficient
 
Choosing and Building Knowledge Artefacts
Choosing and Building Knowledge ArtefactsChoosing and Building Knowledge Artefacts
Choosing and Building Knowledge Artefacts
 
Populous: A tool for Populating OWL Ontologies from Templates
Populous: A tool for Populating OWL Ontologies from TemplatesPopulous: A tool for Populating OWL Ontologies from Templates
Populous: A tool for Populating OWL Ontologies from Templates
 
Keeping ontology development Agile
Keeping ontology development AgileKeeping ontology development Agile
Keeping ontology development Agile
 
Spreadsheets to OWL
Spreadsheets to OWLSpreadsheets to OWL
Spreadsheets to OWL
 
Lessons from teaching non-computer scientists OWL and ontologies
Lessons from teaching non-computer scientists OWL and ontologiesLessons from teaching non-computer scientists OWL and ontologies
Lessons from teaching non-computer scientists OWL and ontologies
 
Kidney and Urinary Pathways Knowledge Base (part of e-LICO)
Kidney and Urinary Pathways Knowledge Base (part of e-LICO)Kidney and Urinary Pathways Knowledge Base (part of e-LICO)
Kidney and Urinary Pathways Knowledge Base (part of e-LICO)
 
A Rose by Any Other Name is Still a Rose
A Rose by Any Other Name is Still a RoseA Rose by Any Other Name is Still a Rose
A Rose by Any Other Name is Still a Rose
 
Working with big biomedical ontologies
Working with big biomedical ontologiesWorking with big biomedical ontologies
Working with big biomedical ontologies
 
The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...
The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...
The Big Picture: The Industrial Revolutiona talk in berlin, 2008, about indus...
 
Ontology learning from text
Ontology learning from textOntology learning from text
Ontology learning from text
 
Knowledge Management in a Knowledge Based Discipline
Knowledge Management in a Knowledge Based DisciplineKnowledge Management in a Knowledge Based Discipline
Knowledge Management in a Knowledge Based Discipline
 
Ontology at Manchester
Ontology at ManchesterOntology at Manchester
Ontology at Manchester
 
A family History Knowledge Base in OWL 2
A family History Knowledge Base in OWL 2A family History Knowledge Base in OWL 2
A family History Knowledge Base in OWL 2
 
RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4
RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4 RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4
RIO: The Regularities Inspector for Ontologies Plugin for Protégé 4
 
Communities building ontologies: Tensions and Reality
Communities building ontologies: Tensions and RealityCommunities building ontologies: Tensions and Reality
Communities building ontologies: Tensions and Reality
 
Issues in Learning an Ontology from Text
Issues in Learning an Ontology from Text Issues in Learning an Ontology from Text
Issues in Learning an Ontology from Text
 
Making Semantics do Some Work
Making Semantics do Some WorkMaking Semantics do Some Work
Making Semantics do Some Work
 
Can there be such a thing as Ontology Engineering?
Can there be such a thing as Ontology Engineering?Can there be such a thing as Ontology Engineering?
Can there be such a thing as Ontology Engineering?
 
The Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in BiologyThe Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in Biology
 

Kürzlich hochgeladen

Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 

Kürzlich hochgeladen (20)

Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 

The Pragmatics and Formality of Authoring OntologiesOdsl 2016

  • 1. Formality and Pragmatics in Authoring Ontologies Robert Stevens ODLS 2016 School of Computer Science The University of Manchester Manchester United Kingdom M13 9PL Robert.stevens@manchester.ac.uk
  • 2. Acknowledgements • On-going work with Phil Lord on normalising the Gene Ontology • The Gene Ontology folk for making GO • Nico Matentzoglu for my slides • Mercedes Casteleiro for numbers
  • 3. Formality and Pragmatics • Formality: Acting strictly according to procedure or rules – Ontological formality – Representational formality • Pragmatics: Behaviour driven by practical consequences rather than dogma • There’s a tension between the two
  • 4. Gene Ontology Molecular Function • D-alanyl carrier activity • acetylcholine receptor regulator activity • antioxidant activity • binding • calcium channel regulator activity • catalytic activity • channel regulator activity • chemoattractant activity • chemorepellent activity • core DNA-dependent RNA polymerase binding promoter specificity activity • electron carrier activity • enzyme regulator activity • guanyl-nucleotide exchange factor activity • metallochaperone activity • mitochondrial RNA polymerase binding promoter specificity activity • molecular function regulator • molecular transducer activity • morphogen activity • negative regulation of molecular function • neurotransmitter receptor regulator activity • nucleic acid binding transcription factor activity • nutrient reservoir activity • positive regulation of molecular function • protein tag • receptor regulator activity • regulation of molecular function • signal transducer activity • structural molecule activity • transcription factor activity, core RNA polymerase binding • transcription factor activity, protein binding • transcription factor activity, transcription factor binding • translation regulator activity • transporter activity NUMBER OF TERMS: ~10k http://geneontology.org/
  • 5.
  • 6. What is Molecular Function in GO? • Describes “function”…? GO:0003674 molecular_function Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.
  • 7. Motivation • Is GO’s molecular function ontology really function, “little” processes or both? • Documented as a function • Sometimes looks like a process • Sometimes treated like a process • Confusion of thing with a function and the function • This can make modelling harder than it need be
  • 8. A Couple of Observations • Pragmatically, we commit to GO – it’s the only show in town and it works • There’s a lot of chemicals around in GO MF • We are biochemistry….! • Probably few functions – strip out all the “non-function” stuff and see what’s left • Then we can look at the ontological nature of GO MF • Also, re-create in a more sustainable form
  • 9. It’s all work in progress
  • 10. A “tangled” ontology of amino acids
  • 11. There are several dimensions of classification here • The amino acids themselves – a chemical dimension • The size of the amino acid’s side chain • The charge on the side chain • The polarity of the side chain • The hydrophobicity of the side chain • We can normalise these into separate hierarchies then put them back together again • Our goal is to put entities into separate trees all formed on the same basis • Size only talks about size; amino acid only talks about chemical composition (based on an alpha-carbon with an amino and carboxylic acid group); and so on
  • 13. The process • Hand-crafted ontologies with a polyhierarchy are “tangled” • Usually axiomatically lean • We classify along one axis and use “restrictions” to other modules to capture other axes • Then re-build the polyhierarchy using the axiomatically rich ontology
  • 14. “Pulling out” dimensions • Each separate tree must be the same kind of thing • We don’t mix continuants, processes, qualities, etc. • We don’t mix our classification by, for instance, structure and then charge • We do that compositionally via defined classes and automated reasoners
  • 15. The amino acid pattern • Class: AminoAcid SubClassOf: hasSize some Size, hasPolarity some Polarity, hasCharge some Charge, hasHydrophobicity some Hydrophobicity
  • 16. An amino acid • Class: Lysine SubClassOf: AminoAcid, hasSize some Large, hasCharge some Positive, hasPolarity some Polar, hasHydrophobicity some Hydrophilic
  • 17. Rebuilding the hierarchy • Class: LargeAminoAcid – EquivalentTo: AminoAcid and hasSize some Large • Class: PositiveAminoAcid – EquivalentTo: AminoAcid and hasCharge some Positive • Class: LargePositiveAminoAcid – EquivalentTo: LargeAminoAcid and PositiveAminoAcid
  • 18. A “tangled” ontology of amino acids
  • 19. Other Ontology Topics as Factors in GO MF • [Diagram: molecular function surrounded by chemical, chemical role, reaction, biological process, cellular component, cell, protein, sequence] • 40-60% of terms mention chemicals
  • 20. Some GO Terms • [Diagram centred on GO MF] • glucose import • cytosolic calcium ion transport • hydrolase activity • tyrosine binding • retroviral strand transfer activity • electron carrier activity
  • 21. Binding • ~2k terms in the binding bit of GO MF • Remove the chemicals • Leaves “binding” • There is a function “to bind” • There is a process of “binding” • Linguistically – an infinitive and a gerund/nominalised verb
  • 22. More “to bind” Functions? • “to bind” is the basic function • Specialise to “to bind covalently”, “to bind via hydrogen”, “to bind electrostatically”, but these are built compositionally with reference to other ontologies
  • 23. Chemorepellent - chemoattractant activity • GO:0042056 chemoattractant activity: Providing the environmental signal that initiates the directed movement of a motile cell or organism towards a higher concentration of that signal. • GO:0045499 chemorepellent activity: Providing the environmental signal that initiates the directed movement of a motile cell or organism towards a lower concentration of that signal. • Both link down to: To diffuse
  • 25. Angels on the head of a pin
  • 26. Distinctions with no (practical) difference • “Distinction without a difference” – making a distinction where none exists • Distinctions may exist, but does one need to make them? • Does a distinction make a practical difference to the use case in hand? • Make no distinction unless it makes a difference • Beware of consistency…
  • 27. New function hierarchy • RealizableEntity – ToCatalyse – ToBind • ToMark – ToStore – ToDiffuse – ToTransport – ToMaintainIntegrity – ToProtect – ToModulate • ToRegulate – ToTransduce
  • 28. Standard pattern – some and only • [Diagram: Gene product, linked by “has realizable entity” to Realisable entity, linked by “is realized in” to Biological process]
  • 29. RO candidate: capable_of = shortcut • [Diagram: Gene product, linked by “is capable of” to Biological process]
  • 30. Some patterns • hasRealisableEntity some (to_bind and realisedIn only (binding and hasInput some chemical)) • Add “playsrole some role” for a chemical role like drug • hasRealisableEntity some (to_catalyse and realisedIn only (catalysis and hasInput some chemical and hasOutput some chemical))
  • 31. Actually doing it • Programmatically using Tawny-OWL • Asserted tree of molecular realisables and molecular processes • Defined classes for the actual terms • May have to restrict to OWL EL for practical reasons • We shall see…
  • 32. Strategies for Defined Classes • Total post co-ordination • Total pre co-ordination • Pre co-ordinate those classes that have been used in annotation
  • 33. How many GO MF terms are used? • Annotation files compared: Homo sapiens, canonical accessions from UniProt (goa_human.gaf.gz), and the unfiltered GOA UniProt gene association file (goa_uniprot_all.gaf.gz) • Total number of GO-UniProt annotations: 354 515 (~354K) for human; 294 208 149 (~294M) unfiltered • Unique UniProt IDs: 19 055 (~19K) for human; 45 968 890 (~46M) unfiltered • Unique active Molecular Function classes: 3 947 (~4K) for human; 7 521 (~7K) unfiltered • Unique active Molecular Function classes used more than 5 times: 1 313 (~1K)
  • 34. What have we found? • Very few functions • … and some look dispositional • It looks like physics • Most functions involve binding – makes sense • We separate realisables and processes • We live with a bit of “replication” • With molecular processes, do we need molecular function? • We change the upper reaches of GO MF, but… • Does it make any practical difference?
  • 35. Formality • Ontological formality • Making the right distinctions drives consistent use of relationships • Facilitates the kind of analysis we’ve done • Can also be a barrier to progress • Representational formality • Knowing what is being said is useful • Allows clean interpretation • Enables useful reasoning
  • 36. Pragmatic Decisions • Commit enough to achieve goals • If re-using take on the commitments of that ontology – If using OBO commit to OBO – If what you’re using uses something with which you disagree – get over it • Axiom pragmatics • Don’t represent that which isn’t needed • Truth and beauty • A counsel of perfection is a counsel of despair • I’d make “gene product” explicit
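
As a rough illustration of the normalisation pattern on slides 15 to 17, the Python sketch below mimics what an automated reasoner does with the defined classes. It is a set-based stand-in under stated assumptions, not OWL semantics and not the authors' Tawny-OWL code: the names `AMINO_ACIDS`, `DEFINED_CLASSES`, `classify` and `subsumptions` are hypothetical, and only the Lysine entry comes from the slides; the other amino acid values are illustrative.

```python
# Asserted, untangled descriptions: one value per axis for each amino acid.
# Lysine follows slide 16; the other entries are illustrative assumptions.
AMINO_ACIDS = {
    "Lysine": {"hasSize": "Large", "hasCharge": "Positive",
               "hasPolarity": "Polar", "hasHydrophobicity": "Hydrophilic"},
    "Arginine": {"hasSize": "Large", "hasCharge": "Positive",
                 "hasPolarity": "Polar", "hasHydrophobicity": "Hydrophilic"},
    "Phenylalanine": {"hasSize": "Large", "hasCharge": "Neutral",
                      "hasPolarity": "NonPolar", "hasHydrophobicity": "Hydrophobic"},
    "Glycine": {"hasSize": "Tiny", "hasCharge": "Neutral",
                "hasPolarity": "NonPolar", "hasHydrophobicity": "Hydrophobic"},
}

# Defined classes (slide 17): each is the set of restrictions a member must meet.
DEFINED_CLASSES = {
    "LargeAminoAcid": frozenset({("hasSize", "Large")}),
    "PositiveAminoAcid": frozenset({("hasCharge", "Positive")}),
}
# An intersection of two defined classes is the union of their restrictions.
DEFINED_CLASSES["LargePositiveAminoAcid"] = (
    DEFINED_CLASSES["LargeAminoAcid"] | DEFINED_CLASSES["PositiveAminoAcid"]
)


def classify(asserted, defined):
    """Map each defined class to the asserted classes that satisfy it."""
    members = {}
    for name, required in defined.items():
        members[name] = sorted(
            acid for acid, axes in asserted.items()
            if required <= set(axes.items())
        )
    return members


def subsumptions(defined):
    """List (sub, super) pairs: sub's restrictions strictly include super's."""
    return sorted(
        (sub, sup)
        for sub, sub_req in defined.items()
        for sup, sup_req in defined.items()
        if sup_req < sub_req
    )


if __name__ == "__main__":
    for name, found in classify(AMINO_ACIDS, DEFINED_CLASSES).items():
        print(name, found)
    for sub, sup in subsumptions(DEFINED_CLASSES):
        print(sub, "is a kind of", sup)
```

The point of the design is that nothing in `AMINO_ACIDS` mentions the polyhierarchy: each amino acid is described once per axis, and the tangle is rebuilt from the definitions.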

Editor's notes

  1. Informal definitions of the words formality and pragmatics. I build ontology-based applications and pragmatics come into play. I like formality (up to a point), but I’d prefer an application that does something over a formal ontology that is not usable – both is great, but I sacrifice formality first.
  2. 1) # Slide with molecular function title # add textbox with number of terms # URL: http://geneontology.org/ • D-alanyl carrier activity • acetylcholine receptor regulator activity • antioxidant activity • binding • calcium channel regulator activity • catalytic activity • channel regulator activity • chemoattractant activity • chemorepellent activity • core DNA-dependent RNA polymerase binding promoter specificity activity • electron carrier activity • enzyme regulator activity • guanyl-nucleotide exchange factor activity • metallochaperone activity • mitochondrial RNA polymerase binding promoter specificity activity • molecular function regulator • molecular transducer activity • morphogen activity • negative regulation of molecular function • neurotransmitter receptor regulator activity • nucleic acid binding transcription factor activity • nutrient reservoir activity • positive regulation of molecular function • protein tag • receptor regulator activity • regulation of molecular function • signal transducer activity • structural molecule activity • transcription factor activity, core RNA polymerase I binding • transcription factor activity, core RNA polymerase II binding • transcription factor activity, core RNA polymerase III binding • transcription factor activity, core RNA polymerase binding • transcription factor activity, protein binding • transcription factor activity, transcription factor binding • translation regulator activity • transporter activity
  3. title: GO Molecular function 1. molecular_function   (GO:0003674) "Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions." - 1. above in a box at the top of the slide with a text box below into which I can put bullets. the first bullet is * Describes "function"....?
  4. First slide is a tangled hierarchy (title "Normalisation 1"), with "Vehicle" at the top. The leaves are: fast red sports car, fast green sports car, red lorry, slow yellow lorry, green van, fast red motor cycle, black estate car, green saloon car, red estate car. Then some intermediate "defined classes" such as: red vehicle, green vehicle, fast red car, red car, and any you can think of, and make it tangled.
  5. Second slide (title "Normalisation 2"): separate out a set of hierarchies: Vehicle, colour, speed, style; and, if you can fit it on, an axiom pattern of Class: Vehicle SubClassOf: hasColour some Colour, hasStyle some Style, hasSpeed some Speed
  6. Normalisation; a paper from Alan Rector (2003)
  7. This pulling out of non-function aspects of GO MF is not complete. Most aspects have OBO support, but not electron and energy.
  8. title: Chemorepellent - chemoattractant activity. Below, 1 and 2 are some kind of box with the GO term and ID as some form of title, with the definition below. This links down to a blob containing 3. 1. chemoattractant activity (GO:0042056) Providing the environmental signal that initiates the directed movement of a motile cell or organism towards a higher concentration of that signal. 2. chemorepellent activity (GO:0045499) Providing the environmental signal that initiates the directed movement of a motile cell or organism towards a lower concentration of that signal. 3. Both linking down to a blob containing "To diffuse"
  9. RealizableEntity. Some of these functions look dispositional: to store, to diffuse and to structurally maintain. Lots of these “functions” also imply binding. This is not a surprise, as some binding must happen for anything to happen.
     (as-subclasses
      RealizableEntity
      (defclass ToCatalyse :comment "To reduce the activation energy of a reaction, enabling it to go faster.")
      (defclass ToBind :comment "To interact tightly with another entity, longer than transiently, such that separating the entity requires significant energy. ToBind functions are often symmetric; if A has a function ToBind B, then the converse is also true.")
      (defclass ToMark :comment "To bind between this entity X, and another entity Y, so that a third entity Z can also be bound, and thereby interact with Y." :super ToBind))
     (defclass ToStore :comment "To contain a substance for later use.")
     (defclass ToDiffuse :comment "To spread outward from a single point as a result of Brownian motion.")
     (defclass ToTransport :comment "To enable the movement of an entity in a directed manner.")
     (defclass ToMaintainIntegrity :comment "To keep the same structure, shape or organisation despite physical forces, either in compression or in extension.")
     (defclass ToProtect :comment "To prevent an event occurring to this or another entity.")
     (defclass ToModulate :comment "To alter the strength or quantity of some other realisable entity.")
     (defclass ToRegulate :comment "To modulate in a directed manner, as part of a feedback loop." :super ToModulate)
     (defclass ToTransduce :comment "To change energy from one form to another.")
  10. Talk about Mungall et al’s normalisation of GO. Partial; not down to the bare functions. Interesting point around ribose sugars.
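
The usage figures on slide 33 could be approximated with a short script along the following lines. This is a minimal sketch, not the method used to produce the slide's numbers: it assumes the standard tab-separated GAF layout (GO ID in column 5, aspect in column 9, with "F" marking Molecular Function, and "!" starting comment lines) and it assumes that "active" simply means "used in at least one annotation". The function names and the sample rows, including the accession P00000, are hypothetical.

```python
from collections import Counter


def count_mf_usage(lines):
    """Count annotations per GO ID, keeping only Molecular Function rows."""
    usage = Counter()
    for line in lines:
        if line.startswith("!"):  # GAF header and comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 5 (index 4) is the GO ID; column 9 (index 8) is the aspect.
        if len(cols) > 8 and cols[8] == "F":
            usage[cols[4]] += 1
    return usage


def summarise(usage, threshold=5):
    """Report how many MF classes are used at all, and more than `threshold` times."""
    return {
        "active": len(usage),
        "active_over_threshold": sum(1 for n in usage.values() if n > threshold),
    }


def _row(go_id, aspect):
    """Build one made-up 17-column GAF row for the demonstration below."""
    cols = ["UniProtKB", "P00000", "SYM", "", go_id, "PMID:0", "IEA", "", aspect]
    return "\t".join(cols + [""] * 8) + "\n"


# Tiny in-memory sample standing in for a real annotation file.
SAMPLE = (
    ["!gaf-version: 2.1\n"]
    + [_row("GO:0005488", "F")] * 6   # binding, used six times
    + [_row("GO:0003824", "F")] * 2   # catalytic activity, used twice
    + [_row("GO:0008150", "P")]       # a biological process row, ignored
)

if __name__ == "__main__":
    print(summarise(count_mf_usage(SAMPLE)))
```

For a real file, the same functions could be fed from `gzip.open("goa_human.gaf.gz", "rt")`, since they only need an iterable of text lines. Counting this way also suggests which classes to pre-coordinate under the third strategy on slide 32.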