This document discusses chemical classification for the Semantic Web. It explains that classification using ontologies enhances the potential of the chemical Semantic Web by conveying the type of data through hierarchical organization with logical definitions. It provides examples of how ontologies like ChEBI can classify chemicals like caffeine into subclasses and shows how individual data can be linked to these classifications through rdf:type relationships. The goal is to put scientific knowledge about what the data represents directly into the data through classification.
Chemical classification for the Semantic Web
1. Janna Hastings, EBI Cheminformatics and Metabolism
Chemical classification
for the Semantic Web
ACS Skolnik Symposium, Philadelphia,
21 August 2012
EBI is an Outstation of the European Molecular Biology Laboratory.
2. Classification conveys the type for data
The Semantic Web makes data of all types
available, open and interlinked
Classification using OWL ontologies
dramatically enhances the potential of the
chemical Semantic Web
3. Why classify for the Semantic Web?
RDF “triples”:
?subject ?relationship ?object
rdf:type
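A minimal sketch of the triple pattern above, in plain Python with no RDF library. The subject URI under example.org is a hypothetical placeholder, and the ChEBI class URI used for caffeine (CHEBI:27732) is an assumption that should be checked against ChEBI before use.

```python
# rdf:type is the standard RDF predicate linking an individual to its class.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URI for one record in your own dataset.
SAMPLE = "http://example.org/mydata/sample42"

# Assumed ChEBI class URI for caffeine; verify the identifier before relying on it.
CAFFEINE = "http://purl.obolibrary.org/obo/CHEBI_27732"

# A triple is simply (subject, relationship, object).
triple = (SAMPLE, RDF_TYPE, CAFFEINE)


def to_ntriples(t):
    """Serialise one URI-only triple as a line of N-Triples."""
    subject, relationship, obj = t
    return f"<{subject}> <{relationship}> <{obj}> ."


line = to_ntriples(triple)
print(line)
```

The same three-part shape carries any statement; using rdf:type as the relationship is what attaches a classification to a piece of data.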
6. Molecules are small
They are three-dimensional
Their structures can vary according to their environment
We say they have the same type
when they share important properties
All caffeine molecules have type caffeine
7. There are many different ways to
represent molecules
InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
SMILES=Cn1cnc2n(C)c(=O)n(C)c(=O)c12
Name=caffeine
Name=1,3,7-trimethyl-3,7-dihydro-1H-purine-2,6-dione
Identifier=KEGG COMPOUND:C07481
None of these are (themselves) molecules
They describe and approximate
8. ?subject ?relationship ?object
Science aims to discover general rules
about the things that the data are about
Classification puts the scientific
knowledge into the data
9. RDF is a technology for data representation,
OWL is a technology for classification
Hierarchical organisation (root to leaves)
Synonyms
Cross-references
Logical definitions
Can be re-used across data sources
The Web Ontology Language (OWL)
Hastings et al. Journal of Cheminformatics 2012, 4:8. doi:10.1186/1758-2946-4-8
10. ChEBI
Chemical entity
  Chemical substance
  Molecular entity
    inorganic molecular entity
      sodium chloride
    organic molecular entity
      aldehyde
        pyridoxal (vitamin B6)
      carboxylic acid
        acetylsalicylic acid (aspirin)
      organophosphorus compound
        chlorfenvinfos
  Group
    hydroxy group
Hastings et al. Journal of Cheminformatics 2012, 4:8. doi:10.1186/1758-2946-4-8
11. rdfs:subClassOf
rdf:type
Your data, your favourite identifier
InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
SMILES=Cn1cnc2n(C)c(=O)n(C)c(=O)c12
Name=caffeine
Name=1,3,7-trimethyl-3,7-dihydro-1H-purine-2,6-dione
Identifier=KEGG COMPOUND:C07481
…
…
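How a single rdf:type link combines with a class hierarchy can be sketched in plain Python, assuming the standard RDFS predicate rdfs:subClassOf for the hierarchy. The record name mydata:sample42 is hypothetical, and the hierarchy is a shortened illustration that places caffeine directly under organic molecular entity; the real ChEBI path has more intermediate classes.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = [
    # Illustrative class hierarchy (shortened; not copied from ChEBI).
    ("caffeine", SUBCLASS_OF, "organic molecular entity"),
    ("organic molecular entity", SUBCLASS_OF, "molecular entity"),
    ("molecular entity", SUBCLASS_OF, "chemical entity"),
    # Your data: a hypothetical record classified as caffeine.
    ("mydata:sample42", RDF_TYPE, "caffeine"),
]


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.append(o)
                frontier.append(o)
    return found


def inferred_types(individual, triples):
    """Return the asserted types of an individual plus all their superclasses."""
    direct = [o for s, p, o in triples if s == individual and p == RDF_TYPE]
    result = []
    for cls in direct:
        for c in [cls] + superclasses(cls, triples):
            if c not in result:
                result.append(c)
    return result


types = inferred_types("mydata:sample42", triples)
print(types)
```

One asserted rdf:type statement is enough for the record to be found by a query at any level of the hierarchy, which is the practical payoff of classifying against a shared ontology rather than against a local label.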
14. Thanks
Christoph Steinbeck
Marcus Ennis, Gareth Owen, Steve Turner, Adriano Dekker,
Venkatesh Muthukrishnan, ChEBI users
Leonid Chepelev, Michel Dumontier, Colin Batchelor, Evan
Bolton, Nico Adams, Egon Willighagen, Despoina Magka,
Robert Stevens, Andrew Dalke
Funding: BBSRC, EU
Questions?