A 5-minute presentation at the University of Edinburgh for the UK Ontology Workshop, 2013-04-11. The animals demonstrate that ontologies can be simple and lament the lack of good ontologies in most of physical science, especially computational chemistry. Blog at http://blogs.ch.cam.ac.uk/pmr
Ontologies in Physical Science
1. Ontologies in Physical Science
Peter Murray-Rust,
University of Cambridge
& Open Knowledge Foundation
Onto Workshop, ed.ac.uk 2013-04-11
An #animalgarden production
2. PMR and friends want us to help build a computational chemistry ontology
Is it an important problem?
$1,000,000,000/yr for compchem
They need OWL
Problem: How to build ontologies when people are uninterested or antagonistic, even though we have the technology
3. Perhaps the chemists could use OWL-DL
Chemists don’t use ontologies
Top-down schemas like AniML haven’t (yet) taken off
4. Are there any ontologies in physical science that work?
Crystallographers build CIF dictionaries
The IUCr, right? Tell us about CIF
IUCr: International Union of Crystallography
5. CIF Core defines 500 common concepts
Like the wavelength of the radiation used
Or the volume of the crystal cell
CIF: http://www.iucr.org/cif
6. An example
Core dictionary (coreCIF) version 2.4.3
ID: _diffrn_ambient_temperature
For humans: Definition: The mean temperature in kelvins at which the intensities were measured.
For machines (constraint + type): Range: 0.0 -> infinity Type: numb
http://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Idiffrn_ambient_temperature.html
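The split between a definition for humans and a constraint plus type for machines can be sketched in a few lines of Python. This is only an illustration of the idea, not IUCr software: the `DictionaryEntry` class and `is_valid` function are hypothetical names, the range is assumed to be inclusive, and CIF-specific number syntax is not handled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One dictionary item: an ID, a definition for humans, constraints for machines."""
    name: str
    definition: str
    type_code: str   # "numb" on the slide
    minimum: float   # assumed inclusive in this sketch
    maximum: float


# Values copied from the slide; the class itself is hypothetical.
AMBIENT_TEMPERATURE = DictionaryEntry(
    name="_diffrn_ambient_temperature",
    definition="The mean temperature in kelvins at which the intensities were measured.",
    type_code="numb",
    minimum=0.0,
    maximum=math.inf,
)


def is_valid(entry: DictionaryEntry, raw_value: str) -> bool:
    """Check the type first, then the range."""
    if entry.type_code != "numb":
        return True  # other types are not covered by this sketch
    try:
        value = float(raw_value)
    except ValueError:
        return False
    if math.isnan(value):
        return False
    return entry.minimum <= value <= entry.maximum


print(is_valid(AMBIENT_TEMPERATURE, "293.0"))
print(is_valid(AMBIENT_TEMPERATURE, "-5"))
print(is_valid(AMBIENT_TEMPERATURE, "cold"))
```

The human-readable definition travels with the entry but plays no part in the check, which is the point the slide makes.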
7. Definition: The mean temperature in kelvins at which the intensities were measured.
So everyone converts temperatures to use K?
Yes! Today I swam at 273K
But chemists want to use all sorts of different units
We MUST have a units ontology
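As a rough sketch of what a units dictionary buys you: if every temperature unit carries a conversion to kelvin, data recorded in any of them can be compared. The unit identifiers below (`units:kelvin` and so on) are invented for illustration and do not come from any published units ontology.

```python
# Hypothetical unit identifiers mapped to (multiplier, offset),
# chosen so that kelvin = value * multiplier + offset.
TO_KELVIN = {
    "units:kelvin": (1.0, 0.0),
    "units:celsius": (1.0, 273.15),
    "units:fahrenheit": (5.0 / 9.0, 459.67 * 5.0 / 9.0),
}


def to_kelvin(value: float, unit: str) -> float:
    """Convert a temperature in a known unit to kelvin."""
    if unit not in TO_KELVIN:
        raise KeyError(f"unit not in dictionary: {unit}")
    multiplier, offset = TO_KELVIN[unit]
    return value * multiplier + offset


print(to_kelvin(273.0, "units:kelvin"))
print(to_kelvin(0.0, "units:celsius"))
print(to_kelvin(32.0, "units:fahrenheit"))
```

An unknown unit raises an error instead of being passed through silently, so a missing dictionary entry is noticed.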
8. OWL? Is CIF a proper ontology? It’s not RDF…
…but we’ve global URIs, like cif:_diffrn_ambient_temperature
Because IUCr controls the namespace prefix: cif=http://www.iucr.org/cif
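A minimal sketch of why a controlled namespace makes the name global: the prefix is only a local shorthand, and identity comes from the pair of namespace and local name, as with XML qualified names. The `resolve` function is a hypothetical helper, and the sketch does not assume how the two parts are joined into a single URI, since the slide does not say.

```python
from typing import Dict, Tuple

# Namespace binding as given on the slide.
PREFIXES: Dict[str, str] = {"cif": "http://www.iucr.org/cif"}


def resolve(prefixed_name: str, prefixes: Dict[str, str]) -> Tuple[str, str]:
    """Split 'prefix:local' and return (namespace, local name)."""
    prefix, separator, local = prefixed_name.partition(":")
    if not separator or not local:
        raise ValueError(f"not a prefixed name: {prefixed_name!r}")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix], local


# Another document may bind a different prefix to the same namespace;
# the resolved identity is unchanged.
OTHER_PREFIXES = {"iucr": "http://www.iucr.org/cif"}

a = resolve("cif:_diffrn_ambient_temperature", PREFIXES)
b = resolve("iucr:_diffrn_ambient_temperature", OTHER_PREFIXES)
print(a == b)
```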
9. CIF had 20 years of community involvement through IUCr
But most top-down chemistry projects don’t work
So we’ll do this bottom-up.
10. Every compchem program uses basically the same scientific concepts
We think each should build its own dictionary so we understand the output
Won’t that just be a mess?
No. It’s the first step to interoperability.
11. The programs will use CML* for chemical output
Hyperchem builds ITS dictionary
NWChem builds ITS dictionary
Each annotates their own program output
CML*: Chemical Markup Language PMR/Rzepa http://www.xml-cml.org
12. Alpha-electrons: Hyperchem uses hchem:e_alpha
NWChem has nwchem:_alpha_elec
We agree they are the same so create compchem:alphae in a communal cml:compchem dictionary that everyone uses
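The bottom-up mapping can be sketched as a simple lookup from each program's own term to the agreed communal one. The three term names are taken from the slide; the `SAME_AS` table, the helper functions, and the numeric values are illustrative assumptions only.

```python
from typing import Dict

# Program-specific terms mapped to the communal term they were agreed to match.
SAME_AS: Dict[str, str] = {
    "hchem:e_alpha": "compchem:alphae",
    "nwchem:_alpha_elec": "compchem:alphae",
}


def to_communal(term: str) -> str:
    """Return the communal term, or the term itself if no mapping is agreed yet."""
    return SAME_AS.get(term, term)


def normalise(record: Dict[str, object]) -> Dict[str, object]:
    """Rewrite the keys of one program's output using the communal dictionary."""
    return {to_communal(term): value for term, value in record.items()}


# Hypothetical outputs from two programs (values are made up).
hyperchem_output = {"hchem:e_alpha": 5}
nwchem_output = {"nwchem:_alpha_elec": 5}

print(normalise(hyperchem_output))
print(normalise(nwchem_output))
```

Unmapped terms pass through unchanged, so each program's dictionary stays usable before the community has agreed on every concept.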
13. What if the data structure or concepts don’t map?
CML provides conventions so each group can define their data structure
Data can then be machine validated against each convention!
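Machine validation against a convention might look roughly like the sketch below. This is not CML's actual convention mechanism: here a "convention" is simplified to a table of required terms and expected Python types, and `example:total_energy` is a made-up term used only for illustration.

```python
from typing import Dict, List

# A hypothetical convention: required term -> expected Python type.
EXAMPLE_CONVENTION: Dict[str, type] = {
    "compchem:alphae": int,
    "example:total_energy": float,
}


def validate(data: Dict[str, object], convention: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the data conforms."""
    problems = []
    for term, expected in convention.items():
        if term not in data:
            problems.append(f"missing {term}")
        elif not isinstance(data[term], expected):
            found = type(data[term]).__name__
            problems.append(f"{term}: expected {expected.__name__}, got {found}")
    return problems


good = {"compchem:alphae": 5, "example:total_energy": -76.4}
bad = {"compchem:alphae": "five"}

print(validate(good, EXAMPLE_CONVENTION))
print(validate(bad, EXAMPLE_CONVENTION))
```

Each group could publish its own table, and the same checker would then run against any of them.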
14. But there are over 20 program codes.
We’ve prototyped with many before. They’ll be encouraged
GULP, DPOLY, CASTEP, SIESTA, MOPAC…
I think it’s going to work. BUT TTT*
TTT: Things Take Time (Piet Hein)
15. Will it work? It depends on people
National labs CSIRO/AU and PNNL/US are committed
And we have companies like Hyperchem and Kitware
I wish we had some publishers
16. We’ll need tools
We’ve got FoX* for FORTRAN output
JUMBOTemplates to parse logfiles
RDF for navigating dictionaries
FoX*: XML/FORTRAN Toby White, Andrew Walker
17. Benefits of semantic dictionaries:
• FORTRAN logfile can be made semantic
• High degree of interoperability in chemistry
• Semantic publication (HTML5, CML, MathML)
• Interoperates with mainstream Web
• Easily scalable to other phys sci.
Problems:
• Closed code/minds is short-term market advantage
• Non-trivial commitment (updates, code revision)
• Getting top-down approval (e.g. IUPAC)
22. Perhaps the chemists could use OWL-DL
Chemists don’t use ANY ontologies
Top-down schemas like AniML haven’t (yet) taken off