“Shopping for data should be as easy as 
shopping for shoes!” 
Dr. Carole Goble 
Professor, Dept. of Computer Science 
University of Manchester
“A little bit of semantics goes a long way” 
Dr. James Hendler 
Artificial Intelligence Researcher 
Rensselaer Polytechnic Institute 
One of the originators of the Semantic Web
…but a lot of semantics goes a long, long way! 
Mark Wilkinson 
Isaac Peral Distinguished Researcher 
Director, Fundación BBVA Chair in Biological Informatics 
Center for Plant Biotechnology and Genomics 
Technical University of Madrid
Making the Web a 
biomedical research platform 
from hypothesis through to publication
Publication 
Discourse 
Interpretation 
Hypothesis 
Experiment
Motivation: 
3 intersecting trends in the Life Sciences 
that are now, or soon will be, 
extremely problematic
TREND #1 
NON-REPRODUCIBLE SCIENCE & 
THE FAILURE OF PEER REVIEW
Trend #1 
Multiple recent surveys of high-throughput biology 
reveal that upwards of 50% of published studies 
are not reproducible 
- Baggerly, 2009 
- Ioannidis, 2009
Trend #1 
Similar (if not worse!) in clinical studies 
- Begley & Ellis, Nature, 2012 
- Booth, Forbes, 2012 
- Huang & Gottardo, Briefings in Bioinformatics, 2012
Trend #1 
“the most common errors are simple, 
the most simple errors are common” 
At least partially because the 
analytical methodology was inappropriate 
and/or not sufficiently described 
- Baggerly, 2009
Trend #1 
These errors pass peer review 
The researcher is (sometimes) unaware of the error 
The process that led to the error is not recorded 
Therefore it cannot be detected during peer-review
Agencies have Noticed! 
In March 2012, the US Institute of Medicine said, in effect, 
“Enough is enough!”
Agencies have Noticed! 
Institute of Medicine Recommendations 
For Conduct of High-Throughput Research: 
1. Rigorously-described, -annotated, and -followed data 
management and manipulation procedures 
2. “Lock down” the computational analysis pipeline once it 
has been selected 
3. Publish the analytical workflow in a formal manner, 
together with the full starting and result datasets 
Evolution of Translational Omics: Lessons Learned and the Path Forward. The 
Institute of Medicine of the National Academies, Report Brief, March 2012.
TREND #2 
BIGGER, CHEAPER DATA
Trend #2 
High-throughput technologies are becoming 
cheaper and easier to use 
But there are still very few experts trained in 
statistical analysis of high-throughput data
Trend #2 
The number of job postings for data scientist 
positions increased by 15,000% between the 
summers of 2011 and 2012 
-- Indeed.com job trends data reported by 
http://blogs.nature.com/naturejobs/2013/03/18/so-you-want-to-be-a-data-scientist
Trend #2 
Therefore 
Even small, moderately-funded laboratories 
can now afford to produce more data 
than they can manage or interpret 
These labs will likely never be able to afford 
a qualified data scientist
TREND #3 
“THE SINGULARITY”
Trend #3 
The Healthcare Singularity and the Age of Semantic Medicine, Michael Gillam, et al., The Fourth Paradigm: Data-Intensive Scientific Discovery, Tony Hey (Editor), 2009 
Slide adapted with permission from Joanne Luciano, Presentation at Health Web Science Workshop 2012, Evanston IL, USA, June 22, 2012.
“The Singularity” 
The X-intercept is where, the moment a discovery is made, 
it is immediately put into practice 
You Are Here 
Scientific research would have to be 
conducted within a medium that 
immediately interpreted 
and disseminated the results... 
...in a form that immediately (actively!) affected the 
results of other researchers... 
...without requiring them to be aware 
of these new discoveries.
3 intersecting 
and problematic trends 
Non-reproducible science that passes peer-review 
Cheaper production of larger and more complex datasets 
that require specialized expertise to analyze properly 
Need to more rapidly disseminate and use new discoveries
We Want More!
I don’t just want to reproduce 
your experiment...
I want to re-use your experiment
In my own laboratory... On MY DATA!
When I do my analysis 
I want to draw on the knowledge 
of global domain-experts like 
statisticians and pathologists... 
...as if they were mentors sitting 
in the chair beside me.
Please don’t make me find 
all of the data and knowledge 
that I require to do my experiment 
...it simply isn’t possible anymore... 
Image from: Mark Smiciklas 
Intersection Consulting, cc-nca
Image from AJ Cann 
cc-by-a license 
I want to support peer review(ers) 
so that I do better science.
How do we get there from here?
To overcome these intersecting problems 
and to achieve the goals of transparent 
reproducible research
We must learn how to 
do research IN the Web 
Not OVER the Web
How we use 
The Web today
The Web is not a pigeon!
Semantic Web Technologies
The Web
The Semantic Web 
causally related to
This is the critical bit! 
The link is explicitly labeled! 
causally related to 
???
http://semanticscience.org/resource/SIO_000243 
SIO_000243: 
<owl:ObjectProperty rdf:about="&resource;SIO_000243"> 
<rdfs:label xml:lang="en">is causally related with</rdfs:label> 
<rdf:type rdf:resource="&owl;SymmetricProperty"/> 
<rdf:type rdf:resource="&owl;TransitiveProperty"/> 
<dc:description xml:lang="en"> A transitive, symmetric, temporal relation 
in which one entity is causally related with another non-identical entity. 
</dc:description> 
<rdfs:subPropertyOf rdf:resource="&resource;SIO_000322"/> 
</owl:ObjectProperty> 
causally related with
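Because SIO_000243 is declared both symmetric and transitive, a reasoner can derive links that nobody asserted. The following is a minimal sketch of that inference in plain Python, not an OWL reasoner; the entity names are hypothetical, and self-links are skipped because the description speaks of non-identical entities.

```python
# Sketch: what may be inferred from a symmetric, transitive property.
# Entity names are hypothetical; this is not an OWL reasoner.

def symmetric_transitive_closure(pairs):
    """Expand asserted (a, b) links using symmetry and transitivity."""
    inferred = set(pairs) | {(b, a) for a, b in pairs}  # symmetry
    changed = True
    while changed:
        changed = False
        for a, b in list(inferred):
            for c, d in list(inferred):
                # transitivity: a-b and b-d give a-d (self-links are skipped)
                if b == c and a != d and (a, d) not in inferred:
                    inferred.add((a, d))
                    changed = True
    return inferred

asserted = {("geneA", "proteinB"), ("proteinB", "phenotypeC")}
links = symmetric_transitive_closure(asserted)
for link in sorted(links):
    print(link)
```

Only two links were asserted, yet geneA and phenotypeC end up connected in both directions, which is the kind of conclusion an explicitly labeled link makes possible.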
Semantic Web Technologies 
“deep semantics”
Deep Semantics?
Ontology Spectrum (from least to most expressive) 
Catalog/ID 
Terms/glossary 
Thesauri (“narrower term” relation) 
Informal is-a 
Formal is-a 
Formal instance 
Frames (Properties) 
Value Restrictions 
Selected Logical Constraints (disjointness, inverse, …) 
General Logical Constraints 
Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness. 
Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
Ontology Spectrum (annotated) 
Most biomedical ontologies, e.g. Gene Ontology
Ontology Spectrum (annotated) 
Ontologies being used in today’s talk 
Most biomedical ontologies, e.g. Gene Ontology
Ontology Spectrum (annotated) 
Discovery & Interpretation systems: flexible! 
Categorization Systems: like library shelves, inflexible
Remember, this is the critical bit! 
causally related with 
http://semanticscience.org/resource/SIO_000243 
It’s relationships that make 
the Semantic Web “Semantic”
Semantic Web Technologies 
“deep semantics”
Even with “deep semantics” 
a lot of important information cannot be represented 
on the Semantic Web 
For example, all of the data that results from 
analytical algorithms and statistical analyses
Varying estimates 
put the size of the 
Deep Web between 
500 and 800 times 
larger than the 
surface Web
On the WWW 
“automation” of 
access to Deep Web 
data happens through 
“Web Services”
There are many suggestions for how to bring the Deep Web 
into the Semantic Web using Semantic Web Services (SWS) 
Describe input data 
Describe output data 
Describe how the system manipulates the data 
Describe how the world changes as a result 
None, so far, has proven to be wildly successful 
(in my opinion) 
…because describing what a Service does is HARD!
Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.
Scientific Web Services 
are DIFFERENT! 
Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.
“The service interfaces within bioinformatics are relatively 
simple. An extensible or constrained interoperability 
framework is likely to suffice for current demands: a fully 
generic framework is currently not necessary.” 
Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.
Scientific Web Services are DIFFERENT! 
They’re simpler! 
So perhaps we can solve the Semantic Web Service problem 
as it pertains to this (important!) domain
With respect to the Semantic Web 
What is missing from this list? 
Describe input data 
Describe output data 
Describe how the system manipulates the data 
Describe how the world changes as a result
causally related with 
http://semanticscience.org/resource/SIO_000243 
The Semantic Web gets its semantics from relationships 
In 2008 I published a set of design-patterns 
for scientific Semantic Web Services 
that focuses on the biological relationship that the Service “exposes”
Design Pattern for 
Web Services on the Semantic Web
[Figure: a conventional BLAST Web Service receives the bare string AACTCTTCGTAGTG...] 
[Figure: a SADI BLAST service receives a “sequence” node with has_seq_string AACTCTTCGTAGTG...; its output links that node via “has homology to” to a node of type “gene” (Terminal Flower, species A. thal.) with its own has_seq_string] 
SADI requires you to explicitly declare, 
as part of your analytical output, 
the biological relationship that your 
algorithm “exposed”.
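The pattern can be sketched in a few lines. This is a toy illustration in plain Python, not the SADI API: triples are modelled as tuples, and `toy_blast_service`, the `ex:` identifiers and the hard-coded hit are all hypothetical stand-ins for a real BLAST call.

```python
# Toy illustration of the SADI pattern (not the real SADI API).
# A triple is a (subject, predicate, object) tuple; all "ex:" names are hypothetical.

def toy_blast_service(input_triples):
    """Return the input triples plus the relationship the analysis 'exposed'."""
    output = set(input_triples)
    for subject, predicate, obj in input_triples:
        if predicate == "ex:has_seq_string":
            # A real service would run BLAST on `obj`; here the hit is hard-coded.
            hit = "ex:TerminalFlower"
            output.add((subject, "ex:has_homology_to", hit))
            output.add((hit, "rdf:type", "ex:gene"))
            output.add((hit, "ex:species", "A. thal."))
    return output

query = {("ex:seq1", "ex:has_seq_string", "AACTCTTCGTAGTG...")}
result = toy_blast_service(query)
for triple in sorted(result):
    print(triple)
```

The point of the sketch is that the output still contains the input node, now connected to the result by a named relationship, rather than being a disconnected report.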
I want to share several stories that demonstrate 
the cool things that happen when you use 
SADI + deep semantics
Story #1: SHARE 
The Semantic Health 
and Research Environment
A proof-of-concept workflow orchestrator 
+ SADI Semantic Web Service registry 
Objective: answer biologists’ questions
The SHARE registry 
indexes all of the input/output/relationship 
triples that can be generated by all known services 
This is how SHARE discovers services
SHARE demonstrations 
with increasing 
semantic complexity
What is the phenotype of every allele of the 
Antirrhinum majus DEFICIENS gene?
SELECT ?allele ?image ?desc 
WHERE { 
locus:DEF genetics:hasVariant ?allele . 
?allele info:visualizedByImage ?image . 
?image info:hasDescription ?desc 
} 
The query language here is SPARQL 
The W3C-approved, standard query language for the Semantic Web
Note that there is no “FROM” clause! 
We don’t tell it where it should get the information, 
The machine has to figure that out by itself...
Starting data: the locus “DEF” (Deficiens)
Query: a series of relationships vis-à-vis DEF
Enter that query into 
SHARE
Click “Submit”...
...and in a few seconds you get your answer. 
Based on the relationships in your query, SHARE queried its registry 
to automatically discover SADI Services capable of generating those triples
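Discovery by relationship can be sketched as a lookup keyed on predicates. This is a simplified illustration, not SHARE's implementation: the registry contents and the example.org service URLs are hypothetical, and the triple patterns are written out by hand instead of being parsed from SPARQL.

```python
# Hypothetical registry: predicate -> services able to generate triples with it.
REGISTRY = {
    "genetics:hasVariant": ["http://example.org/services/getAlleles"],
    "info:visualizedByImage": ["http://example.org/services/getImages"],
    "info:hasDescription": ["http://example.org/services/getDescriptions"],
}

# The three triple patterns of the DEFICIENS query, written as tuples.
QUERY_PATTERNS = [
    ("locus:DEF", "genetics:hasVariant", "?allele"),
    ("?allele", "info:visualizedByImage", "?image"),
    ("?image", "info:hasDescription", "?desc"),
]

def plan(patterns, registry):
    """Pair each triple pattern with the services registered for its predicate."""
    steps = []
    for _subject, predicate, _obj in patterns:
        services = registry.get(predicate, [])
        if not services:
            raise LookupError(f"no registered service generates {predicate}")
        steps.append((predicate, services))
    return steps

steps = plan(QUERY_PATTERNS, REGISTRY)
for predicate, services in steps:
    print(predicate, "->", services)
```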
Because it is the Semantic Web 
The query results are live hyperlinks 
to the respective Database or images 
(The answer is IN the Web!)
What pathways does UniProt protein P47989 belong to? 
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> 
PREFIX ont: <http://ontology.dumontierlab.com/> 
PREFIX uniprot: <http://lsrn.org/UniProt:> 
SELECT ?gene ?pathway 
WHERE { 
uniprot:P47989 pred:isEncodedBy ?gene . 
?gene ont:isParticipantIn ?pathway . 
} 
Note again that there is no “From” clause… 
I have not told SHARE where to look for the 
answer, I am simply asking my question
Enter that query into 
SHARE
Two different providers of gene information (KEGG & NCBI) were found & accessed 
Two different providers of pathway information (KEGG and GO) were found & accessed
The results are all links to the original data 
(The answer is IN the Web!)
Show me the latest Blood Urea Nitrogen (BUN) and 
Creatinine levels of patients who appear to be 
rejecting their transplants 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#> 
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#> 
SELECT ?patient ?bun ?creat 
FROM <http://sadiframework.org/ontologies/patients.rdf> 
WHERE { 
?patient rdf:type patient:LikelyRejecter . 
?patient l:latestBUN ?bun . 
?patient l:latestCreatinine ?creat . 
}
Likely Rejecter: 
A patient who has creatinine levels 
that are increasing over time 
- - Mark D Wilkinson’s definition
Likely Rejecter: 
…but there is no “likely rejecter” 
column or table in our database… 
only blood chemistry measurements 
at various time-points
Likely Rejecter: 
So the data required to answer this question 
DOESN’T EXIST!
My definition of a Likely Rejecter is encoded in 
a machine-readable document written in the OWL Ontology language 
Basically: 
“the regression line over creatinine measurements should have an increasing slope”
Our ontology refers to other ontologies (possibly published by other people) 
to learn about what the properties of “regression models” are 
e.g. that regression models have slopes and intercepts 
and that slopes and intercepts have decimal values
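The rule behind the class definition can be sketched numerically. This is a minimal illustration of "the regression line has an increasing slope" using ordinary least squares in plain Python; it is not the OWL definition itself, and the measurement values (day, creatinine) are made up.

```python
def regression_slope(points):
    """Ordinary least-squares slope for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

def is_likely_rejecter(creatinine_series):
    """A patient whose creatinine regression line has an increasing slope."""
    return regression_slope(creatinine_series) > 0

# Made-up (day, creatinine) measurements for two hypothetical patients.
rising = [(0, 1.0), (7, 1.3), (14, 1.9)]
falling = [(0, 1.4), (7, 1.2), (14, 1.1)]

print(is_likely_rejecter(rising))
print(is_likely_rejecter(falling))
```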
Enter that query into 
SHARE
SHARE examines the query 
Burrows around the Web reading 
the various ontologies 
then uses the discovered Class definitions as a template 
to map a path from what it has, to what it needs, using 
SADI services
Based on the Class definition 
SHARE decides that it needs to do a 
Linear Regression analysis 
on the blood creatinine measurements
The conversation between SHARE and the registry 
reveals the use of “Deep Semantics” 
Q: Is there a SADI service that will consume instances of Patient and give 
me instances of LikelyRejecter? 
A: No 
Q: Okay... So LikelyRejecters need a regression model of increasing slope 
over their BloodCreatinine, so... Is there a SADI service that will consume 
BloodCreatinine over time and give me its linear regression model? 
A: No 
Q: Okay... Blood Creatinine over time is a subclass of data of type 
X/Y coordinate, so is there a service that consumes X/Y data and 
returns its regression model? 
A: Yes, here’s the URL.
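The last step of that conversation, generalizing the input class until a matching service is found, can be sketched as a walk up a subclass hierarchy. This is an illustration under stated assumptions, not SHARE's actual algorithm: the class names, the single-parent hierarchy and the example.org URL are hypothetical.

```python
# Hypothetical single-parent class hierarchy: class -> superclass.
SUPERCLASS = {
    "BloodCreatinineOverTime": "XYCoordinateData",
    "XYCoordinateData": None,
}

# Hypothetical registry: (input class, output class) -> service URL.
SERVICES = {
    ("XYCoordinateData", "LinearRegressionModel"):
        "http://example.org/services/linearRegression",
}

def find_service(input_class, output_class):
    """Try the input class first, then each superclass in turn."""
    current = input_class
    while current is not None:
        url = SERVICES.get((current, output_class))
        if url:
            return url
        current = SUPERCLASS.get(current)
    return None

print(find_service("Patient", "LikelyRejecter"))
print(find_service("BloodCreatinineOverTime", "LinearRegressionModel"))
```

The first lookup fails and the second succeeds only because the data is also recognized as X/Y coordinate data.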
The SHARE system utilizes SADI to discover 
analytical services on the Web that do linear regression analysis 
and sends the data to be analyzed
This happens iteratively 
(e.g. SHARE also has to examine the slope of the regression line 
using another service, find the “latest” in a series of time measurements, etc.) 
There is reasoning after every Service invocation 
(i.e. after every clause in the query) 
Once it is able to find instances (OWL Individuals) 
of the LikelyRejecter class, it continues with the 
rest of the query
VOILA!
The way SHARE “interprets” data varies 
depending on the context of the query 
(i.e. which ontologies it reads – Mine? Yours?) 
and on what part of the query 
it is trying to answer at any given moment 
(which ontological concept is relevant to that clause)
Example? 
Blood Creatinine measurements 
were not dictated to be 
Blood Creatinine measurements
Example? 
The data had the ‘qualities/properties’ that 
allowed one machine to interpret 
that they were Blood Creatinine measurements 
(e.g. to determine which patients were rejecting)
Example? 
But the data also had the ‘qualities/properties’ that 
allowed another machine to interpret them as 
Simple X/Y coordinate data 
(e.g. the Linear Regression calculation tool)
Benefit 
of Deep Semantics 
Data is amenable to 
constant re-interpretation
http://www.flickr.com/people/faernworks/
Story #2: Measurement Units 
One example of the “little ways” 
that Semantics will help researchers 
day-by-day
Units must be harmonized 
Don’t leave this up to the researcher 
(it’s fiddly, time-consuming, and error-prone)
NASA Mars Climate Orbiter
Oops!
The Reality of Clinical Datasets 
(this is a small snapshot of a dataset we worked on, 
courtesy of Dr. Bruce McManus & Janet McManus, from the PROOF COE) 
ID   HEIGHT  WEIGHT  SBP   CHOL  HDL  BMI GR  SBP GR  CHOL GR  HDL GR 
pt1  1.82    177     128   227   55   0       0       1        0 
pt2  179     196     13.4  5.9   1.7  1       0       1        0 
Height in m and cm; Chol in mmol/l and mg/l 
...and other delicious weirdness 
The clinical analyses described here 
were supported in part by the 
PROOF Center of Excellence 
for the Prevention of Organ Failure
GOAL: reduce the likelihood of errors by 
getting the clinical researcher 
“out of the loop” 
(as per the Institute of Medicine Recommendations)
Experiment: 
Reproduce a clinical study 
(from >10 years ago) 
by logically encoding 
the clinical diagnosis guidelines 
of the American Heart Association 
then ask SHARE to automatically 
analyse the patient clinical data
Semantically defining globally-accepted clinical phenotypes; 
Building on the expertise of others 
SystolicBloodPressure = 
GALEN:SystolicBloodPressure and 
("sio:has measurement value" some ("sio:measurement" and 
("sio:has unit" some "om:unit of measure") and 
("om:dimension" value "om:pressure or stress dimension") and 
"sio:has value" some rdfs:Literal)) 
GALEN is a popular biomedical ontology 
but it is largely, like GO, a series of 
named but undefined Classes
Semantically defining globally-accepted clinical phenotypes; 
Building on the expertise of others 
SystolicBloodPressure = 
GALEN:SystolicBloodPressure and 
("sio:has measurement value" some ("sio:measurement" and 
("sio:has unit" some "om:unit of measure") and 
("om:dimension" value "om:pressure or stress dimension") and 
"sio:has value" some rdfs:Literal)) 
So we use OWL to extend the GALEN 
Classes with rich, logical descriptors 
that take advantage of rich semantic 
relationships like “has measurement value” 
and “dimension” and “has unit”
Semantically defining globally-accepted clinical phenotypes; 
Building on the expertise of others 
SystolicBloodPressure = 
GALEN:SystolicBloodPressure and 
("sio:has measurement value" some ("sio:measurement" and 
("sio:has unit" some "om:unit of measure") and 
("om:dimension" value "om:pressure or stress dimension") and 
"sio:has value" some rdfs:Literal)) 
Very general definition 
“some kind of pressure unit” 
(so that others can build on this as they wish!)
Semantically defining globally-accepted clinical phenotypes; 
Building on the expertise of others 
HighRiskSystolicBloodPressure (as defined by Framingham) = 
SystolicBloodPressure and 
sio:hasMeasurement some 
(sio:Measurement and 
("sio:has unit" value om:kilopascal) and 
(sio:hasValue some double[>= "18.7"^^double])) 
Now we are specific to our clinical study (Framingham definitions): 
MUST be in kilopascal and must be >= 18.7
Running the Clinical Analysis 
“Select the patients who are at-risk” 
SELECT ?record ?convertedvalue ?convertedunit 
FROM <./patient.rdf> 
WHERE { 
?record rdf:type measure:HighRiskSystolicBloodPressure . 
?record sio:hasMeasurement ?measurement. 
?measurement sio:hasValue ?Pressure. 
} 
All measurements have now been automatically 
harmonized to KiloPascal, because we encoded the 
semantics in the model 
RecordID  Start Val  Start Unit  Pressure  End Unit 
Pt1       15         cmHg        19.998    KiloPascal 
Pt2       14.6       cmHg        19.465    KiloPascal 
Pt1       148        mmHg        19.731    KiloPascal 
Pt2       146        mmHg        19.465    KiloPascal
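The harmonization step can be sketched as a table-driven conversion followed by the threshold test. This is a plain-Python illustration, not the ontology-driven mechanism used in the talk: the function names are hypothetical, and the factors are the conventional mmHg-to-kilopascal conversion, which may not be the exact factor used to produce the table above.

```python
# Kilopascals per one unit of each source unit (conventional mmHg definition).
KPA_PER_UNIT = {
    "kPa": 1.0,
    "mmHg": 0.133322387415,
    "cmHg": 1.33322387415,
}

# Threshold taken from the HighRiskSystolicBloodPressure definition.
HIGH_RISK_SBP_KPA = 18.7

def to_kilopascal(value, unit):
    """Convert a pressure measurement to kilopascal."""
    return value * KPA_PER_UNIT[unit]

def is_high_risk_sbp(value, unit):
    """Apply the threshold only after the unit has been harmonized."""
    return to_kilopascal(value, unit) >= HIGH_RISK_SBP_KPA

records = [("Pt1", 15, "cmHg"), ("Pt2", 14.6, "cmHg"),
           ("Pt1", 148, "mmHg"), ("Pt2", 146, "mmHg")]
for record_id, value, unit in records:
    kpa = to_kilopascal(value, unit)
    print(record_id, value, unit, round(kpa, 3), is_high_risk_sbp(value, unit))
```

Depending on the conversion factor, the last decimal place may differ slightly from the values shown on the slide.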
While doing this experiment, we noticed 
some interesting anomalies…
Visual inspection of our output data and the AHA guidelines 
showed that in many cases the clinician 
“tweaked” the guidelines when doing their analysis 
------------------ 
AHA BMI risk threshold: BMI=25 
In our dataset the clinical researcher used BMI=26 
------------------ 
AHA HDL guideline HDL<=1.03mmol/l 
The dataset from our researcher: HDL<=0.89mmol/l 
-------------------
Visual inspection of our output data and the AHA guidelines 
showed that in many cases the clinician 
“tweaked” the guidelines when doing their analysis 
These Alterations Were Not Recorded 
in Their Study Notes!
Adjusting our Semantic definitions and re-running the analysis 
resulted in nearly 100% correspondence with the clinical researcher 
HighRiskCholesterolRecord (AHA guideline) = 
PatientRecord and 
(sio:hasAttribute some 
(cardio:SerumCholesterolConcentration and 
sio:hasMeasurement some ( sio:Measurement and 
(sio:hasUnit value cardio:mili-mole-per-liter) and 
(sio:hasValue some double[>= 5.0])))) 
HighRiskCholesterolRecord (adjusted to match the clinical researcher) = 
PatientRecord and 
(sio:hasAttribute some 
(cardio:SerumCholesterolConcentration and 
sio:hasMeasurement some ( sio:Measurement and 
(sio:hasUnit value cardio:mili-mole-per-liter) and 
(sio:hasValue some double[>= 5.2]))))
Reflect on this for a second... Because this is important! 
1. We semantically encoded clinical guidelines 
2. We found that clinical researchers did not follow the official guidelines 
3. Their “personalization” of the guidelines was unreported 
4. Nevertheless, we were able to create “personalized” Semantic Models 
5. These models reflect the opinion of an individual domain-expert 
6. These models are shared on the Web 
7. Can be automatically re-used by others to interpret their own data using 
that clinical expert’s viewpoint
PREFIX AHA: <http://americanheart.org/measurements/> 
PREFIX McManus: <http://stpaulshospital.org/researchers/mcmanus/> 
AHA:HighRiskCholesterolRecord = 
PatientRecord and 
(sio:hasAttribute some 
(cardio:SerumCholesterolConcentration and 
sio:hasMeasurement some ( sio:Measurement and 
(sio:hasUnit value cardio:mili-mole-per-liter) and 
(sio:hasValue some double[>= 5.0])))) 
McManus:HighRiskCholesterolRecord = 
PatientRecord and 
(sio:hasAttribute some 
(cardio:SerumCholesterolConcentration and 
sio:hasMeasurement some ( sio:Measurement and 
(sio:hasUnit value cardio:mili-mole-per-liter) and 
(sio:hasValue some double[>= 5.2]))))
To do the analysis using AHA guidelines 
SELECT ?patient ?risk 
WHERE { 
?patient rdf:type AHA:HighRiskCholesterolRecord . 
?patient ex:hasCholesterolProfile ?risk 
}
To do the analysis using McManus’ expert-opinion 
SELECT ?patient ?risk 
WHERE { 
?patient rdf:type McManus:HighRiskCholesterolRecord . 
?patient ex:hasCholesterolProfile ?risk 
}
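The effect of swapping one expert's model for another can be sketched with the two thresholds from the definitions above. This is a plain-Python illustration, not the OWL classification that SHARE performs: the function name is hypothetical and the patient values are made up.

```python
# Thresholds (mmol/l) taken from the two HighRiskCholesterolRecord definitions.
THRESHOLDS_MMOL_PER_L = {
    "AHA": 5.0,
    "McManus": 5.2,
}

def high_risk_cholesterol(patients, model):
    """Return the IDs of patients at or above the chosen model's threshold."""
    threshold = THRESHOLDS_MMOL_PER_L[model]
    return [pid for pid, chol in patients.items() if chol >= threshold]

# Made-up serum cholesterol values in mmol/l.
patients = {"pt1": 4.8, "pt2": 5.1, "pt3": 5.9}

print(high_risk_cholesterol(patients, "AHA"))
print(high_risk_cholesterol(patients, "McManus"))
```

The same data yields a different at-risk list under each model, and the only thing that changes between the two runs is the name of the model, just as only the class prefix changes between the two queries.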
Flexibility Transparency 
Reproducibility Shareability Comparability 
Simplicity Automation
Personalization 
(I’m going to return to this point several times)
Story #3: in silico Science 
Reproduce a peer-reviewed 
scientific publication 
by semantically modelling 
the problem
The Publication 
Discovering Protein Partners of a 
Human Tumor Suppressor Protein
Original Study Simplified 
Using what is known about protein interactions 
in fly & yeast 
predict new interactions with this 
Human Tumor Suppressor
Semantic Model of the Experiment 
OWL
Semantic Model of the Experiment 
Note that every word in this 
diagram is, in reality, a URL 
(it’s a Semantic Web model) 
i.e. It refers to the expertise of 
other researchers, distributed 
around the world on the Web
Set-up the Experimental Conditions 
In a local data-file 
provide the protein we are interested in 
and the two species we wish to use in our comparison 
taxon:9606 a i:OrganismOfInterest . # human 
uniprot:Q9UK53 a i:ProteinOfInterest . # ING1 
taxon:4932 a i:ModelOrganism1 . # yeast 
taxon:7227 a i:ModelOrganism2 . # fly
Run the Experiment 
SELECT ?protein 
FROM <file:/local/workflow.input.n3> 
WHERE { 
?protein a i:ProbableInteractor . 
} 
This is the URL that leads our computer 
to the Semantic model of the problem
SHARE examines the semantic model of 
Probable Interactors 
Retrieves third-party expertise from the Web 
Discusses with SADI 
what analytical tools are necessary 
Chooses the right tools for the problem 
Solves the problem!
SHARE derives (and executes) the following analysis automatically
SHARE is aware of the context of the specific question being asked
There are five very cool things about what you just saw... 
1. SHARE was able to create a 
workflow based on a 
semantic model
2. SHARE was able to create a 
COMPUTATIONAL workflow 
based on a BIOLOGICAL model 
(this is important because we want 
this system to be used by clinicians and biologists 
who don’t speak computerese!)
3. The workflow it created, and services 
chosen, differed depending on the 
context of the question 
The machine was contextually “aware of” 
BOTH the biological model 
AND the data it was analysing 
taxon:4932 a i:ModelOrganism1 . # yeast 
taxon:7227 a i:ModelOrganism2 . # fly 
(...remember this... It will be important later!)
4. The ontological model was abstract (and 
shareable!), but the workflow generated 
from that model was explicit and concrete 
This matters because…
Remember 
Trend #1 
“the most common errors are simple, 
the most simple errors are common” 
At least partially because the 
analytical methodology was inappropriate 
and/or not sufficiently described 
Here, the methodology leading to a result is explicit 
and automatically constructed from an abstract template 
so this is (at least in part) a 
Solved Problem
There are five very cool things about what you just saw... 
5. The choice of tool-selection was 
guided by the knowledge of 
worldwide domain-experts encoded in 
globally-distributed ontologies 
(e.g. Expert high-throughput statisticians, etc...) 
And this matters because…
Remember 
Trend #2 
Even small, moderately-funded laboratories 
can now afford to produce more data 
than they can manage or interpret 
These labs will likely never be able to afford 
a qualified data scientist 
But if the expert knowledge of data scientists is 
encoded in ontologies, and can be discovered 
in a contextually-aware manner… then this is a 
SOLVED PROBLEM
Story #4: Personalized Health Info 
Can we make the Health information 
on the Web 
more “personal”?
Remember when I said... 
The machine was contextually “aware of” 
BOTH the biological model 
AND the data it was analysing
This “dual-awareness” provides some 
very interesting opportunities 
for personalizing a patient’s Health Research activity
PROBLEM: 
Patients are self-educating 
about their personal medical situation 
(e.g. getting themselves sequenced), 
surfing the Web and getting dubious advice 
from sites of dubious authority, 
and joining social-health groups 
to exchange (often anecdotal) 
medical “advice” with other patients
PROBLEM: 
Patients are self-educating 
The information on any given site 
may or may not 
be relevant to THAT patient 
Information on the Web is, by nature, not personalized
PROBLEM: 
Clinicians often have patients 
(especially chronically-ill patients) 
on a “trajectory” of treatment 
Medicine is complicated! 
e.g. the treatment trajectory of the patient can be 
multi-step, and a specific sign/symptom might be 
perfectly normal at a particular phase in their 
“flow” of treatment
PROBLEM SUMMARY 
Patients are reading non-personalized medical text 
of dubious quality and relevance 
Clinicians have no way to intervene 
in this self-education process 
to explain to patients how the information they read 
relates to their personal “health trajectory”
Now you might see why this is so relevant! 
The machine was contextually “aware of” 
BOTH the biological model 
AND the data it was analysing
This is an early prototype of a 
Patient-driven Personalized Medicine 
Web interface
Basically, it is a set of SHARE queries 
Attached to a local database 
of patient information 
Running behind a Web bookmarklet
The queries text-mine a Web page 
then compare the concepts in the page 
to the patient’s personal data 
using a SHARE query 
(that could contain ontologies... 
...ontologies designed by their clinician!!)
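The text-mine-then-compare step can be sketched roughly as follows. This is plain Python rather than a SHARE query, and the lexicon, the concept identifiers, and the function names are all hypothetical; the real prototype's matching is not shown in these slides.

```python
import re

# Hypothetical concept lexicon: surface form (lower case) -> concept identifier.
LEXICON = {
    "warfarin": "drug:warfarin",
    "coumadin": "drug:warfarin",  # a brand name mapped to the same concept
    "ibuprofen": "drug:ibuprofen",
}


def concepts_in_text(text, lexicon):
    """Return the set of known concepts mentioned in a page's text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {lexicon[word] for word in words if word in lexicon}


def relevant_concepts(page_text, patient_concepts, lexicon=LEXICON):
    """Keep only the page concepts that also occur in the patient's record."""
    return concepts_in_text(page_text, lexicon) & set(patient_concepts)


page = "This page mentions Coumadin and ibuprofen."
patient_record = {"drug:warfarin"}  # hypothetical local patient data
print(relevant_concepts(page, patient_record))
```

Only the concept the patient actually has on record is flagged, which is the sense in which the page becomes “personal”.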
Matching based on official 
name, compound name, 
brand name, trade name, 
or “common name” 
Still needs some work... 
??!?!?
Link out to PubMed 
Why the alert?
The SADI+SHARE workflow and reasoning were 
personalized to YOUR medical data
In future iterations, we will enable the workflow 
to be further customized through “personalized” 
OWL Classes (e.g. Provided by your Clinician!!)
These OWL Classes might include information about the 
current trajectory of your treatment for a chronic disease, 
for example, such that what you read on the Web is 
placed in the context of your expert Clinical care...
Frankly, I think it’s quite cool that people 
(patients!) 
are creating and running 
“personal health-research” workflows 
at the touch of a button!
Almost the end… 
Three brief final points....
Publication 
Discourse 
Interpretation 
Hypothesis 
Experiment 
? 
?
The Semantic Model represents 
a possible solution to a problem 
By my definition, that is a hypothesis
The Semantic Model represents 
a possible solution to a problem 
That hypothesis is tested by automatically converting it into a workflow; 
the workflow, and the results of the workflow are intimately tied to the hypothesis
The Semantic Model represents 
a possible solution to a problem 
i.e. You (or anyone!) can determine exactly which aspect 
of the hypothesis led to which output data element, why, and how
The Semantic Model represents 
a possible solution to a problem 
“Exquisite Provenance” 
a perfect record not only of what was done, when, and how 
but also WHY
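One way to picture such a record is a small data structure that stores the “why” alongside the what, when, and how. The field names and the `explain` helper below are assumptions made for illustration, not the provenance model used by SADI/SHARE.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    output_id: str  # the output data element being described
    what: str       # operation that was performed
    how: str        # tool or service that performed it
    when: datetime  # time the step ran
    why: str        # aspect of the hypothesis that required this step


def record_step(output_id, what, how, why, when=None):
    """Create a record, stamping the current UTC time unless one is given."""
    return ProvenanceRecord(output_id, what, how,
                            when or datetime.now(timezone.utc), why)


def explain(records, output_id):
    """List the hypothesis aspects that led to a given output element."""
    return [r.why for r in records if r.output_id == output_id]


fixed_time = datetime(2014, 12, 1, tzinfo=timezone.utc)  # hypothetical timestamp
records = [
    record_step("result-1", "normalize", "example_normalizer_v1",
                "hypothesis clause A", fixed_time),
    record_step("result-2", "statistical test", "example_stat_test_v2",
                "hypothesis clause B", fixed_time),
]
print(explain(records, "result-2"))
```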
And this is important because...
“Exquisite Provenance” 
is required 
for the output data and knowledge 
to be published as...
Richly annotated, citable, and queryable snippets of 
scientific knowledge encoded in Linked Data/OWL 
i.e. a way to publish data and knowledge on the Semantic Web
Publication 
Discourse 
Interpretation 
Hypothesis 
Experiment
A “modest” vision for 
pure in silico Science
Last point… perhaps this is not yet obvious…
SADI services consume Linked Data on the Web 
The ontologies provided to SHARE are 
written in OWL, and are therefore 
inherently part of the Web 
SADI services create novel semantic links 
between existing data-points on the Web, or 
between existing data and new data 
The output of the automatically-generated workflow 
is therefore Linked Data 
and is therefore inherently part of the Web 
The concluding NanoPublications are a combination 
of Linked Data and OWL, and are published directly to the Web
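The consume-and-link pattern can be sketched with plain (subject, predicate, object) tuples. This is not the SADI API: the `bmi_service` function, the `http://example.org/` URIs, and the sample values are hypothetical and only show a service reading existing statements and attaching a new one to the same subject.

```python
# Triples are (subject, predicate, object) tuples; every URI is hypothetical.
EX = "http://example.org/"


def bmi_service(input_triples):
    """Return new triples that attach a BMI value to each input subject."""
    height = {s: o for s, p, o in input_triples if p == EX + "height_m"}
    weight = {s: o for s, p, o in input_triples if p == EX + "weight_kg"}
    output = set()
    # Only subjects that have both a height and a weight can be annotated.
    for subject in height.keys() & weight.keys():
        bmi = round(weight[subject] / height[subject] ** 2, 1)
        output.add((subject, EX + "bmi", bmi))
    return output


# Existing data about one subject.
web_of_data = {
    (EX + "patient1", EX + "height_m", 2.0),
    (EX + "patient1", EX + "weight_kg", 80.0),
}

# The service output is itself a set of triples, so it merges straight back in.
new_links = bmi_service(web_of_data)
web_of_data |= new_links
print(sorted(new_links))
```

Because input and output share one data model, the result of one service can be the input of the next without any conversion step.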
The Life Science “Singularity” 
We 
Are 
Here! 
The Semantic Web is a cradle-to-grave 
biomedical research platform 
that can, and will, dramatically improve 
how biomedical research is done
The important people 
Luke McCarthy 
(SADI/SHARE) 
Benjamin Vandervalk 
(SHARE) 
Dr. Soroush Samadian 
(clinical experiments) 
Ian Wood 
(Experiment-replication experiment)
Microsoft Research

Weitere ähnliche Inhalte

Was ist angesagt?

CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata managementPistoia Alliance
 
Repurposing authoritative data about faculty to analyze publication output, i...
Repurposing authoritative data about faculty to analyze publication output, i...Repurposing authoritative data about faculty to analyze publication output, i...
Repurposing authoritative data about faculty to analyze publication output, i...Paul Albert
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicinePaul Groth
 
Measuring Research Impact on the Web
Measuring Research Impact on the WebMeasuring Research Impact on the Web
Measuring Research Impact on the WebCharleston Conference
 
Great Science, Technology, Engineering and Medicine Resources Web Search Univ...
Great Science, Technology, Engineering and Medicine Resources Web Search Univ...Great Science, Technology, Engineering and Medicine Resources Web Search Univ...
Great Science, Technology, Engineering and Medicine Resources Web Search Univ...Matthew Von Hendy
 
How to measure research impact on the web
How to measure research impact on the webHow to measure research impact on the web
How to measure research impact on the webKinga Hosszu
 
Connecting life sciences data at the European Bioinformatics Institute
Connecting life sciences data at the European Bioinformatics InstituteConnecting life sciences data at the European Bioinformatics Institute
Connecting life sciences data at the European Bioinformatics InstituteConnected Data World
 
Controlled vocabularies and VIVO
Controlled vocabularies and VIVOControlled vocabularies and VIVO
Controlled vocabularies and VIVOPaul Albert
 
The Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in BiologyThe Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in Biologyrobertstevens65
 
Annotation examples--Fribourg--2019-09-03
Annotation examples--Fribourg--2019-09-03Annotation examples--Fribourg--2019-09-03
Annotation examples--Fribourg--2019-09-03jodischneider
 
Science Research Guide: MySearchDatabasesBioethicsEcosystems
Science Research Guide: MySearchDatabasesBioethicsEcosystemsScience Research Guide: MySearchDatabasesBioethicsEcosystems
Science Research Guide: MySearchDatabasesBioethicsEcosystemsCathy Oxley
 
How many medline platforms on the web?
How many medline platforms on the web?How many medline platforms on the web?
How many medline platforms on the web?Basset Hervé
 
Chapter 2 Psychological Research
Chapter 2 Psychological ResearchChapter 2 Psychological Research
Chapter 2 Psychological Researchvwagner1
 

Was ist angesagt? (20)

CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata management
 
Repurposing authoritative data about faculty to analyze publication output, i...
Repurposing authoritative data about faculty to analyze publication output, i...Repurposing authoritative data about faculty to analyze publication output, i...
Repurposing authoritative data about faculty to analyze publication output, i...
 
Whitney Symposium Lecture June 2008
Whitney Symposium Lecture June 2008Whitney Symposium Lecture June 2008
Whitney Symposium Lecture June 2008
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
 
Improving online chemistry one structure at a time
Improving online chemistry one structure at a timeImproving online chemistry one structure at a time
Improving online chemistry one structure at a time
 
Open science 2014
Open science 2014Open science 2014
Open science 2014
 
Measuring Research Impact on the Web
Measuring Research Impact on the WebMeasuring Research Impact on the Web
Measuring Research Impact on the Web
 
Building A Community Resource For The Life Sciences
Building A Community Resource For The Life SciencesBuilding A Community Resource For The Life Sciences
Building A Community Resource For The Life Sciences
 
Paul Groth
Paul GrothPaul Groth
Paul Groth
 
Great Science, Technology, Engineering and Medicine Resources Web Search Univ...
Great Science, Technology, Engineering and Medicine Resources Web Search Univ...Great Science, Technology, Engineering and Medicine Resources Web Search Univ...
Great Science, Technology, Engineering and Medicine Resources Web Search Univ...
 
How to measure research impact on the web
How to measure research impact on the webHow to measure research impact on the web
How to measure research impact on the web
 
Connecting life sciences data at the European Bioinformatics Institute
Connecting life sciences data at the European Bioinformatics InstituteConnecting life sciences data at the European Bioinformatics Institute
Connecting life sciences data at the European Bioinformatics Institute
 
Controlled vocabularies and VIVO
Controlled vocabularies and VIVOControlled vocabularies and VIVO
Controlled vocabularies and VIVO
 
Chemistry made mobile – the expanding world of chemistry in the hand
Chemistry made mobile – the expanding world of chemistry in the handChemistry made mobile – the expanding world of chemistry in the hand
Chemistry made mobile – the expanding world of chemistry in the hand
 
The Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in BiologyThe Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in Biology
 
Annotation examples--Fribourg--2019-09-03
Annotation examples--Fribourg--2019-09-03Annotation examples--Fribourg--2019-09-03
Annotation examples--Fribourg--2019-09-03
 
Science Research Guide: MySearchDatabasesBioethicsEcosystems
Science Research Guide: MySearchDatabasesBioethicsEcosystemsScience Research Guide: MySearchDatabasesBioethicsEcosystems
Science Research Guide: MySearchDatabasesBioethicsEcosystems
 
How many medline platforms on the web?
How many medline platforms on the web?How many medline platforms on the web?
How many medline platforms on the web?
 
Predatory Journals
Predatory JournalsPredatory Journals
Predatory Journals
 
Chapter 2 Psychological Research
Chapter 2 Psychological ResearchChapter 2 Psychological Research
Chapter 2 Psychological Research
 

Andere mochten auch

IBC FAIR Data Prototype Implementation slideshow
IBC FAIR Data Prototype Implementation   slideshowIBC FAIR Data Prototype Implementation   slideshow
IBC FAIR Data Prototype Implementation slideshowMark Wilkinson
 
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015Mark Wilkinson
 
Curriculum specification F4
Curriculum specification F4Curriculum specification F4
Curriculum specification F4hajahrokiah
 
Building a community around your blog v3
Building a community around your blog v3Building a community around your blog v3
Building a community around your blog v3Brendan Sera-Shriar
 
Juc paris olivier lamy talk
Juc paris olivier lamy talkJuc paris olivier lamy talk
Juc paris olivier lamy talkOlivier Lamy
 
Research - this time it's personal
Research - this time it's personalResearch - this time it's personal
Research - this time it's personalMark Wilkinson
 
Migrate, Grow, and Cultivate your Community
Migrate, Grow, and Cultivate your CommunityMigrate, Grow, and Cultivate your Community
Migrate, Grow, and Cultivate your CommunityBrendan Sera-Shriar
 
The Semantic Web - This time... its Personal
The Semantic Web - This time... its PersonalThe Semantic Web - This time... its Personal
The Semantic Web - This time... its PersonalMark Wilkinson
 
Design Studio
Design StudioDesign Studio
Design Studiomilarepa1
 
Thesis Presentation 2009
Thesis Presentation 2009Thesis Presentation 2009
Thesis Presentation 2009joangriff
 
Semana de la biblioteca 2011 final
Semana de la biblioteca 2011 finalSemana de la biblioteca 2011 final
Semana de la biblioteca 2011 finalPaola Padilla
 
SmartBrief Portfolio
SmartBrief PortfolioSmartBrief Portfolio
SmartBrief PortfolioSmartBrief
 
SWAT4LS 2011: SADI Knowledge Explorer Plug-in
SWAT4LS 2011: SADI Knowledge Explorer Plug-inSWAT4LS 2011: SADI Knowledge Explorer Plug-in
SWAT4LS 2011: SADI Knowledge Explorer Plug-inMark Wilkinson
 
Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...
Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...
Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...Luc Sluijsmans
 
¡UNA BOTELLA AGUA....Y QUE!
¡UNA BOTELLA AGUA....Y QUE!¡UNA BOTELLA AGUA....Y QUE!
¡UNA BOTELLA AGUA....Y QUE!pipis397
 

Andere mochten auch (20)

IBC FAIR Data Prototype Implementation slideshow
IBC FAIR Data Prototype Implementation   slideshowIBC FAIR Data Prototype Implementation   slideshow
IBC FAIR Data Prototype Implementation slideshow
 
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015
 
Curriculum specification F4
Curriculum specification F4Curriculum specification F4
Curriculum specification F4
 
Building a community around your blog v3
Building a community around your blog v3Building a community around your blog v3
Building a community around your blog v3
 
Juc paris olivier lamy talk
Juc paris olivier lamy talkJuc paris olivier lamy talk
Juc paris olivier lamy talk
 
Bambu Communication Group Credential
Bambu Communication Group CredentialBambu Communication Group Credential
Bambu Communication Group Credential
 
Research - this time it's personal
Research - this time it's personalResearch - this time it's personal
Research - this time it's personal
 
Migrate, Grow, and Cultivate your Community
Migrate, Grow, and Cultivate your CommunityMigrate, Grow, and Cultivate your Community
Migrate, Grow, and Cultivate your Community
 
i楼市
i楼市i楼市
i楼市
 
The Semantic Web - This time... its Personal
The Semantic Web - This time... its PersonalThe Semantic Web - This time... its Personal
The Semantic Web - This time... its Personal
 
Design Studio
Design StudioDesign Studio
Design Studio
 
Gitools
GitoolsGitools
Gitools
 
Thesis Presentation 2009
Thesis Presentation 2009Thesis Presentation 2009
Thesis Presentation 2009
 
hi
hihi
hi
 
Semana de la biblioteca 2011 final
Semana de la biblioteca 2011 finalSemana de la biblioteca 2011 final
Semana de la biblioteca 2011 final
 
SmartBrief Portfolio
SmartBrief PortfolioSmartBrief Portfolio
SmartBrief Portfolio
 
Red5 - PHUG Workshops
Red5 - PHUG WorkshopsRed5 - PHUG Workshops
Red5 - PHUG Workshops
 
SWAT4LS 2011: SADI Knowledge Explorer Plug-in
SWAT4LS 2011: SADI Knowledge Explorer Plug-inSWAT4LS 2011: SADI Knowledge Explorer Plug-in
SWAT4LS 2011: SADI Knowledge Explorer Plug-in
 
Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...
Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...
Eindadvies over-de-vernieuwing-van-de-examenprogrammas-maatschappijwetenschap...
 
¡UNA BOTELLA AGUA....Y QUE!
¡UNA BOTELLA AGUA....Y QUE!¡UNA BOTELLA AGUA....Y QUE!
¡UNA BOTELLA AGUA....Y QUE!
 

Ähnlich wie Presentation to the J. Craig Venter Institute, Dec. 2014

Open Research Practices in the Age of a Papermill Pandemic
Open Research Practices in the Age of a Papermill PandemicOpen Research Practices in the Age of a Papermill Pandemic
Open Research Practices in the Age of a Papermill PandemicDorothy Bishop
 
Force11: Enabling transparency and efficiency in the research landscape
Force11: Enabling transparency and efficiency in the research landscapeForce11: Enabling transparency and efficiency in the research landscape
Force11: Enabling transparency and efficiency in the research landscapemhaendel
 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Amit Sheth
 
ContentMining for France and Europe; Lessons from 2 years in UK
ContentMining for France and Europe; Lessons from 2 years in UKContentMining for France and Europe; Lessons from 2 years in UK
ContentMining for France and Europe; Lessons from 2 years in UKpetermurrayrust
 
Ontologies for baby animals and robots From "baby stuff" to the world of adul...
Ontologies for baby animals and robots From "baby stuff" to the world of adul...Ontologies for baby animals and robots From "baby stuff" to the world of adul...
Ontologies for baby animals and robots From "baby stuff" to the world of adul...Aaron Sloman
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...Carole Goble
 
Web Science, SADI, and the Singularity
Web Science, SADI, and the SingularityWeb Science, SADI, and the Singularity
Web Science, SADI, and the SingularityMark Wilkinson
 
RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015William Gunn
 
Evolution of e-Research
Evolution of e-ResearchEvolution of e-Research
Evolution of e-ResearchDavid De Roure
 
Information literacy
Information literacyInformation literacy
Information literacySean Socha
 
Rapid biomedical search
Rapid biomedical search Rapid biomedical search
Rapid biomedical search petermurrayrust
 
DRUGS New agreement to tackle pharmaceutical pollution p.1
DRUGS New agreement to tackle pharmaceutical pollution p.1DRUGS New agreement to tackle pharmaceutical pollution p.1
DRUGS New agreement to tackle pharmaceutical pollution p.1AlyciaGold776
 
Data Management Open House
Data Management Open HouseData Management Open House
Data Management Open HouseJackie Wirz, PhD
 
Mock Scientific Research Paper
Mock Scientific Research PaperMock Scientific Research Paper
Mock Scientific Research PaperJessica Howard
 
Open Data and the Social Sciences - OpenCon Community Webcast
Open Data and the Social Sciences - OpenCon Community WebcastOpen Data and the Social Sciences - OpenCon Community Webcast
Open Data and the Social Sciences - OpenCon Community WebcastRight to Research
 

Ähnlich wie Presentation to the J. Craig Venter Institute, Dec. 2014 (20)

Open Research Practices in the Age of a Papermill Pandemic
Open Research Practices in the Age of a Papermill PandemicOpen Research Practices in the Age of a Papermill Pandemic
Open Research Practices in the Age of a Papermill Pandemic
 
Cartegena051811
Cartegena051811Cartegena051811
Cartegena051811
 
Force11: Enabling transparency and efficiency in the research landscape
Force11: Enabling transparency and efficiency in the research landscapeForce11: Enabling transparency and efficiency in the research landscape
Force11: Enabling transparency and efficiency in the research landscape
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Suggested Annotative Bibliography Essay
Suggested Annotative Bibliography EssaySuggested Annotative Bibliography Essay
Suggested Annotative Bibliography Essay
 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
 
ContentMining for France and Europe; Lessons from 2 years in UK
ContentMining for France and Europe; Lessons from 2 years in UKContentMining for France and Europe; Lessons from 2 years in UK
ContentMining for France and Europe; Lessons from 2 years in UK
 
Ontologies for baby animals and robots From "baby stuff" to the world of adul...
Ontologies for baby animals and robots From "baby stuff" to the world of adul...Ontologies for baby animals and robots From "baby stuff" to the world of adul...
Ontologies for baby animals and robots From "baby stuff" to the world of adul...
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
 
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systemsA01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
 
Web Science, SADI, and the Singularity
Web Science, SADI, and the SingularityWeb Science, SADI, and the Singularity
Web Science, SADI, and the Singularity
 
RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015
 
Evolution of e-Research
Evolution of e-ResearchEvolution of e-Research
Evolution of e-Research
 
Information literacy
Information literacyInformation literacy
Information literacy
 
Scientific Method
Scientific MethodScientific Method
Scientific Method
 
Rapid biomedical search
Rapid biomedical search Rapid biomedical search
Rapid biomedical search
 
DRUGS New agreement to tackle pharmaceutical pollution p.1
DRUGS New agreement to tackle pharmaceutical pollution p.1DRUGS New agreement to tackle pharmaceutical pollution p.1
DRUGS New agreement to tackle pharmaceutical pollution p.1
 
Data Management Open House
Data Management Open HouseData Management Open House
Data Management Open House
 
Mock Scientific Research Paper
Mock Scientific Research PaperMock Scientific Research Paper
Mock Scientific Research Paper
 
Open Data and the Social Sciences - OpenCon Community Webcast
Open Data and the Social Sciences - OpenCon Community WebcastOpen Data and the Social Sciences - OpenCon Community Webcast
Open Data and the Social Sciences - OpenCon Community Webcast
 

Mehr von Mark Wilkinson

FAIR Metrics - Presentation to NIH KC1
FAIR Metrics - Presentation to NIH KC1FAIR Metrics - Presentation to NIH KC1
FAIR Metrics - Presentation to NIH KC1Mark Wilkinson
 
Introducing the fair evaluator
Introducing the fair evaluatorIntroducing the fair evaluator
Introducing the fair evaluatorMark Wilkinson
 
FAIR Projector Builder
FAIR Projector BuilderFAIR Projector Builder
FAIR Projector BuilderMark Wilkinson
 
Tech. session : Interoperability and Data FAIRness emerges from a novel combi...
Tech. session : Interoperability and Data FAIRness emerges from a novel combi...Tech. session : Interoperability and Data FAIRness emerges from a novel combi...
Tech. session : Interoperability and Data FAIRness emerges from a novel combi...Mark Wilkinson
 
smartAPIs: EUDAT Semantic Working Group Presentation @ RDA 9th Plenary
smartAPIs:  EUDAT Semantic Working Group Presentation @ RDA 9th PlenarysmartAPIs:  EUDAT Semantic Working Group Presentation @ RDA 9th Plenary
smartAPIs: EUDAT Semantic Working Group Presentation @ RDA 9th PlenaryMark Wilkinson
 
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...Mark Wilkinson
 
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015Mark Wilkinson
 
Sample data and other ur ls
Sample data and other ur lsSample data and other ur ls
Sample data and other ur lsMark Wilkinson
 
Example code for the SADI BMI Calculator Web Service
Example code for the SADI BMI Calculator Web ServiceExample code for the SADI BMI Calculator Web Service
Example code for the SADI BMI Calculator Web ServiceMark Wilkinson
 
Tutorial - Creating SADI semantic-web-services
Tutorial - Creating SADI semantic-web-servicesTutorial - Creating SADI semantic-web-services
Tutorial - Creating SADI semantic-web-servicesMark Wilkinson
 
Force11 JDDCP workshop presentation, @ Force2015, Oxford
Force11 JDDCP workshop presentation, @ Force2015, OxfordForce11 JDDCP workshop presentation, @ Force2015, Oxford
Force11 JDDCP workshop presentation, @ Force2015, OxfordMark Wilkinson
 
Enhancing Reproducibility and Transparency in Clinical Research through Seman...
Enhancing Reproducibility and Transparency in Clinical Research through Seman...Enhancing Reproducibility and Transparency in Clinical Research through Seman...
Enhancing Reproducibility and Transparency in Clinical Research through Seman...Mark Wilkinson
 
Web Science 2.0 - in silico science
Web Science 2.0 - in silico scienceWeb Science 2.0 - in silico science
Web Science 2.0 - in silico scienceMark Wilkinson
 
Web Science - ISoLA 2012
Web Science - ISoLA 2012Web Science - ISoLA 2012
Web Science - ISoLA 2012Mark Wilkinson
 
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to cho...
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to cho...Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to cho...
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to cho...Mark Wilkinson
 
SADI in Perl - Protege Plugin Tutorial (fixed Aug 24, 2011)
SADI in Perl - Protege Plugin Tutorial (fixed Aug 24, 2011)SADI in Perl - Protege Plugin Tutorial (fixed Aug 24, 2011)
SADI in Perl - Protege Plugin Tutorial (fixed Aug 24, 2011)Mark Wilkinson
 
Technologies, methods and challenges to data sharing and aggrigation
Technologies, methods and challenges to data sharing and aggrigationTechnologies, methods and challenges to data sharing and aggrigation
Technologies, methods and challenges to data sharing and aggrigationMark Wilkinson
 
ISoLA 2010: SADI Taverna plug-in
ISoLA 2010:  SADI Taverna plug-inISoLA 2010:  SADI Taverna plug-in
ISoLA 2010: SADI Taverna plug-inMark Wilkinson
 

Mehr von Mark Wilkinson (20)

FAIR Metrics - Presentation to NIH KC1
FAIR Metrics - Presentation to NIH KC1FAIR Metrics - Presentation to NIH KC1
FAIR Metrics - Presentation to NIH KC1
 
Introducing the fair evaluator
Introducing the fair evaluatorIntroducing the fair evaluator
Introducing the fair evaluator
 
FAIR Projector Builder
FAIR Projector BuilderFAIR Projector Builder
FAIR Projector Builder
 
Tech. session : Interoperability and Data FAIRness emerges from a novel combi...
Tech. session : Interoperability and Data FAIRness emerges from a novel combi...Tech. session : Interoperability and Data FAIRness emerges from a novel combi...
Tech. session : Interoperability and Data FAIRness emerges from a novel combi...
 
Presentation to the J. Craig Venter Institute, Dec. 2014

  • 1. “Shopping for data should be as easy as shopping for shoes!” Dr. Carole Goble Professor, Dept. of Computer Science University of Manchester
  • 2. “A little bit of semantics goes a long way” Dr. James Hendler Artificial Intelligence Researcher Rensselaer Polytechnic Institute One of the originators of the Semantic Web
  • 3. …but a lot of semantics goes a long, long way! Mark Wilkinson Isaac Peral Distinguished Researcher Director, Fundación BBVA Chair in Biological Informatics Center for Plant Biotechnology and Genomics Technical University of Madrid
  • 4. Making the Web a biomedical research platform from hypothesis through to publication
  • 5. Publication Discourse Interpretation Hypothesis Experiment
  • 6. Publication Discourse Interpretation Hypothesis Experiment
  • 7. Motivation: 3 intersecting trends in the Life Sciences that are now, or soon will be, extremely problematic
  • 8. TREND #1 NON-REPRODUCIBLE SCIENCE & THE FAILURE OF PEER REVIEW
  • 9. Trend #1 Multiple recent surveys of high-throughput biology reveal that upwards of 50% of published studies are not reproducible - Baggerly, 2009 - Ioannidis, 2009
  • 10. Trend #1 Similar (if not worse!) in clinical studies - Begley & Ellis, Nature, 2012 - Booth, Forbes, 2012 - Huang & Gottardo, Briefings in Bioinformatics, 2012
  • 11. Trend #1 “the most common errors are simple, the most simple errors are common” At least partially because the analytical methodology was inappropriate and/or not sufficiently described - Baggerly, 2009
  • 12. Trend #1 These errors pass peer review The researcher is (sometimes) unaware of the error The process that led to the error is not recorded Therefore it cannot be detected during peer-review
  • 13. Agencies have Noticed! In March, 2012, the US Institute of Medicine ~said “Enough is enough!”
  • 14. Agencies have Noticed! Institute of Medicine Recommendations For Conduct of High-Throughput Research: 1. Rigorously-described, -annotated, and -followed data management and manipulation procedures 2. “Lock down” the computational analysis pipeline once it has been selected 3. Publish the analytical workflow in a formal manner, together with the full starting and result datasets. Source: Evolution of Translational Omics: Lessons Learned and the Path Forward. The Institute of Medicine of the National Academies, Report Brief, March 2012.
  • 15. TREND #2 BIGGER, CHEAPER DATA
  • 16. Trend #2 High-throughput technologies are becoming cheaper and easier to use
  • 17. Trend #2 High-throughput technologies are becoming cheaper and easier to use But there are still very few experts trained in statistical analysis of high-throughput data
  • 18. Trend #2 The number of job postings for data scientist positions increased by 15,000% between the summers of 2011 and 2012 -- Indeed.com job trends data reported by http://blogs.nature.com/naturejobs/2013/03/18/so-you-want-to-be-a-data-scientist
  • 19. Trend #2 Therefore Even small, moderately-funded laboratories can now afford to produce more data than they can manage or interpret
  • 20. Trend #2 Therefore Even small, moderately-funded laboratories can now afford to produce more data than they can manage or interpret These labs will likely never be able to afford a qualified data scientist
  • 21. TREND #3 “THE SINGULARITY”
  • 22. The Healthcare Singularity and the Age of Semantic Medicine, Michael Gillam, et al, The Fourth Paradigm: Data-Intensive Scientific Discovery Tony Hey (Editor), 2009 Slide adapted with permission from Joanne Luciano, Presentation at Health Web Science Workshop 2012, Evanston IL, USA June 22, 2012. Trend #3
  • 23. “The Singularity” The X-intercept is where, the moment a discovery is made, it is immediately put into practice The Healthcare Singularity and the Age of Semantic Medicine, Michael Gillam, et al, The Fourth Paradigm: Data-Intensive Scientific Discovery Tony Hey (Editor), 2009 Slide Borrowed with Permission from Joanne Luciano, Presentation at Health Web Science Workshop 2012, Evanston IL, USA June 22, 2012.
  • 24. You Are Here Scientific research would have to be conducted within a medium that immediately interpreted and disseminated the results...
  • 25. ...in a form that immediately (actively!) affected the results of other researchers... You Are Here
  • 26. ...without requiring them to be aware of these new discoveries. You Are Here
  • 27. 3 intersecting and problematic trends Non-reproducible science that passes peer-review Cheaper production of larger and more complex datasets that require specialized expertise to analyze properly Need to more rapidly disseminate and use new discoveries
  • 29. I don’t just want to reproduce your experiment...
  • 30. I want to re-use your experiment
  • 31. In my own laboratory... On MY DATA!
  • 32. When I do my analysis I want to draw on the knowledge of global domain-experts like statisticians and pathologists... ...as if they were mentors sitting in the chair beside me.
  • 33. Please don’t make me find all of the data and knowledge that I require to do my experiment ...it simply isn’t possible anymore... Image from: Mark Smiciklas Intersection Consulting, cc-nca
  • 34. Image from AJ Cann cc-by-a license I want to support peer review(ers) so that I do better science.
  • 35. How do we get there from here?
  • 36. To overcome these intersecting problems and to achieve the goals of transparent reproducible research
  • 37. We must learn how to do research IN the Web Not OVER the Web
  • 38. How we use The Web today
  • 39. The Web is not a pigeon!
  • 42. The Semantic Web causally related to
  • 43. This is the critical bit! The link is explicitly labeled! causally related to ???
  • 44. http://semanticscience.org/resource/SIO_000243 SIO_000243: <owl:ObjectProperty rdf:about="&resource;SIO_000243"> <rdfs:label xml:lang="en">is causally related with</rdfs:label> <rdf:type rdf:resource="&owl;SymmetricProperty"/> <rdf:type rdf:resource="&owl;TransitiveProperty"/> <dc:description xml:lang="en">A transitive, symmetric, temporal relation in which one entity is causally related with another non-identical entity.</dc:description> <rdfs:subPropertyOf rdf:resource="&resource;SIO_000322"/> </owl:ObjectProperty> causally related with
  • 45. http://semanticscience.org/resource/SIO_000243 SIO_000243: <owl:ObjectProperty rdf:about="&resource;SIO_000243"> <rdfs:label xml:lang="en">is causally related with</rdfs:label> <rdf:type rdf:resource="&owl;SymmetricProperty"/> <rdf:type rdf:resource="&owl;TransitiveProperty"/> <dc:description xml:lang="en">A transitive, symmetric, temporal relation in which one entity is causally related with another non-identical entity.</dc:description> <rdfs:subPropertyOf rdf:resource="&resource;SIO_000322"/> </owl:ObjectProperty> causally related with
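The declaration of SIO_000243 as both symmetric and transitive is what lets a machine infer links that nobody wrote down. A minimal Python sketch of that inference follows; it is not an OWL reasoner, and the entity names are hypothetical placeholders, not data from the talk.

```python
from itertools import product


def causal_closure(asserted):
    """Return every pair implied by a symmetric, transitive relation.

    Reflexive pairs are skipped because the SIO_000243 description
    speaks of "another non-identical entity".
    """
    pairs = set(asserted)
    # Symmetry: if A relates to B, then B relates to A.
    pairs |= {(b, a) for a, b in pairs}
    changed = True
    while changed:
        changed = False
        # Transitivity: A->B and B->D imply A->D.
        for (a, b), (c, d) in product(list(pairs), repeat=2):
            if b == c and a != d and (a, d) not in pairs:
                pairs.add((a, d))
                changed = True
    return pairs


# Hypothetical assertions, for illustration only.
asserted = {
    ("geneX_mutation", "protein_misfolding"),
    ("protein_misfolding", "disease_Y"),
}
closure = causal_closure(asserted)
inferred = closure - asserted
```

Here `inferred` contains the link between `geneX_mutation` and `disease_Y` in both directions, although only two one-directional statements were asserted.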
  • 46. Semantic Web Technologies “deep semantics”
  • 48. Ontology Spectrum Catalog/ ID Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (Properties) Informal is-a Formal instance Value Restrs. General Logical constraints Originally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; – updated by McGuinness. Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
  • 49. Ontology Spectrum Catalog/ ID Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (Properties) Informal is-a Formal instance Value Restrs. General Logical constraints Most biomedical ontologies e.g. Gene Ontology Originally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; – updated by McGuinness. Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
  • 50. Ontology Spectrum Catalog/ ID Ontologies being used in today’s talk Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (Properties) Informal is-a Formal instance Value Restrs. General Logical constraints Most biomedical ontologies e.g. Gene Ontology Originally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; – updated by McGuinness. Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
  • 51. Ontology Spectrum Catalog/ ID Discovery & Interpretation systems – flexible! Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (Properties) Informal is-a Formal instance Value Restrs. General Logical constraints Categorization Systems Like library shelves, inflexible Originally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; – updated by McGuinness. Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
  • 52. Remember, this is the critical bit! causally related with http://semanticscience.org/resource/SIO_000243 It’s relationships that make the Semantic Web “Semantic”
  • 53. Semantic Web Technologies “deep semantics”
  • 54. Even with “deep semantics” a lot of important information cannot be represented on the Semantic Web For example, all of the data that results from analytical algorithms and statistical analyses
  • 55.
  • 56.
  • 57. Varying estimates put the size of the Deep Web between 500 and 800 times larger than the surface Web
  • 58. On the WWW “automation” of access to Deep Web data happens through “Web Services”
  • 59. There are many suggestions for how to bring the Deep Web into the Semantic Web using Semantic Web Services (SWS)
  • 60. There are many suggestions for how to bring the Deep Web into the Semantic Web using Semantic Web Services (SWS) Describe input data Describe output data Describe how the system manipulates the data Describe how the world changes as a result
  • 61. There are many suggestions for how to bring the Deep Web into the Semantic Web using Semantic Web Services (SWS) Describe input data Describe output data Describe how the system manipulates the data Describe how the world changes as a result None, so far, has proven to be wildly successful (in my opinion)
  • 62. There are many suggestions for how to bring the Deep Web into the Semantic Web using Semantic Web Services (SWS) Describe input data Describe output data Describe how the system manipulates the data Describe how the world changes as a result None, so far, has proven to be wildly successful (in my opinion) …because describing what a Service does is HARD!
  • 63. Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.
  • 64. Scientific Web Services are DIFFERENT! Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.
  • 65. “The service interfaces within bioinformatics are relatively simple. An extensible or constrained interoperability framework is likely to suffice for current demands: a fully generic framework is currently not necessary.” Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.
  • 66. Scientific Web Services are DIFFERENT! They’re simpler! So perhaps we can solve the Semantic Web Service problem as it pertains to this (important!) domain
  • 67. With respect to the Semantic Web What is missing from this list? Describe input data Describe output data Describe how the system manipulates the data Describe how the world changes as a result
  • 68. causally related with http://semanticscience.org/resource/SIO_000243
  • 69. causally related with http://semanticscience.org/resource/SIO_000243 The Semantic Web gets its semantics from relationships
  • 70. causally related with http://semanticscience.org/resource/SIO_000243 The Semantic Web gets its semantics from relationships In 2008 I published a set of design-patterns for scientific Semantic Web Services that focuses on the biological relationship that the Service “exposes”
  • 71. Design Pattern for Web Services on the Semantic Web
  • 73. SADI requires you to explicitly declare, as part of your analytical output, the biological relationship that your algorithm “exposed”. [Diagram labels: sequence, has_seq_string AACTCTTCGTAGTG..., SADI BLAST, has homology to, Terminal Flower (type: gene; species: A. thal.)]
  • 74. I want to share several stories that demonstrate the cool things that happen when you use SADI + deep semantics
  • 75. Story #1: SHARE The Semantic Health and Research Environment
  • 76. A proof-of-concept workflow orchestrator + SADI Semantic Web Service registry Objective: answer biologists’ questions
  • 77. The SHARE registry indexes all of the input/output/relationship triples that can be generated by all known services This is how SHARE discovers services
  • 78. SHARE demonstrations with increasing semantic complexity
  • 79. What is the phenotype of every allele of the Antirrhinum majus DEFICIENS gene SELECT ?allele ?image ?desc WHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc }
  • 80. What is the phenotype of every allele of the Antirrhinum majus DEFICIENS gene SELECT ?allele ?image ?desc WHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc } The query language here is SPARQL The W3C-approved, standard query language for the Semantic Web
  • 81. What is the phenotype of every allele of the Antirrhinum majus DEFICIENS gene SELECT ?allele ?image ?desc WHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc } Note that there is no “FROM” clause! We don’t tell it where it should get the information, The machine has to figure that out by itself...
  • 82. What is the phenotype of every allele of the Antirrhinum majus DEFICIENS gene SELECT ?allele ?image ?desc WHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc } Starting data: the locus “DEF” (Deficiens)
  • 83. What is the phenotype of every allele of the Antirrhinum majus DEFICIENS gene SELECT ?allele ?image ?desc WHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc } Query: A series of relationships v.v. DEF
  • 84. Enter that query into SHARE
  • 86. ...and in a few seconds you get your answer. Based on the relationships in your query, SHARE queried its registry to automatically discover SADI Services capable of generating those triples
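The discovery step described above can be sketched as a lookup from each predicate in the query to the services registered as able to generate it. This is a toy illustration in Python, not the SHARE implementation: the registry structure, the `plan_services` helper, and the `example.org` service URLs are all assumptions made for the example.

```python
# Hypothetical registry: predicate -> services that can generate that triple.
REGISTRY = {
    "genetics:hasVariant": ["http://example.org/sadi/locus2allele"],
    "info:visualizedByImage": ["http://example.org/sadi/allele2image"],
    "info:hasDescription": ["http://example.org/sadi/image2description"],
}

# The three triple patterns from the DEFICIENS query on the earlier slides.
QUERY_PATTERNS = [
    ("locus:DEF", "genetics:hasVariant", "?allele"),
    ("?allele", "info:visualizedByImage", "?image"),
    ("?image", "info:hasDescription", "?desc"),
]


def plan_services(patterns, registry):
    """Pair each query clause with the services able to produce its predicate."""
    plan = []
    for _subject, predicate, _obj in patterns:
        services = registry.get(predicate, [])
        if not services:
            raise LookupError(f"no registered service generates {predicate}")
        plan.append((predicate, services))
    return plan


plan = plan_services(QUERY_PATTERNS, REGISTRY)
```

Because the lookup is driven by the relationships in the query, no “FROM” clause is needed: the predicates themselves determine where the data comes from.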
  • 87. Because it is the Semantic Web The query results are live hyperlinks to the respective Database or images (The answer is IN the Web!)
  • 88. What pathways does UniProt protein P47989 belong to? PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> PREFIX ont: <http://ontology.dumontierlab.com/> PREFIX uniprot: <http://lsrn.org/UniProt:> SELECT ?gene ?pathway WHERE { uniprot:P47989 pred:isEncodedBy ?gene . ?gene ont:isParticipantIn ?pathway . }
  • 89. What pathways does UniProt protein P47989 belong to? PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> PREFIX ont: <http://ontology.dumontierlab.com/> PREFIX uniprot: <http://lsrn.org/UniProt:> SELECT ?gene ?pathway WHERE { uniprot:P47989 pred:isEncodedBy ?gene . ?gene ont:isParticipantIn ?pathway . }
  • 90. What pathways does UniProt protein P47989 belong to? PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#> PREFIX ont: <http://ontology.dumontierlab.com/> PREFIX uniprot: <http://lsrn.org/UniProt:> SELECT ?gene ?pathway WHERE { uniprot:P47989 pred:isEncodedBy ?gene . ?gene ont:isParticipantIn ?pathway . } Note again that there is no “From” clause… I have not told SHARE where to look for the answer, I am simply asking my question
  • 91. Enter that query into SHARE
  • 92.
  • 93.
  • 94. Two different providers of gene information (KEGG & NCBI) were found & accessed. Two different providers of pathway information (KEGG and GO) were found & accessed.
  • 95. The results are all links to the original data (The answer is IN the Web!)
  • 96. Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#> PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient rdf:type patient:LikelyRejecter . ?patient l:latestBUN ?bun . ?patient l:latestCreatinine ?creat . }
  • 97. Show me the latest Blood Urea Nitrogen (BUN) and Creatinine levels of patients who appear to be rejecting their transplants PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#> PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#> SELECT ?patient ?bun ?creat FROM <http://sadiframework.org/ontologies/patients.rdf> WHERE { ?patient rdf:type patient:LikelyRejecter . ?patient l:latestBUN ?bun . ?patient l:latestCreatinine ?creat . }
  • 98. Likely Rejecter: A patient who has creatinine levels that are increasing over time -- Mark D Wilkinson’s definition
  • 99. Likely Rejecter: …but there is no “likely rejecter” column or table in our database… only blood chemistry measurements at various time-points
  • 100. Likely Rejecter: So the data required to answer this question DOESN’T EXIST!
  • 101. My definition of a Likely Rejecter is encoded in a machine-readable document written in the OWL Ontology language Basically: “the regression line over creatinine measurements should have an increasing slope”
  • 102. Our ontology refers to other ontologies (possibly published by other people) to learn about what the properties of “regression models” are e.g. that regression models have slopes and intercepts and that slopes and intercepts have decimal values
  • 103. ?
  • 104. Enter that query into SHARE
  • 105. SHARE examines the query Burrows around the Web reading the various ontologies then uses the discovered Class definitions as a template to map a path from what it has, to what it needs, using SADI services
  • 106. Based on the Class definition SHARE decides that it needs to do a Linear Regression analysis on the blood creatinine measurements
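The “increasing slope” test in the Likely Rejecter definition comes down to an ordinary least-squares slope over (time, creatinine) points. A minimal Python sketch follows, assuming measurements arrive as (day, value) pairs; the numbers are invented for illustration and are not patient data from the study.

```python
def regression_slope(points):
    """Ordinary least-squares slope of y on x for (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all measurements share the same time-point")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def is_likely_rejecter(creatinine_series):
    """Apply the slide's definition: the regression slope must be increasing."""
    return regression_slope(creatinine_series) > 0


# Hypothetical (day, creatinine) series, for illustration only.
rising = [(0, 1.0), (7, 1.2), (14, 1.5), (21, 1.9)]
stable = [(0, 1.1), (7, 1.0), (14, 1.1), (21, 1.0)]
```

In the talk this calculation is delegated to a regression service discovered on the Web; the function here only stands in for that service.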
  • 107. ?
  • 108. The conversation between SHARE and the registry reveals the use of “Deep Semantics” Q: Is there a SADI service that will consume instances of Patient and give me instances of LikelyRejecter? A: No. Q: Okay... So LikelyRejecters need a regression model of increasing slope over their BloodCreatinine, so... Is there a SADI service that will consume BloodCreatinine over time and give me its linear regression model? A: No. Q: Okay... Blood Creatinine over time is a subclass of data of type X/Y coordinate, so is there a service that consumes X/Y data and returns its regression model? A: Yes, here’s the URL.
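The last two questions in that conversation amount to walking up the class hierarchy until some registered service accepts the input. A rough Python sketch follows; the class names, the `SUPERCLASS` and `SERVICES` tables, and the URL are hypothetical stand-ins for what a reasoner and the registry would supply.

```python
# Hypothetical subclass-of facts that a reasoner would derive from the ontologies.
SUPERCLASS = {
    "BloodCreatinineOverTime": "XYCoordinateData",
    "XYCoordinateData": None,
}

# Hypothetical registry entries: (input class, output class) -> service URL.
SERVICES = {
    ("XYCoordinateData", "LinearRegressionModel"):
        "http://example.org/sadi/linear-regression",
}


def find_service(input_class, output_class):
    """Try the input class, then each superclass, until a service matches."""
    current = input_class
    while current is not None:
        url = SERVICES.get((current, output_class))
        if url is not None:
            return current, url
        current = SUPERCLASS.get(current)
    return None


match = find_service("BloodCreatinineOverTime", "LinearRegressionModel")
no_match = find_service("Patient", "LikelyRejecter")
```

`match` reports that the service was found at the `XYCoordinateData` level, mirroring the final “Yes” in the dialogue, while `no_match` is `None`, mirroring the first “No”.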
  • 109. The SHARE system utilizes SADI to discover analytical services on the Web that do linear regression analysis and sends the data to be analyzed
  • 110. This happens iteratively (e.g. SHARE also has to examine the slope of the regression line using another service, find the “latest” in a series of time measurements, etc.) There is reasoning after every Service invocation (i.e. after every clause in the query) Once it is able to find instances (OWL Individuals) of the LikelyRejecter class, it continues with the rest of the query
  • 111. VOILA!
  • 112. The way SHARE “interprets” data varies depending on the context of the query (i.e. which ontologies it reads – Mine? Yours?) and on what part of the query it is trying to answer at any given moment (which ontological concept is relevant to that clause)
  • 113. Example? Blood Creatinine measurements were not dictated to be Blood Creatinine measurements
  • 114. Example? The data had the ‘qualities/properties’ that allowed one machine to interpret that they were Blood Creatinine measurements (e.g. to determine which patients were rejecting)
  • 115. Example? But the data also had the ‘qualities/properties’ that allowed another machine to interpret them as Simple X/Y coordinate data (e.g. the Linear Regression calculation tool)
  • 116. Benefit of Deep Semantics Data is amenable to constant re-interpretation
  • 118. Story #2: Measurement Units One example of the “little ways” that Semantics will help researchers day-by-day
  • 119. Units must be harmonized Don’t leave this up to the researcher (it’s fiddly, time-consuming, and error-prone)
  • 120. NASA Mars Climate Orbiter
  • 121. Oops!
  • 122. The Reality of Clinical Datasets (this is a small snapshot of a dataset we worked on, courtesy of Dr. Bruce McManus & Janet McManus, from the PROOF COE)
      ID   HEIGHT  WEIGHT  SBP   CHOL  HDL  BMI GR  SBP GR  CHOL GR  HDL GR
      pt1  1.82    177     128   227   55   0       0       1        0
      pt2  179     196     13.4  5.9   1.7  1       0       1        0
      Height in m and cm; Chol in mmol/l and mg/l ...and other delicious weirdness. The clinical analyses described here were supported in part by the PROOF Center of Excellence for the Prevention of Organ Failure
  • 123. GOAL: reduce the likelihood of errors by getting the clinical researcher “out of the loop” (as per the Institute of Medicine Recommendations)
  • 124. Experiment: Reproduce a clinical study (from >10 years ago) by logically encoding the clinical diagnosis guidelines of the American Heart Association, then ask SHARE to automatically analyse the patient clinical data
  • 125. Semantically defining globally-accepted clinical phenotypes; Building on the expertise of others. SystolicBloodPressure = GALEN:SystolicBloodPressure and ("sio:has measurement value" some ("sio:measurement" and ("sio:has unit" some "om:unit of measure") and ("om:dimension" value "om:pressure or stress dimension") and "sio:has value" some rdfs:Literal)). GALEN is a popular biomedical ontology, but it is largely, like GO, a series of named but undefined Classes
  • 126. Semantically defining globally-accepted clinical phenotypes; Building on the expertise of others. SystolicBloodPressure = GALEN:SystolicBloodPressure and ("sio:has measurement value" some ("sio:measurement" and ("sio:has unit" some "om:unit of measure") and ("om:dimension" value "om:pressure or stress dimension") and "sio:has value" some rdfs:Literal)). So we use OWL to extend the GALEN Classes with rich, logical descriptors that take advantage of rich semantic relationships like “has measurement value” and “dimension” and “has unit”
  • 127. Semantically defining globally-accepted clinical phenotypes; Building on the expertise of others. SystolicBloodPressure = GALEN:SystolicBloodPressure and ("sio:has measurement value" some ("sio:measurement" and ("sio:has unit" some "om:unit of measure") and ("om:dimension" value "om:pressure or stress dimension") and "sio:has value" some rdfs:Literal)). Very general definition: “some kind of pressure unit” (so that others can build on this as they wish!)
  • 128. Semantically defining globally-accepted clinical phenotypes; Building on the expertise of others. HighRiskSystolicBloodPressure (as defined by Framingham) = SystolicBloodPressure and sio:hasMeasurement some (sio:Measurement and ("sio:has unit" value om:kilopascal) and (sio:hasValue some double[>= "18.7"^^double])). Now we are specific to our clinical study (Framingham definitions): MUST be in kilopascals and must be >= 18.7
  • 129. Running the Clinical Analysis: “Select the patients who are at-risk”
      SELECT ?record ?convertedvalue ?convertedunit
      FROM <./patient.rdf>
      WHERE {
        ?record rdf:type measure:HighRiskSystolicBloodPressure .
        ?record sio:hasMeasurement ?measurement .
        ?measurement sio:hasValue ?Pressure .
      }
      All measurements have now been automatically harmonized to KiloPascal, because we encoded the semantics in the model
      RecordID  Start Val  Start Unit  Pressure  End Unit
      Pt1       15         cmHg        19.998    KiloPascal
      Pt2       14.6       cmHg        19.465    KiloPascal
      Pt1       148        mmHg        19.731    KiloPascal
      Pt2       146        mmHg        19.465    KiloPascal
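The harmonization in the table above can be checked by hand. The sketch below is not how SHARE does it (there the conversion follows from the unit semantics in the ontology); it is a plain lookup-table version using the standard factor of about 133.322 Pa per mmHg and the 18.7 kPa threshold from the HighRiskSystolicBloodPressure definition. The function and constant names are hypothetical.

```python
# Conversion factors to kilopascal (1 mmHg is about 133.322 Pa; 1 cmHg = 10 mmHg).
KPA_PER_UNIT = {
    "mmHg": 0.133322,
    "cmHg": 1.33322,
    "kPa": 1.0,
}

# Threshold taken from the HighRiskSystolicBloodPressure class definition.
HIGH_RISK_SBP_KPA = 18.7


def to_kilopascal(value, unit):
    """Convert a pressure measurement to kilopascal."""
    return value * KPA_PER_UNIT[unit]


def is_high_risk_sbp(value, unit):
    """Apply the threshold only after harmonizing the unit."""
    return to_kilopascal(value, unit) >= HIGH_RISK_SBP_KPA


# The four measurements shown on the slide.
records = [("Pt1", 15, "cmHg"), ("Pt2", 14.6, "cmHg"),
           ("Pt1", 148, "mmHg"), ("Pt2", 146, "mmHg")]

for record_id, value, unit in records:
    kpa = to_kilopascal(value, unit)
    print(record_id, value, unit, f"{kpa:.3f}", "kPa", is_high_risk_sbp(value, unit))
```

Because the threshold is only ever compared against the harmonized value, a record entered in cmHg and one entered in mmHg are classified identically.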
  • 130. While doing this experiment, we noticed some interesting anomalies…
  • 131. Visual inspection of our output data and the AHA guidelines showed that in many cases the clinician “tweaked” the guidelines when doing their analysis. AHA BMI risk threshold: BMI=25; in our dataset the clinical researcher used BMI=26. AHA HDL guideline: HDL<=1.03 mmol/l; the dataset from our researcher: HDL<=0.89 mmol/l
  • 132. Visual inspection of our output data and the AHA guidelines showed that in many cases the clinician “tweaked” the guidelines when doing their analysis These Alterations Were Not Recorded in Their Study Notes!
  • 133. Adjusting our Semantic definitions and re-running the analysis resulted in nearly 100% correspondence with the clinical researcher
      Before (guideline threshold): HighRiskCholesterolRecord = PatientRecord and (sio:hasAttribute some (cardio:SerumCholesterolConcentration and sio:hasMeasurement some (sio:Measurement and (sio:hasUnit value cardio:mili-mole-per-liter) and (sio:hasValue some double[>= 5.0]))))
      After (adjusted threshold): HighRiskCholesterolRecord = PatientRecord and (sio:hasAttribute some (cardio:SerumCholesterolConcentration and sio:hasMeasurement some (sio:Measurement and (sio:hasUnit value cardio:mili-mole-per-liter) and (sio:hasValue some double[>= 5.2]))))
  • 134. Reflect on this for a second... Because this is important!
      1. We semantically encoded clinical guidelines
      2. We found that clinical researchers did not follow the official guidelines
      3. Their “personalization” of the guidelines was unreported
      4. Nevertheless, we were able to create “personalized” Semantic Models
      5. These models reflect the opinion of an individual domain-expert
      6. These models are shared on the Web
      7. They can be automatically re-used by others to interpret their own data using that clinical expert’s viewpoint
  • 135. PREFIX AHA = http://americanheart.org/measurements/
      PREFIX McManus = http://stpaulshospital.org/researchers/mcmanus/
      AHA:HighRiskCholesterolRecord = PatientRecord and (sio:hasAttribute some (cardio:SerumCholesterolConcentration and sio:hasMeasurement some (sio:Measurement and (sio:hasUnit value cardio:mili-mole-per-liter) and (sio:hasValue some double[>= 5.0]))))
      McManus:HighRiskCholesterolRecord = PatientRecord and (sio:hasAttribute some (cardio:SerumCholesterolConcentration and sio:hasMeasurement some (sio:Measurement and (sio:hasUnit value cardio:mili-mole-per-liter) and (sio:hasValue some double[>= 5.2]))))
  • 136. To do the analysis using AHA guidelines: SELECT ?patient ?risk WHERE { ?patient rdf:type AHA:HighRiskCholesterolRecord . ?patient ex:hasCholesterolProfile ?risk }
  • 137. To do the analysis using McManus’ expert-opinion SELECT ?patient ?risk WHERE { ?patient rdf:type McManus:HighRiskCholesterolRecord . ?patient ex:hasCholesterolProfile ?risk }
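The two queries above differ only in which namespace supplies the HighRiskCholesterolRecord definition. A minimal sketch of the same idea without any Semantic Web machinery: the threshold is looked up by viewpoint instead of being hard-coded in the analysis. The 5.0 and 5.2 mmol/l values come from the two class definitions; the patient values and the function name are hypothetical.

```python
# Thresholds (mmol/l) from the two HighRiskCholesterolRecord definitions.
THRESHOLD_MMOL_PER_L = {
    "AHA": 5.0,       # official guideline
    "McManus": 5.2,   # the clinical expert's adjusted threshold
}


def high_risk_cholesterol(records, viewpoint):
    """Return the IDs of records at or above the chosen viewpoint's threshold."""
    threshold = THRESHOLD_MMOL_PER_L[viewpoint]
    return [patient_id for patient_id, chol in records if chol >= threshold]


# Hypothetical serum cholesterol values, already harmonized to mmol/l.
patients = [("pt1", 4.8), ("pt2", 5.1), ("pt3", 5.9)]

print(high_risk_cholesterol(patients, "AHA"))
print(high_risk_cholesterol(patients, "McManus"))
```

Swapping the viewpoint changes which patients are selected, while the data and the analysis code stay untouched, which is what makes the two interpretations directly comparable.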
  • 138. Flexibility Transparency Reproducibility Shareability Comparability Simplicity Automation
  • 139. Personalization (I’m going to return to this point several times)
  • 140. Story #3: in silico Science Reproduce a peer-reviewed scientific publication by semantically modelling the problem
  • 141. The Publication Discovering Protein Partners of a Human Tumor Suppressor Protein
  • 142. Original Study Simplified Using what is known about protein interactions in fly & yeast predict new interactions with this Human Tumor Suppressor
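One common way to turn "what is known in fly and yeast" into a prediction is to transfer interactions between orthologous proteins. The slides do not spell out the method at this level, so the sketch below is an assumption about the general approach, and every identifier in it is a made-up placeholder rather than a real database record.

```python
# (human protein, model organism) -> its ortholog in that organism.
# All identifiers are made-up placeholders.
ORTHOLOG_IN = {
    ("HUMAN_TARGET", "yeast"): "YEAST_T",
    ("HUMAN_TARGET", "fly"): "FLY_T",
    ("HUMAN_A", "yeast"): "YEAST_A",
    ("HUMAN_B", "fly"): "FLY_B",
}

# organism -> set of known interacting protein pairs (unordered).
INTERACTIONS = {
    "yeast": {frozenset({"YEAST_T", "YEAST_A"})},
    "fly": {frozenset({"FLY_T", "FLY_B"}), frozenset({"FLY_T", "FLY_X"})},
}


def probable_interactors(target, organisms):
    """Predict human interactors of `target` from interactions among orthologs."""
    predicted = set()
    for organism in organisms:
        target_ortholog = ORTHOLOG_IN.get((target, organism))
        if target_ortholog is None:
            continue  # no ortholog of the target in this organism
        for (human, org), ortholog in ORTHOLOG_IN.items():
            if org != organism or human == target:
                continue
            # Transfer the interaction back to human if the orthologs interact.
            if frozenset({target_ortholog, ortholog}) in INTERACTIONS[organism]:
                predicted.add(human)
    return predicted


print(sorted(probable_interactors("HUMAN_TARGET", ["yeast", "fly"])))
print(sorted(probable_interactors("HUMAN_TARGET", ["yeast"])))
```

Changing the list of model organisms changes the prediction, which is a small-scale version of the context dependence discussed later in this story.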
  • 143. Semantic Model of the Experiment OWL
  • 144. Semantic Model of the Experiment Note that every word in this diagram is, in reality, a URL (it’s a Semantic Web model) i.e. It refers to the expertise of other researchers, distributed around the world on the Web
  • 145. Set-up the Experimental Conditions: In a local data-file provide the protein we are interested in and the two species we wish to use in our comparison
      taxon:9606 a i:OrganismOfInterest . # human
      uniprot:Q9UK53 a i:ProteinOfInterest . # ING1
      taxon:4932 a i:ModelOrganism1 . # yeast
      taxon:7227 a i:ModelOrganism2 . # fly
  • 146. Run the Experiment: SELECT ?protein FROM <file:/local/workflow.input.n3> WHERE { ?protein a i:ProbableInteractor . }
  • 147. Run the Experiment: SELECT ?protein FROM <file:/local/workflow.input.n3> WHERE { ?protein a i:ProbableInteractor . } Here, i:ProbableInteractor is the URL that leads our computer to the Semantic model of the problem
  • 148. SHARE examines the semantic model of Probable Interactors; retrieves third-party expertise from the Web; discusses with SADI what analytical tools are necessary; chooses the right tools for the problem; solves the problem!
  • 149. SHARE derives (and executes) the following analysis automatically
  • 150. SHARE is aware of the context of the specific question being asked
  • 152. There are five very cool things about what you just saw...
  • 153. There are five very cool things about what you just saw... 1. SHARE was able to create a workflow based on a semantic model
  • 154. There are five very cool things about what you just saw... 2. SHARE was able to create a COMPUTATIONAL workflow based on a BIOLOGICAL model
  • 155. There are five very cool things about what you just saw... 2. (this is important because we want this system to be used by clinicians and biologists who don’t speak computerese!)
  • 156. There are five very cool things about what you just saw... 3. The workflow it created, and services selected, differed depending on the context of the question
      taxon:4932 a i:ModelOrganism1 . # yeast
      taxon:7227 a i:ModelOrganism2 . # fly
  • 157. There are five very cool things about what you just saw... 3. The workflow it created, and services chosen, differed depending on the context of the question
      taxon:4932 a i:ModelOrganism1 . # yeast
      taxon:7227 a i:ModelOrganism2 . # fly
      The machine was contextually “aware of” BOTH the biological model AND the data it was analysing (...remember this... It will be important later!)
  • 158. There are five very cool things about what you just saw... 4. The ontological model was abstract (and shareable!), but the workflow generated from that model was explicit and concrete
  • 160. There are five very cool things about what you just saw... 4. The ontological model was abstract (and shareable!), but the workflow generated from that model was explicit and concrete. This matters because…
  • 161. Remember Trend #1: “the most common errors are simple, the most simple errors are common”, at least partially because the analytical methodology was inappropriate and/or not sufficiently described
  • 162. Remember Trend #1: “the most common errors are simple, the most simple errors are common”, at least partially because the analytical methodology was inappropriate and/or not sufficiently described. Here, the methodology leading to a result is explicit and automatically constructed from an abstract template, so this is (at least in part) a Solved Problem
  • 163. There are five very cool things about what you just saw... 5. Tool selection was guided by the knowledge of worldwide domain-experts encoded in globally-distributed ontologies (e.g. expert high-throughput statisticians, etc...)
  • 164. There are five very cool things about what you just saw... 5. Tool selection was guided by the knowledge of worldwide domain-experts encoded in globally-distributed ontologies (e.g. expert high-throughput statisticians, etc...). And this matters because…
  • 165. Remember Trend #2: Even small, moderately-funded laboratories can now afford to produce more data than they can manage or interpret. These labs will likely never be able to afford a qualified data scientist
  • 166. Remember Trend #2: Even small, moderately-funded laboratories can now afford to produce more data than they can manage or interpret. These labs will likely never be able to afford a qualified data scientist. But if the expert knowledge of data scientists is encoded in ontologies, and can be discovered in a contextually-aware manner… then this is a SOLVED PROBLEM
  • 167. Story #4: Personalized Health Info Can we make the Health information on the Web more “personal”?
  • 168. Remember when I said... The machine was contextually “aware of” BOTH the biological model AND the data it was analysing
  • 169. This “dual-awareness” provides some very interesting opportunities for personalizing a patient’s Health Research activity
  • 170. PROBLEM: Patients are self-educating, both by learning about their personal medical situation (e.g. getting themselves sequenced) and by surfing the Web, getting dubious advice from sites of dubious authority, and joining social-health groups to exchange (often anecdotal) medical “advice” with other patients
  • 171. PROBLEM: Patients are self-educating The information on any given site may or may not be relevant to THAT patient Information on the Web is, by nature, not personalized
  • 172. PROBLEM: Clinicians often have patients (especially chronically-ill patients) on a “trajectory” of treatment Medicine is complicated! e.g. the treatment trajectory of the patient can be multi-step, and a specific sign/symptom might be perfectly normal at a particular phase in their “flow” of treatment
  • 173. PROBLEM SUMMARY: Patients are reading non-personalized medical text of dubious quality and relevance. Clinicians have no way to intervene in this self-education process by explaining to patients how the information they read relates to their personal “health trajectory”
  • 174. Now you might see why this is so relevant! The machine was contextually “aware of” BOTH the biological model AND the data it was analysing
  • 175. This is an early prototype of a Patient-driven Personalized Medicine Web interface
  • 176. Basically, it is a set of SHARE queries, attached to a local database of patient information, running behind a Web bookmarklet
  • 177. The queries text-mine a Web page then compare the concepts in the page to the patient’s personal data using a SHARE query
  • 178. The queries text-mine a Web page then compare the concepts in the page to the patient’s personal data using a SHARE query (that could contain ontologies... ...ontologies designed by their clinician!!)
  • 183. Matching based on official name, compound name, brand name, trade name, or “common name” 
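A toy version of the matching step described above: find which of a patient's medications a Web page mentions, under any of its names. This is only a sketch of the idea, not the prototype's text-mining pipeline; the synonym table, the patient record, and the function name are hypothetical, and multi-word names are not handled.

```python
import re

# Hypothetical synonym table: official name -> names it may appear under.
SYNONYMS = {
    "acetaminophen": {"acetaminophen", "paracetamol", "tylenol"},
    "ibuprofen": {"ibuprofen", "advil"},
}


def mentioned_medications(page_text, patient_medications):
    """Return the patient's medications that the page mentions under any name."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    return sorted(
        drug
        for drug in patient_medications
        # Fall back to the official name alone when no synonyms are known.
        if SYNONYMS.get(drug, {drug}) & words
    )


page = "Is it safe to take Tylenol together with Advil?"
print(mentioned_medications(page, ["acetaminophen"]))
print(mentioned_medications(page, ["warfarin"]))
```

Only medications in the patient's own record are reported, so the same page produces different alerts for different patients.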
  • 184. Still needs some work... ??!?!?
  • 185.
  • 186.
  • 187.
  • 188. Link out to PubMed Why the alert?
  • 189.
  • 190. The SADI+SHARE workflow and reasoning was personalized to YOUR medical data
  • 191. In future iterations, we will enable the workflow to be further customized through “personalized” OWL Classes (e.g. Provided by your Clinician!!)
  • 192. These OWL Classes might include information about the current trajectory of your treatment for a chronic disease, for example, such that what you read on the Web is placed in the context of your expert Clinical care...
  • 193. Frankly, I think it’s quite cool that patients are creating and running “personal health-research” workflows at the touch of a button!
  • 194. Almost the end… Three brief final points....
  • 195. Publication Discourse Interpretation Hypothesis Experiment ? ?
  • 196. The Semantic Model represents a possible solution to a problem
  • 197. The Semantic Model represents a possible solution to a problem By my definition, that is a hypothesis
  • 198. The Semantic Model represents a possible solution to a problem That hypothesis is tested by automatically converting it into a workflow;
  • 199. The Semantic Model represents a possible solution to a problem That hypothesis is tested by automatically converting it into a workflow; the workflow, and the results of the workflow are intimately tied to the hypothesis
  • 200. The Semantic Model represents a possible solution to a problem i.e. You (or anyone!) can determine exactly which aspect of the hypothesis led to which output data element, why, and how
  • 201. The Semantic Model represents a possible solution to a problem “Exquisite Provenance” a perfect record not only of what was done, when, and how but also WHY
  • 202. And this is important because...
  • 203. “Exquisite Provenance” is required for the output data and knowledge to be published as...
  • 204. Richly annotated, citable, and queryable snippets of scientific knowledge encoded in Linked Data/OWL i.e. a way to publish data and knowledge on the Semantic Web
  • 205. Publication Discourse Interpretation Hypothesis Experiment
  • 206. A “modest” vision for pure in silico Science
  • 207.
  • 208. Last point… perhaps this is not yet obvious…
  • 209. SADI services consume Linked Data on the Web
  • 210. SADI services consume Linked Data on the Web The ontologies provided to SHARE are written in OWL, and are therefore inherently part of the Web
  • 211. SADI services consume Linked Data on the Web The ontologies provided to SHARE are written in OWL, and are therefore inherently part of the Web SADI services create novel semantic links between existing data-points on the Web, or between existing data and new data
  • 212. SADI services consume Linked Data on the Web The ontologies provided to SHARE are written in OWL, and are therefore inherently part of the Web SADI services create novel semantic links between existing data-points on the Web, or between existing data and new data The output of the automatically-generated workflow is therefore Linked Data and is therefore inherently part of the Web
  • 213. SADI services consume Linked Data on the Web The ontologies provided to SHARE are written in OWL, and are therefore inherently part of the Web SADI services create novel semantic links between existing data-points on the Web, or between existing data and new data The output of the automatically-generated workflow is therefore Linked Data and is therefore inherently part of the Web The concluding NanoPublications are a combination of Linked Data and OWL, and are published directly to the Web
  • 214. The Life Science “Singularity” We Are Here! The Semantic Web is a cradle-to-grave biomedical research platform that can, and will, dramatically improve how biomedical research is done
  • 215. The important people Luke McCarthy (SADI/SHARE) Benjamin Vandervalk (SHARE) Dr. Soroush Samadian (clinical experiments) Ian Wood (Experiment-replication experiment)