SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Bio2RDF Release 2: Improved coverage,
interoperability and provenance of Life
Science Linked Data
Linked Data for the Life Sciences
Alison Callahan1, Jose Cruz-Toledo1, Peter Ansell2, Michel Dumontier1
1Carleton University, 2University of Queensland
ESWC2013::Bio2RDF Release 21
ESWC2013::Bio2RDF Release 2
is an open source framework
on the emerging semantic web
to produce and provide biological linked data
that uses simple conventions
2
ESWC2013::Bio2RDF Release 2
reduces the time and effort
so that you can get to
involved in data integration
doing science
3
Main features of Bio2RDF Release 2
• Bio2RDF conversion scripts, mapping files and web
application are open source and freely available at
http://github.com/bio2rdf
• Bio2RDF enables (syntactic) data integration within and
across datasets by using one language (RDF), having a
common URI pattern and a common resource registry
• 19 Release 2 datasets have provenance and endpoints
feature pre-computed graph summaries for fast lookup
• Bio2RDF web application enables entity resolution, query
federation across an expandable distributed network of
SPARQL endpoints
ESWC2013::Bio2RDF Release 24
ESWC2013::Bio2RDF Release 2
At the heart of Linked Data for the Life Sciences
5
Bio2RDF data are identified using
simple http URI patterns
Bio2RDF data are identified by Internationalized Resource Identifiers
(IRIs) of the form:
• http://bio2rdf.org/namespace:identifier
for source data with an assigned identifier
• http://bio2rdf.org/namespace_resource:identifier
for source data without an assigned identifier
• http://bio2rdf.org/namespace_vocabulary:identifier
for dataset-specific types and relations
Where namespace comes from a curated registry of datasets, hence
enabling simple syntactic-based integration in and across datasets.
ESWC2013::Bio2RDF Release 26
Data are described through machine-
understandable statements
ESWC2013::Bio2RDF Release 2
drugbank:DB00650
drugbank_vocabulary:Drug
rdf:type
drugbank_resource:DB00440_DB00650
drugbank_vocabulary:Drug-Drug-Interaction
rdf:type
drugbank_vocabulary:ddi-interactor-in
rdfs:label
DDI between Trimethoprim and
Leucovorin [drugbank_resource:
DB00440_DB00650]
rdfs:label
Leucovorin [drugbank:DB00650]
7
The linked data network expands
with inter-dataset statements
ESWC2013::Bio2RDF Release 2
drugbank:DB00650
pharmgkb_vocabulary:Drug
rdf:type
rdfs:label
Leucovorin [drugbank:DB00650]
pharmgkb:PA450198
drugbank_vocabulary:Drug
pharmgkb_vocabulary:xref
leucovorin [pharmgkb:PA450198]
rdfs:label
DrugBank
PharmGKB
8
Linked Data: You can look it up
ESWC2013::Bio2RDF Release 29
You can get what links to it
ESWC2013::Bio2RDF Release 2
http://bio2rdf.org/linksns/drugbank/drugbank:DB00650
What links to DrugBank’s Leucovorin?
10
Every Bio2RDF dataset now contains
provenance metadata
ESWC2013::Bio2RDF Release 2
Features
- Entity-dataset link
- Creator
- Publisher
- Date created
- License & rights
- Source
- Availability
- SPARQL endpoint
- Data dump
Vocabularies
VoID
Dublin Core
W3C Provenance
Bio2RDF vocabulary
11
data item
Bio2RDF
dataset
Source
dataset
void:inDataset
prov:wasDerivedFrom
Dataset Namespace # of triples
Affymetrix affymetrix 44469611
Biomodels* biomodels 589753
Comparative Toxicogenomics Database ctd 141845167
DrugBank drugbank 1121468
NCBI Gene ncbigene 394026267
Gene Ontology Annotations goa 80028873
HUGO Gene Nomenclature Committee hgnc 836060
Homologene homologene 1281881
InterPro* interpro 999031
iProClass iproclass 211365460
iRefIndex irefindex 31042135
Medical Subject Headings mesh 4172230
NCBO BioPortal* bioportal 15384622
National Drug Code Directory* ndc 17814216
Online Mendelian Inheritance in Man omim 1848729
Pharmacogenomics Knowledge Base pharmgkb 37949275
SABIO-RK* sabiork 2618288
Saccharomyces Genome Database sgd 5551009
NCBI Taxonomy taxon 17814216
Total 19 1,010,758,291
Bio2RDF Release 2 – New and Updated Datasets
ESWC2013::Bio2RDF Release 212
Inter-dataset connectivity
ESWC2013::Bio2RDF Release 213
ESWC2013::Bio2RDF Release 214
The Wider Network of Bio2RDF Linked Data
ESWC2013::Bio2RDF Release 215
ESWC2013::Bio2RDF Release 216
ESWC2013::Bio2RDF Release 217
ESWC2013::Bio2RDF Release 218
Graph summaries in query formulation
ESWC2013::Bio2RDF Release 2
PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ddi ?d1name ?d2name
WHERE {
?ddi a drugbank_vocabulary:Drug-Drug-Interaction .
?d1 drugbank_vocabulary:ddi-interactor-in ?ddi .
?d1 rdfs:label ?d1name .
?d2 drugbank_vocabulary:ddi-interactor-in ?ddi .
?d2 rdfs:label ?d2name.
FILTER (?d1 != ?d2)
}
19
You can use the SPARQLed query
assistant with updated endpoints
ESWC2013::Bio2RDF Release 2
http://sindicetech.com/sindice-suite/sparqled/
graph: http://sindicetech.com/analytics20
Use virtuoso’s built in faceted
browser to construct increasingly
complex queries with little effort
ESWC2013::Bio2RDF Release 221
Federated Queries over independent
SPARQL endpoints
# get all biochemical reactions in biomodels that are kinds of "protein catabolic
process“, as defined by the gene ontology (in bioportal endpoint)
SPARQL Endpoint: http://bioportal.bio2rdf.org/sparql
SELECT ?go ?label count(distinct ?x)
WHERE {
?go rdfs:label ?label .
?go rdfs:subClassOf ?tgo OPTION (TRANSITIVE) .
?tgo rdfs:label ?tlabel .
FILTER regex(?tlabel, "^protein catabolic process")
service <http://biomodels.bio2rdf.org/sparql> {
?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go .
?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> .
} # end service
}
ESWC2013::Bio2RDF Release 222
Question: Find all proteins that interact with beta
amyloid
SELECT * WHERE {
?protein a bio2rdf:Protein .
?protein bio2rdf:interacts_with bio2rdf:beta-amyloid
.
}
Heterogeneous biological data on the
semantic web is difficult to query
UniProt Protein PDB Protein
iRefIndex Protein
?
Physical interaction?
Pathway interaction?
Genetic interaction?
ESWC2013::Bio2RDF Release 223
ontology as a
strategy to formally
represent and
integrate knowledge
ESWC2013::Bio2RDF Release 224
uniprot:P05067
uniprot:Protein
is a
sio:gene
is a is a
Semantic data integration, consistency checking and
query answering over Bio2RDF with the
Semanticscience Integrated Ontology (SIO)
dataset
ontology
Knowledge Base
ESWC2013::Bio2RDF Release 2
pharmgkb:PA30917
refseq:Protein
is a
is a
omim:189931
omim:Gene pharmgkb:Gene
Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and
Michel Dumontier. Bio-ontologies 2012.
25
ESWC2013::Bio2RDF Release 226
SRIQ(D)
10700+ axioms
1300+ classes
201 object properties (inc. inverses)
1 datatype property
Bio2RDF and SIO powered SPARQL 1.1 federated query:
Find chemicals in CTD and proteins in SGD that participate
in the same GO process
SELECT ?chem, ?prot, ?proc
FROM <http://bio2rdf.org/ctd>
WHERE {
?chemical a sio:chemical-entity.
?chemical rdfs:label ?chem.
?chemical sio:is-participant-in ?process.
?process rdfs:label ?proc.
FILTER regex (?process, "http://bio2rdf.org/go:")
SERVICE <http://sgd.bio2rdf.org/sparql> {
?protein a sio:protein .
?protein sio:is-participant-in ?process.
?protein rdfs:label ?prot .
}
}
ESWC2013::Bio2RDF Release 227
Bio2RDF RDFization guidelines are
available at our wiki
https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-Guide-v1.1
ESWC2013::Bio2RDF Release 228
What the future holds
• Aiming for twice yearly release schedule. Next release will
include large datasets (>15B RefSeq, Genbank, PubMed, PDB)
• Working with identifiers.org to create a common registry with
2200 entries
• Spiking in identifiers.org and original data uris, where
available
• Consolidating provenance/metrics with OpenPHACTS
• Incorporate W3C Linking Open Drug Data (LODD) effort
– OMIM (released), SIDER (beta), CHEMBL (beta), LinkedCT
(beta), DailyMed (RDF available), TCM (RDF available),
• Extended dataset coverage by tapping into existing endpoints
(uniprot, bioportal, ebi-rdf?)
• Showcase with other third party tools
ESWC2013::Bio2RDF Release 229
Bio2RDF Release 2 – A summary
• Updated data conversion source code to use PHP API (all
available through GitHub)
– http://github.com/bio2rdf/bio2rdf-scripts
– https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-
Guide-v1.1
• Simple Bio2RDF IRI design patterns that facilitate syntactic
consistency and interoperability backed by simple registry
• Dataset provenance and metrics
• We welcome comments, suggestions and contributions
• Join our mailing list at bio2rdf@googlegroups.com
ESWC2013::Bio2RDF Release 230
Acknowledgements
Bio2RDF Release 2
Allison Callahan, Jose Cruz-Toledo, Peter Ansell
Bio2RDF
Francois Belleau, Marc-Alexandre Nolin
Alex De Leon, Steve Etlinger, Nichealla Keath
Jacques Corbeil, James Hogan Jean Morissette, Nicole
Tourigny, Philippe Rigault and Paul Roe
ESWC2013::Bio2RDF Release 231
dumontierlab.com
michel_dumontier@carleton.ca
ESWC2013::Bio2RDF Release 2
Website: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier
32

Weitere ähnliche Inhalte

Andere mochten auch

Lee.Portfolio 09
Lee.Portfolio 09Lee.Portfolio 09
Lee.Portfolio 09leesalcone
 
Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1Nural Tataoglu
 
powerpoint
powerpointpowerpoint
powerpointUruguayo
 
Laporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratoriumLaporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratoriumSamantars17
 
Exposicion de financiamientos
Exposicion de financiamientosExposicion de financiamientos
Exposicion de financiamientosAlita Orpe
 
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασηςεργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασηςπεντάλ σχολικό
 
Understanding Android Handling of Touch Events
Understanding Android Handling of Touch EventsUnderstanding Android Handling of Touch Events
Understanding Android Handling of Touch Eventsjensmohr
 
Mobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et BemobeeMobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et BemobeeFranck Deville
 
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...Unmetric
 
(151105)영화속 와인 이야기 v2
(151105)영화속 와인 이야기   v2(151105)영화속 와인 이야기   v2
(151105)영화속 와인 이야기 v2휘웅 정
 
Short Intro to Android Fragments
Short Intro to Android FragmentsShort Intro to Android Fragments
Short Intro to Android FragmentsJussi Pohjolainen
 

Andere mochten auch (18)

Lee.Portfolio 09
Lee.Portfolio 09Lee.Portfolio 09
Lee.Portfolio 09
 
Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1
 
Les xarxes socials
Les xarxes socialsLes xarxes socials
Les xarxes socials
 
powerpoint
powerpointpowerpoint
powerpoint
 
resume
resumeresume
resume
 
Clases de noviembre 5 ños 1
Clases de noviembre 5 ños 1Clases de noviembre 5 ños 1
Clases de noviembre 5 ños 1
 
Laporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratoriumLaporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratorium
 
Baby talk 2
Baby talk 2Baby talk 2
Baby talk 2
 
Exposicion de financiamientos
Exposicion de financiamientosExposicion de financiamientos
Exposicion de financiamientos
 
Great quotes of bruce lee
Great quotes of bruce leeGreat quotes of bruce lee
Great quotes of bruce lee
 
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασηςεργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
 
Understanding Android Handling of Touch Events
Understanding Android Handling of Touch EventsUnderstanding Android Handling of Touch Events
Understanding Android Handling of Touch Events
 
Mobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et BemobeeMobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et Bemobee
 
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
 
(151105)영화속 와인 이야기 v2
(151105)영화속 와인 이야기   v2(151105)영화속 와인 이야기   v2
(151105)영화속 와인 이야기 v2
 
Symfony day 2016
Symfony day 2016Symfony day 2016
Symfony day 2016
 
Conductores electricos
Conductores electricosConductores electricos
Conductores electricos
 
Short Intro to Android Fragments
Short Intro to Android FragmentsShort Intro to Android Fragments
Short Intro to Android Fragments
 

Ähnlich wie 2013 eswc-bio2rdf-r2

Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013François Belleau
 
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Michel Dumontier
 
Presentation forpd bj_1
Presentation forpd bj_1Presentation forpd bj_1
Presentation forpd bj_1Maori Ito
 
Chem2bio2rdf portal
Chem2bio2rdf portalChem2bio2rdf portal
Chem2bio2rdf portalBin Chen
 
How can you access PubChem programmatically?
How can you access PubChem programmatically?How can you access PubChem programmatically?
How can you access PubChem programmatically?Sunghwan Kim
 
Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)Dag Endresen
 
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...Dag Endresen
 
Use of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformaticsUse of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformaticsRemzi Çelebi
 
Gene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisGene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisUC Davis
 
Towards semantic systems chemical biology
Towards semantic systems chemical biology Towards semantic systems chemical biology
Towards semantic systems chemical biology Bin Chen
 
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeChunlei Wu
 
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)Dag Endresen
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meetingJohannes Keizer
 

Ähnlich wie 2013 eswc-bio2rdf-r2 (20)

Bio2RDF @ W3C HCLS2009
Bio2RDF @ W3C HCLS2009Bio2RDF @ W3C HCLS2009
Bio2RDF @ W3C HCLS2009
 
2016 bmdid-mappings
2016 bmdid-mappings2016 bmdid-mappings
2016 bmdid-mappings
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
Querying Bio2RDF data
Querying Bio2RDF dataQuerying Bio2RDF data
Querying Bio2RDF data
 
SADI CSHALS 2013
SADI CSHALS 2013SADI CSHALS 2013
SADI CSHALS 2013
 
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
 
Bio2RDF@BH2010
Bio2RDF@BH2010Bio2RDF@BH2010
Bio2RDF@BH2010
 
Presentation forpd bj_1
Presentation forpd bj_1Presentation forpd bj_1
Presentation forpd bj_1
 
Chem2bio2rdf portal
Chem2bio2rdf portalChem2bio2rdf portal
Chem2bio2rdf portal
 
How can you access PubChem programmatically?
How can you access PubChem programmatically?How can you access PubChem programmatically?
How can you access PubChem programmatically?
 
Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)
 
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
 
Use of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformaticsUse of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformatics
 
Gene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisGene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment Analysis
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 
Towards semantic systems chemical biology
Towards semantic systems chemical biology Towards semantic systems chemical biology
Towards semantic systems chemical biology
 
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
 
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
 
AGROVOC, AGRIS and the CIARD RING, using RDF vocabularies and technologies f...
AGROVOC, AGRIS and the CIARD RING,  using RDF vocabularies and technologies f...AGROVOC, AGRIS and the CIARD RING,  using RDF vocabularies and technologies f...
AGROVOC, AGRIS and the CIARD RING, using RDF vocabularies and technologies f...
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meeting
 

Mehr von Michel Dumontier

A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsMichel Dumontier
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsMichel Dumontier
 
The Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemThe Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemMichel Dumontier
 
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...Michel Dumontier
 
The role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemThe role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemMichel Dumontier
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Michel Dumontier
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Michel Dumontier
 
Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Michel Dumontier
 
The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...Michel Dumontier
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerMichel Dumontier
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureMichel Dumontier
 
Developing and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesDeveloping and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesMichel Dumontier
 
Advancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRAdvancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRMichel Dumontier
 
A Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsA Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsMichel Dumontier
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationMichel Dumontier
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessMichel Dumontier
 

Mehr von Michel Dumontier (20)

A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge Graphs
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge Graphs
 
Evaluating FAIRness
Evaluating FAIRnessEvaluating FAIRness
Evaluating FAIRness
 
The Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemThe Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health System
 
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
 
The role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemThe role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health System
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
 
Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?
 
The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University Dinner
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star Lecture
 
Are we FAIR yet?
Are we FAIR yet?Are we FAIR yet?
Are we FAIR yet?
 
Developing and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesDeveloping and assessing FAIR digital resources
Developing and assessing FAIR digital resources
 
Advancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRAdvancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIR
 
A Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsA Framework to develop the FAIR Metrics
A Framework to develop the FAIR Metrics
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluation
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRness
 
Data Science for the Win
Data Science for the WinData Science for the Win
Data Science for the Win
 
Ontologies
OntologiesOntologies
Ontologies
 

Kürzlich hochgeladen

Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 

Kürzlich hochgeladen (20)

Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 

2013 eswc-bio2rdf-r2

  • 1. Bio2RDF Release 2: Improved coverage, interoperability and provenance of Life Science Linked Data Linked Data for the Life Sciences Alison Callahan1, Jose Cruz-Toledo1, Peter Ansell2, Michel Dumontier1 1Carleton University, 2University of Queensland ESWC2013::Bio2RDF Release 21
  • 2. ESWC2013::Bio2RDF Release 2 is an open source framework on the emerging semantic web to produce and provide biological linked data that uses simple conventions 2
  • 3. ESWC2013::Bio2RDF Release 2 reduces the time and effort so that you can get to involved in data integration doing science 3
  • 4. Main features of Bio2RDF Release 2 • Bio2RDF conversion scripts, mapping files and web application are open source and freely available at http://github.com/bio2rdf • Bio2RDF enables (syntactic) data integration within and across datasets by using one language (RDF), having a common URI pattern and a common resource registry • 19 Release 2 datasets have provenance and endpoints feature pre-computed graph summaries for fast lookup • Bio2RDF web application enables entity resolution, query federation across an expandable distributed network of SPARQL endpoints ESWC2013::Bio2RDF Release 24
  • 5. ESWC2013::Bio2RDF Release 2 At the heart of Linked Data for the Life Sciences 5
  • 6. Bio2RDF data are identified using simple http URI patterns Bio2RDF data are identified by Internationalized Resource Identifiers (IRIs) of the form: • http://bio2rdf.org/namespace:identifier for source data with an assigned identifier • http://bio2rdf.org/namespace_resource:identifier for source data without an assigned identifier • http://bio2rdf.org/namespace_vocabulary:identifier for dataset-specific types and relations Where namespace comes from a curated registry of datasets, hence enabling simple syntactic-based integration in and across datasets. ESWC2013::Bio2RDF Release 26
  • 7. Data are described through machine- understandable statements ESWC2013::Bio2RDF Release 2 drugbank:DB00650 drugbank_vocabulary:Drug rdf:type drugbank_resource:DB00440_DB00650 drugbank_vocabulary:Drug-Drug-Interaction rdf:type drugbank_vocabulary:ddi-interactor-in rdfs:label DDI between Trimethoprim and Leucovorin [drugbank_resource: DB00440_DB00650] rdfs:label Leucovorin [drugbank:DB00650] 7
  • 8. The linked data network expands with inter-dataset statements ESWC2013::Bio2RDF Release 2 drugbank:DB00650 pharmgkb_vocabulary:Drug rdf:type rdfs:label Leucovorin [drugbank:DB00650] pharmgkb:PA450198 drugbank_vocabulary:Drug pharmgkb_vocabulary:xref leucovorin [pharmgkb:PA450198] rdfs:label DrugBank PharmGKB 8
  • 9. Linked Data: You can look it up ESWC2013::Bio2RDF Release 29
  • 10. You can get what links to it ESWC2013::Bio2RDF Release 2 http://bio2rdf.org/linksns/drugbank/drugbank:DB00650 What links to DrugBank’s Leucovorin? 10
  • 11. Every Bio2RDF dataset now contains provenance metadata ESWC2013::Bio2RDF Release 2 Features - Entity-dataset link - Creator - Publisher - Date created - License & rights - Source - Availability - SPARQL endpoint - Data dump Vocabularies VoID Dublin Core W3C Provenance Bio2RDF vocabulary 11 data item Bio2RDF dataset Source dataset void:inDataset prov:wasDerivedFrom
  • 12. Dataset Namespace # of triples Affymetrix affymetrix 44469611 Biomodels* biomodels 589753 Comparative Toxicogenomics Database ctd 141845167 DrugBank drugbank 1121468 NCBI Gene ncbigene 394026267 Gene Ontology Annotations goa 80028873 HUGO Gene Nomenclature Committee hgnc 836060 Homologene homologene 1281881 InterPro* interpro 999031 iProClass iproclass 211365460 iRefIndex irefindex 31042135 Medical Subject Headings mesh 4172230 NCBO BioPortal* bioportal 15384622 National Drug Code Directory* ndc 17814216 Online Mendelian Inheritance in Man omim 1848729 Pharmacogenomics Knowledge Base pharmgkb 37949275 SABIO-RK* sabiork 2618288 Saccharomyces Genome Database sgd 5551009 NCBI Taxonomy taxon 17814216 Total 19 1,010,758,291 Bio2RDF Release 2 – New and Updated Datasets ESWC2013::Bio2RDF Release 212
  • 15. The Wider Network of Bio2RDF Linked Data ESWC2013::Bio2RDF Release 215
  • 19. Graph summaries in query formulation ESWC2013::Bio2RDF Release 2 PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?ddi ?d1name ?d2name WHERE { ?ddi a drugbank_vocabulary:Drug-Drug-Interaction . ?d1 drugbank_vocabulary:ddi-interactor-in ?ddi . ?d1 rdfs:label ?d1name . ?d2 drugbank_vocabulary:ddi-interactor-in ?ddi . ?d2 rdfs:label ?d2name. FILTER (?d1 != ?d2) } 19
  • 20. You can use the SPARQLed query assistant with updated endpoints ESWC2013::Bio2RDF Release 2 http://sindicetech.com/sindice-suite/sparqled/ graph: http://sindicetech.com/analytics20
  • 21. Use virtuoso’s built in faceted browser to construct increasingly complex queries with little effort ESWC2013::Bio2RDF Release 221
  • 22. Federated Queries over independent SPARQL endpoints # get all biochemical reactions in biomodels that are kinds of "protein catabolic process“, as defined by the gene ontology (in bioportal endpoint) SPARQL Endpoint: http://bioportal.bio2rdf.org/sparql SELECT ?go ?label count(distinct ?x) WHERE { ?go rdfs:label ?label . ?go rdfs:subClassOf ?tgo OPTION (TRANSITIVE) . ?tgo rdfs:label ?tlabel . FILTER regex(?tlabel, "^protein catabolic process") service <http://biomodels.bio2rdf.org/sparql> { ?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go . ?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> . } # end service } ESWC2013::Bio2RDF Release 222
  • 23. Question: Find all proteins that interact with beta amyloid SELECT * WHERE { ?protein a bio2rdf:Protein . ?protein bio2rdf:interacts_with bio2rdf:beta-amyloid . } Heterogeneous biological data on the semantic web is difficult to query UniProt Protein PDB Protein iRefIndex Protein ? Physical interaction? Pathway interaction? Genetic interaction? ESWC2013::Bio2RDF Release 223
  • 24. ontology as a strategy to formally represent and integrate knowledge ESWC2013::Bio2RDF Release 224
  • 25. uniprot:P05067 uniprot:Protein is a sio:gene is a is a Semantic data integration, consistency checking and query answering over Bio2RDF with the Semanticscience Integrated Ontology (SIO) dataset ontology Knowledge Base ESWC2013::Bio2RDF Release 2 pharmgkb:PA30917 refseq:Protein is a is a omim:189931 omim:Gene pharmgkb:Gene Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and Michel Dumontier. Bio-ontologies 2012. 25
  • 26. ESWC2013::Bio2RDF Release 226 SRIQ(D) 10700+ axioms 1300+ classes 201 object properties (inc. inverses) 1 datatype property
  • 27. Bio2RDF and SIO powered SPARQL 1.1 federated query: Find chemicals in CTD and proteins in SGD that participate in the same GO process SELECT ?chem, ?prot, ?proc FROM <http://bio2rdf.org/ctd> WHERE { ?chemical a sio:chemical-entity. ?chemical rdfs:label ?chem. ?chemical sio:is-participant-in ?process. ?process rdfs:label ?proc. FILTER regex (?process, "http://bio2rdf.org/go:") SERVICE <http://sgd.bio2rdf.org/sparql> { ?protein a sio:protein . ?protein sio:is-participant-in ?process. ?protein rdfs:label ?prot . } } ESWC2013::Bio2RDF Release 227
  • 28. Bio2RDF RDFization guidelines are available at our wiki https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-Guide-v1.1 ESWC2013::Bio2RDF Release 228
  • 29. What the future holds • Aiming for twice yearly release schedule. Next release will include large datasets (>15B RefSeq, Genbank, PubMed, PDB) • Working with identifiers.org to create a common registry with 2200 entries • Spiking in identifiers.org and original data uris, where available • Consolidating provenance/metrics with OpenPHACTS • Incorporate W3C Linking Open Drug Data (LODD) effort – OMIM (released), SIDER (beta), CHEMBL (beta), LinkedCT (beta), DailyMed (RDF available), TCM (RDF available), • Extended dataset coverage by tapping into existing endpoints (uniprot, bioportal, ebi-rdf?) • Showcase with other third party tools ESWC2013::Bio2RDF Release 229
  • 30. Bio2RDF Release 2 – A summary • Updated data conversion source code to use PHP API (all available through GitHub) – http://github.com/bio2rdf/bio2rdf-scripts – https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization- Guide-v1.1 • Simple Bio2RDF IRI design patterns that facilitate syntactic consistency and interoperability backed by simple registry • Dataset provenance and metrics • We welcome comments, suggestions and contributions • Join our mailing list at bio2rdf@googlegroups.com ESWC2013::Bio2RDF Release 230
  • 31. Acknowledgements Bio2RDF Release 2 Allison Callahan, Jose Cruz-Toledo, Peter Ansell Bio2RDF Francois Belleau, Marc-Alexandre Nolin Alex De Leon, Steve Etlinger, Nichealla Keath Jacques Corbeil, James Hogan Jean Morissette, Nicole Tourigny, Philippe Rigault and Paul Roe ESWC2013::Bio2RDF Release 231
  • 32. dumontierlab.com michel_dumontier@carleton.ca ESWC2013::Bio2RDF Release 2 Website: http://dumontierlab.com Presentations: http://slideshare.com/micheldumontier 32

Hinweis der Redaktion

  1. Bio2RDF facilitates data sharing and reuse in a number of ways: Access the scripts that generate our linked data. These scripts are openly licensed for modification, redistribution or use by anyone wishing to generate RDF data on their own We make use of syntactically consistent IRI patterns We do not impose a structural scheme for representing our data Bio2RDF allows for multiple mirrors host our data, a DNS “round robin” procedure balances incoming requests between participating mirrors, this prevents a single point of failure and allows for additional mirrors to be added to the network A centralized resource registry has been developed for consistent usage of namespaces and vocabularies- In the next slides I will go into a bit more detail about the first 3 points
  2. The Bio2RDF project transforms silos of life science data into a globally distributed network of linked data for biological knowledge discovery.
  3. In order to keep a clear link back to the original data, in our RDFized datasets we maintain the original data provider’s record identifiers by making use of the following URI pattern:- namespace: preferred short name for a biological dataset
  4. All resources in every bio2RDF graph is linked to a graph that describes the provenance of the generated linked data. We make use of the W3C Vocabulary of Interlinked Datasets (VoID), the Provenance vocabulary and Dublin core vocabulary to describe items such as: URL to the source data Date of generation Licensing information (if available) URL to the script used to generate the data Download URL for linked data files URL of SPARQL endpoint
  5. Here I present some numbers for the 19 updated datasets
  6. Here I am showing with more detail the top 10 subject type – predicate - object type frequencies that can be found in our Drugbank endpoint So for example to find drugs that participate in drug-drug interactions one would use the ddi-interactor-in predicate that associates the 1074 resources of type drug to the 10891 resources typed as drug-drug-interaction.
  7. We are currently in the process of exposing Bio2RDF data using a variety of 3rd party tools- specifically, all of our endpoints can be navigated using Virtuoso’s faceted browser- We have also generated Sindice metrics that enable the use of SPARQLedautomed query builder. (This tool allows users to semiautomatically construct SPARQL queries based on namespaces used therein)-- finally it is possible to use Sig.ma browser to aggreate data from multiple endpoints and construct mashups
  8. However, types and predicates are dataset specific and therefore not consistentWouldn’t it be nice to be able to query all of these datasets with a common nomenclature? (our objective)