This presentation introduces the Data2Semantics project, which aims to build useful services and tools for data publishers that maintain provenance information and cater for the entire research cycle, including a feedback loop to new research. One use case is a VIVO installation for COMMIT/ that demonstrates collaboration within the research community and integrates project results with the collaboration network. Future work includes improving metadata extraction, ingesting additional content, developing a shared ontology between VIVO installations, and implementing reward mechanisms for individual authors.
COMMIT/VIVO
1. Rinke Hoekstra and Adianto Wibisono
VU University Amsterdam/University of Amsterdam
rinke.hoekstra@vu.nl
What is Data2Semantics?
6. Next Steps...
8. Data2Semantics
From Data to Semantics for Scientific Data Publishers http://www.data2semantics.org
TabLinker
Semi-Automatic RDF Converter for Eccentric Excel Files
Monday, February 27, 12
Yasgui
Provenance Reconstruction
Data Analysis: COMPLEXITY vs. INTERESTINGNESS?
PROV-O-Matic™
9. PROV-O-Matic
• Python Wrapper script for shell commands
https://github.com/Data2Semantics/data/blob/master/src/d2s/prov.py
• Output in PROV-O & W3C Time vocabulary
• Timestamped URIs for files/resources
• ... integrate with GIT?
HUBBLE
Linked Data Hub for Clinical Decision Support
Hubble demonstrates three ‘sales pitches’ of linked data: inter-operability, interlinking and tool availability.
• Provenance trail for conversion, loading and linking
AERS-LD
serious adverse event reports exposed as linked data
From patient to:
- Relevant publications
- Related adverse events
- Clinical trials
- Drug information
- Known side effects
- Statistical analysis
[Diagram, Hubble components: Papers & Guidelines; BioPortal Annotator; Annotation Ontology; SILK link specification language; Mesh, MedDRA, SnomedCT, etc. and PROV-O; LOD Cloud; UMLS, DBPedia, Sider, Drugbank, LinkedCT; Google WebToolkit; 4Store; Partial Replication]
[Diagram, Data2Semantics pipeline: acquiring data from text?; Semi-Automatic Annotation (e.g. GATE, OpenCalais); Semi-Automatic Conversion (“tablinker”); RDF Conversion; RDF Cleaning; Internal Linking; Link to Other Data; xml2rdf; d2rq; rdb2rdf; Amalgame; SILK; Provenance and Enrichment; Querying; Graph Rewriting and Ranking; Cloud Analysis/Metrics; Visualization (sgvizler); User Interfaces (AIDA Browser, Poseidon (Pirates/Maps), …); RDF Feedback; Provenance]
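The wrapper idea on slide 9 (run a shell command, then describe the run in PROV-O with timestamped URIs) can be sketched roughly as follows. This is not the code in `prov.py`; it is a minimal illustration using only the Python standard library. The `http://example.org/prov/` namespace and the function names are assumptions made for this example, and it records times with `prov:startedAtTime` and `prov:endedAtTime` instead of the W3C Time vocabulary mentioned on the slide.

```python
import subprocess
import sys
from datetime import datetime, timezone
from urllib.parse import quote

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"
BASE = "http://example.org/prov/"  # hypothetical namespace, not the project's


def timestamped_uri(name, when):
    """Mint a URI that embeds a UTC timestamp, so each run or file version is distinct."""
    stamp = when.strftime("%Y%m%dT%H%M%S%fZ")
    return f"{BASE}{quote(name, safe='')}/{stamp}"


def run_with_provenance(command, inputs=(), outputs=()):
    """Run a command and return (exit code, list of N-Triples lines describing the run)."""
    started = datetime.now(timezone.utc)
    completed = subprocess.run(command, check=False)
    ended = datetime.now(timezone.utc)

    # The run itself is a prov:Activity with a start and end time.
    activity = timestamped_uri(command[0], started)
    triples = [
        f"<{activity}> <{RDF_TYPE}> <{PROV}Activity> .",
        f'<{activity}> <{PROV}startedAtTime> "{started.isoformat()}"^^<{XSD_DATETIME}> .',
        f'<{activity}> <{PROV}endedAtTime> "{ended.isoformat()}"^^<{XSD_DATETIME}> .',
    ]
    # Input files are entities the activity used.
    for path in inputs:
        entity = timestamped_uri(path, started)
        triples.append(f"<{entity}> <{RDF_TYPE}> <{PROV}Entity> .")
        triples.append(f"<{activity}> <{PROV}used> <{entity}> .")
    # Output files are entities generated by the activity.
    for path in outputs:
        entity = timestamped_uri(path, ended)
        triples.append(f"<{entity}> <{RDF_TYPE}> <{PROV}Entity> .")
        triples.append(f"<{entity}> <{PROV}wasGeneratedBy> <{activity}> .")
    return completed.returncode, triples
```

A call such as `run_with_provenance([sys.executable, "-c", "pass"], inputs=["in.csv"], outputs=["out.ttl"])` returns the exit code and the triples, which could then be written to a file or loaded into a triple store. The caller has to name the input and output files explicitly; the sketch does not detect them.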
10. Key Points
• Build useful services and tools for data publishers ...
• ... that maintain provenance information ...
• ... and cater for the entire research cycle ...
• ... including a feedback loop to new research
13. • Public-private research community
• Emphasis on applications of IT
• Emphasis on knowledge transfer
• 15 projects
• Collaboration with EIT ICT-Labs
http://www.eitictlabs.eu/
http://www.commit-nl.nl
14. Why VIVO?
• Demonstrate collaboration within COMMIT/
between projects (synergy), between organizations
• Integrate project results with collaboration network
shared publications, deliverables
Linked Data Rubik’s Cube by Duncan Hull
16. Why ?
Most Dutch universities
Large companies
Government organizations
19. The Data
• COMMIT Website
http://www.commit-nl.nl
• All project plans (buzzword mining)
• All public deliverables (~200 per year)
• All participating persons (not just researchers)
20. “Pilot”
• Scraping
• Web Karma
http://bit.ly/WebKarma
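To give an idea of what the scraping step in the pilot involves, here is a minimal sketch that extracts people from an HTML fragment and splits each name into first name, family name and affiliation. The markup, the class names and the sample people are invented for this example; the real COMMIT/ website is structured differently. The particle list for Dutch family names is illustrative and incomplete.

```python
from html.parser import HTMLParser

# Dutch family-name particles ("tussenvoegsels"); an illustrative, incomplete set.
PARTICLES = {"van", "de", "der", "den", "ten", "ter", "te", "het", "'t"}

# Hypothetical markup with made-up people; not taken from the COMMIT/ website.
SAMPLE_HTML = """
<ul>
  <li class="person"><span class="name">Jan Jansen</span>
      <span class="affiliation">Example University</span></li>
  <li class="person"><span class="name">Anna van der Berg</span>
      <span class="affiliation">Example Company</span></li>
</ul>
"""


def split_name(full_name):
    """Split a display name into (first name, family name), keeping particles with the family name."""
    parts = full_name.split()
    if len(parts) < 2:
        return full_name.strip(), ""
    # The family name starts at the first lowercase particle, else at the last token.
    for i, token in enumerate(parts[1:], start=1):
        if token.lower() in PARTICLES and token == token.lower():
            return " ".join(parts[:i]), " ".join(parts[i:])
    return " ".join(parts[:-1]), parts[-1]


class PeopleParser(HTMLParser):
    """Collect the text of name and affiliation spans inside person list items."""

    def __init__(self):
        super().__init__()
        self.people = []
        self._field = None

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "person":
            self.people.append({"name": "", "affiliation": ""})
        elif tag == "span" and css_class in ("name", "affiliation") and self.people:
            self._field = css_class

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        if self._field:
            self.people[-1][self._field] += data.strip()


def scrape_people(html):
    """Return one record per person with first name, family name and affiliation."""
    parser = PeopleParser()
    parser.feed(html)
    parser.close()
    records = []
    for person in parser.people:
        first, family = split_name(person["name"])
        records.append({"first_name": first, "family_name": family,
                        "affiliation": person["affiliation"]})
    return records
```

Calling `scrape_people(SAMPLE_HTML)` yields two records. Name splitting by rule is fragile (compound first names, non-Dutch conventions), which is one reason a scraper like this needs manual review or a curated source.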
25. Future Work
• Improve people scraper
first name, family name, affiliation
• Ingest other content
deliverables, plans etc.
• Shared ontology amongst Dutch VIVO installations
• Shared identifiers for researchers in NL (and VIVO)
ORCID, ResearcherID, Digital Author ID
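Shared identifiers are only useful if they are stored correctly. As a small illustration, an ORCID iD carries a check character computed with the ISO 7064 MOD 11-2 algorithm, so an ingest step can reject mistyped iDs before they reach a VIVO installation. The function names below are chosen for this sketch; ResearcherID and Digital Author ID use other formats and are not covered.

```python
import re


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits of an ORCID iD."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the shape and check character of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

For example, `is_valid_orcid("0000-0002-1825-0097")` is true, while changing the last character to `8` makes the check fail. A valid checksum only shows that the iD is well formed, not that it belongs to the intended researcher.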
26. Event
• Yearly event for all COMMIT people
• Tap into registration process to get detailed info
• Wireless sensor networks to capture “synergy”
• Prizes and whatnot...
27. VIVO Pitfalls
• Very “institutional” perspective
• How to actively engage individual researchers?
Reward mechanisms, integrate with Web 2.0 practices...
http://oreilly.com/web2/archive/what-is-web-20.html (2005)
28. Web 2.0
• Web applications generate your data
• Rich user experience
• You control your own data
• Immediate reward
• Quality increases by usage
46. • Lightweight Web Application
• Interface to API of existing data repositories
• Enrich metadata by linking to Linked Data resources
• Provide annotation services for data files
• Plugin based architecture
• Publish RDF metadata as new data publication
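The plugin-based design listed above could look roughly like the sketch below: each plugin takes a metadata record from a repository and returns links as RDF triples, and the combined result is serialised for publication. This does not reproduce the actual Linkitup code. The registry, the plugin names, the record layout and the `example.org` URIs are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
Plugin = Callable[[dict], List[Triple]]

DCT_SUBJECT = "http://purl.org/dc/terms/subject"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

# Registry of enrichment plugins, keyed by name (hypothetical design).
PLUGINS: Dict[str, Plugin] = {}


def plugin(name):
    """Decorator that registers an enrichment function under a name."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("tags")
def link_tags(record):
    """Map free-text tags to concept URIs via a tiny lookup table (placeholder URIs)."""
    lookup = {"provenance": "http://example.org/concept/provenance"}
    return [(record["uri"], DCT_SUBJECT, lookup[tag.lower()])
            for tag in record.get("tags", []) if tag.lower() in lookup]


@plugin("authors")
def link_authors(record):
    """Link the record to its authors' ORCID URIs where an iD is present."""
    return [(record["uri"], DCT_CREATOR, f"https://orcid.org/{author['orcid']}")
            for author in record.get("authors", []) if author.get("orcid")]


def enrich(record, enabled=None):
    """Run the selected plugins (all of them, in name order, by default) and collect their triples."""
    names = enabled if enabled is not None else sorted(PLUGINS)
    triples = []
    for name in names:
        triples.extend(PLUGINS[name](record))
    return triples


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

New link sources can be added by registering another function with `@plugin(...)`, without touching `enrich`. A real plugin would query an external service instead of a fixed lookup table, and the resulting N-Triples text would be uploaded back to the repository as a new publication.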
49. Where to publish the RDF?
Send me more!
http://linkitup.data2semantics.org
52. Future Work
• Improve people scraper
first name, family name, affiliation
• Ingest other content
deliverables, plans etc.
• Shared ontology amongst Dutch VIVO installations
• Shared identifiers for researchers in NL
ORCID, ResearcherID, Digital Author ID
• ... reward mechanisms for individual authors!
Next week COMMIT/ Data
Early March COMMIT/ VIVO
Early April COMMIT/ Days
http://www.data2semantics.org