A Decentralized Network for
Publishing Linked Data
—
Nanopublications, Trusty URIs, and Science Bots
Tobias Kuhn
http://www.tkuhn.ch
@txkuhn
ETH Zurich
CERN Workshop on Innovations in Scholarly Communication
(OAI9)
Geneva
17 June 2015
Increasing Scientific Output:
>1.5M New Articles Per Year
Citation network of 30M scientific publications
Image from: Kuhn et al. Inheritance patterns in citation networks reveal scientific memes. Physical Review X 4. 2014.
Tobias Kuhn, ETH Zurich A Decentralized Network for Publishing Linked Data 2 / 30
Increasing Importance of Scientific Data
London Underground staff sorting 4M used tickets to analyse line use in 1939
Image from: http://www.telegraph.co.uk/travel/picturegalleries/9791007/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground.html?frame=2447159
Problem:
Replication and Re-Use of Research Results
Example situation: Sue publishes a script that should allow
anyone to replicate her scientific analysis:
# Download data:
wget http://some-third-party.org/dataset/1.4
# Analyze data:
...
Problems:
• What if the resource becomes unavailable at this location?
• What if the third party silently changes that version of the
dataset?
• What if the web site gets hacked and the data manipulated?
Data Publishing, Archiving, and Re-Use
Scientific datasets are becoming increasingly important, and these data are
increasingly produced and consumed directly by software.
Published data should therefore be:
• Verifiable (Is this really the data I am looking for?)
• Immutable (Can I be sure that it hasn’t been modified?)
• Permanent (Will it be available in 1, 5, 20 years from now?)
• Reliable (Can it be efficiently retrieved whenever needed?)
• Granular (Can I refer to individual data entries?)
• Semantic (Can it be automatically interpreted?)
• Linked (Does it use established identifiers and ontologies?)
• Trustworthy (Can I trust the source?)
Nanopublications:
Provenance-Aware Semantic Publishing
(based on RDF)
[Figure: a nanopublication consisting of three parts: assertion, provenance, and publication info.]
http://nanopub.org / @nanopub_org
Vision: Changing Scholarly Communication
Now: Narrative articles at the center
Future: Nanopublications at the center
Images from Mons et al. The value of data. Nature genetics, 43(4):281–283, 2011
Nanopublication Example
sub:assertion {
  sub:_3 a rdf:Statement ; rdf:subject schem:Adenosine%20triphosphate ;
    rdf:predicate belv:decreases ; rdf:object sub:_1 ;
    occursIn: obo:UBERON_0001134 , species:9606 .
  sub:_1 a go:0003824 ; hasAgent: sub:_2 .
  sub:_2 a Protein: ; geneProductOf: hgnc:12517 .
}
sub:provenance {
  sub:assertion prov:hadPrimarySource pubmed:9703368 ;
    prov:wasDerivedFrom beldoc: , sub:_4 .
  beldoc: dce:description "Approximately 61,000 statements." ;
    dce:rights "Copyright (c) 2011-2012, Selventa. All rights reserved." ;
    dce:title "BEL Framework Large Corpus Document" ;
    pav:authoredBy sub:_5 ; pav:version "20131211" .
  sub:_4 prov:value "UCP1 contains six potential transmembrane a-helices (72) and
    prov:wasQuotedFrom pubmed:9703368 .
  sub:_5 rdfs:label "Selventa" .
}
sub:pubinfo {
  this: dct:created "2014-07-03T14:34:13.226+02:00"^^xsd:dateTime ;
    pav:createdBy orcid:0000-0001-6818-334X , orcid:0000-0002-1267-0234 .
}
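The three-part structure of the example can be sketched as a small data structure. This is a minimal illustration only: the class name `Nanopublication`, its fields, and the `is_well_formed` check are assumptions made for this sketch and are much simpler than the actual nanopublication guidelines. The triples are copied from the example above as plain strings.

```python
from dataclasses import dataclass

# A triple is (subject, predicate, object), all kept as plain strings here.
Triple = tuple[str, str, str]


@dataclass(frozen=True)
class Nanopublication:
    """Hypothetical container mirroring the three named graphs of the example."""
    uri: str                 # identifier of the nanopublication itself ("this:")
    assertion_id: str        # identifier of the assertion graph
    assertion: frozenset[Triple]
    provenance: frozenset[Triple]
    pubinfo: frozenset[Triple]

    def is_well_formed(self) -> bool:
        # Simplified check (an assumption of this sketch): the assertion is
        # non-empty, the provenance talks about the assertion, and the
        # publication info talks about the nanopublication as a whole.
        return (
            bool(self.assertion)
            and any(s == self.assertion_id for s, _, _ in self.provenance)
            and any(s == self.uri for s, _, _ in self.pubinfo)
        )


example = Nanopublication(
    uri="this:",
    assertion_id="sub:assertion",
    assertion=frozenset({("sub:_1", "rdf:type", "go:0003824")}),
    provenance=frozenset(
        {("sub:assertion", "prov:hadPrimarySource", "pubmed:9703368")}
    ),
    pubinfo=frozenset(
        {("this:", "dct:created", "2014-07-03T14:34:13.226+02:00")}
    ),
)
print(example.is_well_formed())
```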
Identifiability Problem of URIs (Web Links)
http://some-third-party.org/dataset/1.4
?
Given a URI for a digital artifact, there is no reliable standard
procedure of checking whether a retrieved file really represents the
correct and original state of that artifact.
Solution: Identifiers that include (iterative) cryptographic hash
values (as applied, for example, by Git and Bitcoin)
Cryptographic Hash Values
A cryptographic hash value is a short random-looking sequence of
bytes calculated on a given input:
This is some text. ⇒ hRUv0M
The same input always leads to exactly the same value:
This is some text. ⇒ hRUv0M
Even a minimally modified input leads to a completely different value:
This is xome text. ⇒ sCtYbf
The input is not reconstructible from the hash value (in practice):
This is some text. ⇍ hRUv0M
Given an input and a matching hash value, we can therefore be
practically certain that it was exactly that input that led to the hash.
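The properties listed on this slide can be tried out with a few lines of Python. This is a sketch under assumptions: it uses SHA-256 and keeps only the first six Base64 characters so that the values are as short as those on the slide. The helper name `short_hash` is hypothetical, and the values it prints will not match the illustrative ones shown above (such as hRUv0M).

```python
import base64
import hashlib


def short_hash(text: str, length: int = 6) -> str:
    """Return the first `length` characters of the URL-safe Base64 SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


original = short_hash("This is some text.")
repeated = short_hash("This is some text.")
modified = short_hash("This is xome text.")

print(original, repeated, modified)
print("same input, same value:", original == repeated)
print("modified input, different value:", original != modified)
```

A real identifier would keep the full digest; truncating to six characters is done here only for readability and makes collisions far more likely than with the full value.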
Iterative Hashing
Hash values can be used as identifiers in an iterative fashion:
This is some text. ⇒ hRUv0M
This text is based on hRUv0M. ⇒ LwGqwX
This depends on LwGqwX. ⇒ civRbq
From a single identifier (such as civRbq), the entire reference tree
can be verified:
This is some text. ⇒ hRUv0M
This text is based on hRUv0M. ⇒ LwGqwX
This depends on LwGqwX. ⇒ civRbq
And any modification can be noticed:
This is xome text. ⇏ hRUv0M
This text is based on hRUv0M. ⇒ LwGqwX
This depends on LwGqwX. ⇒ civRbq
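The chain above can be sketched in Python as follows. The sketch makes several assumptions: references are written as `ref:` followed by a six-character truncated SHA-256 value, the `store` dictionary stands in for whatever untrusted place the texts are retrieved from, and the names `publish` and `verify_tree` are hypothetical.

```python
import base64
import hashlib
import re

# A reference inside a text looks like "ref:" plus a six-character hash value.
REFERENCE = re.compile(r"ref:([A-Za-z0-9_-]{6})")


def short_hash(text: str) -> str:
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")[:6]


def publish(store: dict[str, str], text: str) -> str:
    """Store the text under its own hash value and return that identifier."""
    identifier = short_hash(text)
    store[identifier] = text
    return identifier


def verify_tree(identifier: str, store: dict[str, str]) -> bool:
    """Check the text behind `identifier` and, recursively, everything it references."""
    text = store.get(identifier)
    if text is None or short_hash(text) != identifier:
        return False
    return all(verify_tree(ref, store) for ref in REFERENCE.findall(text))


store: dict[str, str] = {}
first = publish(store, "This is some text.")
second = publish(store, f"This text is based on ref:{first}.")
third = publish(store, f"This depends on ref:{second}.")

print(verify_tree(third, store))   # True: the whole chain matches

store[first] = "This is xome text."  # silent modification at the bottom of the chain
print(verify_tree(third, store))   # False: the modification is noticed
```

Only the last identifier has to be known in advance; the identifiers of the earlier texts are recovered from the verified content itself.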
Trusty URIs: Cryptographic Hash Values for
Verifiable and Immutable Web Identifiers
Basic idea: Use of cryptographic hash values together with URIs as
identifiers for digital artifacts such as nanopublications.
Requirements:
• To allow for the verification of entire reference trees, the hash
should be part of the reference (i.e. the URI)
• To allow for meta-data, digital artifacts should be allowed to
contain self-references (i.e. their own URI)
• Format-independent hash for different kinds of content (e.g.
RDF)
• The complete approach should be decentralized and open
• We want to use them right away
http://example.org/r1.RA5AbXdpz5DcaYXCh9l3eI9ruBosiL5XDU3rxBbBaUO70.trig
Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
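The basic idea of putting the hash into the URI can be sketched as below. This is a simplified illustration and not an implementation of the trusty URI specification: it hashes raw bytes with SHA-256, uses a placeholder two-character module identifier `XX` (an assumption), and handles neither the format-independent hashing of RDF content nor self-references. The function names are hypothetical.

```python
import base64
import hashlib
import re

MODULE_ID = "XX"  # placeholder module identifier, not one defined by the specification

# Two uppercase letters plus 43 Base64 characters, optionally followed by a file extension.
ARTIFACT_CODE = re.compile(r"[./]([A-Z]{2}[A-Za-z0-9_-]{43})(\.[A-Za-z0-9]+)?$")


def artifact_code(content: bytes) -> str:
    """Module identifier followed by the unpadded URL-safe Base64 SHA-256 digest."""
    digest = hashlib.sha256(content).digest()
    return MODULE_ID + base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def make_uri(base_uri: str, content: bytes, extension: str = "") -> str:
    """Append the artifact code (and an optional extension) to a base URI."""
    return f"{base_uri}.{artifact_code(content)}{extension}"


def verify(uri: str, content: bytes) -> bool:
    """Check that `content` is what the hash inside `uri` refers to."""
    match = ARTIFACT_CODE.search(uri)
    return match is not None and match.group(1) == artifact_code(content)


uri = make_uri("http://example.org/r1", b"some content", ".txt")
print(uri)
print(verify(uri, b"some content"))
print(verify(uri, b"some c0ntent"))
```

Because the check needs nothing but the URI and the retrieved bytes, it works the same way no matter which server or cache the bytes came from.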
Verifiable — Immutable — Permanent
Whether or not a given resource is the one a given trusty URI is
supposed to represent can be verified with perfect confidence.
(assuming that the trusty URI for the required artifact is known, e.g. because
another artifact contains it as a link)
http://trustyuri.net
Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
Extended Range of Verifiability Through
Iterative Hashing
[Figure: artifacts that reference each other by trusty URIs (e.g. http://...RAcbjcRI..., http://...RAQozo2w..., http://...RABMq4Wc..., http://...RAUx3Pqu..., http://...RARz0AX-...) and by ordinary URIs (http://.../resource23, http://.../resource55), with the range of verifiability marked.]
http://trustyuri.net
Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
Verifiable — Immutable — Permanent
Trusty URI artifacts are immutable, as any change in the content
also changes its URI, thereby making it a new artifact.
(as soon as your trusty URI has been picked up by third parties, e.g. cached or
linked from other resources, every change will be noticed)
http://trustyuri.net
Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
Verifiable — Immutable — Permanent
Trusty URI artifacts are permanent, as they can be retrieved from
the cache of third-party websites if otherwise no longer available.
(if there are services regularly crawling and caching the artifacts on the web)
http://trustyuri.net
Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
A Multi-Layer Architecture for
Reliable Scientific Data Publishing?
• User Interfaces
• Services: Finding, querying, filtering, and aggregating data
• Science Bots: Automated Tasks
• Decentralized Data Publishing Network: hypotheses, facts, measurements,
opinions, meta-data, assessments, annotations, ...
Decentralized and Reliable Publishing with a
Nanopublication Server Network
[Figure: nanopublications with trusty URIs and a network of servers, with arrows labeled Publication, Retrieval, and Propagation / Archiving.]
http://npmonitor.inn.ac
Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of
Data. arXiv:1411.2749.
Decentralized — Open — Real-time
• No central authority: Everybody can set up a server and join
the network
• No restrictions on publication: Everybody can upload
nanopublications
• No delay between submission and publication: Nanopublications
are made public immediately
• No updates: If a nanopublication is modified, that makes it a
new nanopublication (enforced by trusty URIs)
• No queries: Only simple identifier-based lookup
Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of
Data. arXiv:1411.2749.
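The properties above (no updates, identifier-based lookup only, propagation between servers) can be sketched with an in-memory toy model. Everything here is an assumption made for illustration: the class `ToyServer`, the functions `propagate` and `fetch`, and the use of a plain SHA-256 value as identifier. It does not reflect the actual server software or its protocol.

```python
import base64
import hashlib
from typing import Iterable, Optional


def content_id(content: bytes) -> str:
    """Identifier derived from the content itself (unpadded URL-safe Base64 SHA-256)."""
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


class ToyServer:
    """Stores content under its hash; offers lookup by identifier and nothing else."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def publish(self, content: bytes) -> str:
        identifier = content_id(content)
        # No updates: modified content gets a different identifier.
        self._store.setdefault(identifier, content)
        return identifier

    def get(self, identifier: str) -> Optional[bytes]:
        return self._store.get(identifier)

    def identifiers(self) -> list[str]:
        return list(self._store)


def propagate(source: ToyServer, target: ToyServer) -> None:
    """Copy everything from source to target, accepting only content that verifies."""
    for identifier in source.identifiers():
        content = source.get(identifier)
        if content is not None and content_id(content) == identifier:
            target.publish(content)


def fetch(identifier: str, servers: Iterable[ToyServer]) -> Optional[bytes]:
    """Ask the servers in turn and return the first copy that matches the identifier."""
    for server in servers:
        content = server.get(identifier)
        if content is not None and content_id(content) == identifier:
            return content
    return None


first, second = ToyServer(), ToyServer()
np_id = first.publish(b"example nanopublication content")
propagate(first, second)

# Simulate a manipulated copy on one server; the client falls back to the other.
second._store[np_id] = b"manipulated content"
print(fetch(np_id, [second, first]))
```

Since clients verify what they receive, they do not have to trust any individual server, which is what allows anybody to join the network.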
Fast Parallel Access
[Figure: response time in seconds (logarithmic axis) plotted against time from start of test in seconds and against number of clients accessing the service in parallel, comparing a Virtuoso triple store with SPARQL endpoint and a nanopublication server.]
Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of
Data. arXiv:1411.2749.
Defining Datasets with Nanopublication Indexes
(which are themselves Nanopublications)
[Figure: nanopublication indexes (a) to (f), connected by the relations "appends", "has sub-index", and "has element".]
Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of
Data. arXiv:1411.2749.
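The three relations named on this slide suggest how a dataset can be resolved from a single index identifier. The following sketch is an assumption-laden illustration: the `Index` class, the `resolve` function, and the short identifiers such as `idx-a` and `np1` are hypothetical, and in the network the identifiers would be trusty URIs of nanopublications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Index:
    """Hypothetical index: direct elements, sub-indexes, and an index it appends to."""
    elements: tuple[str, ...] = ()
    sub_indexes: tuple[str, ...] = ()
    appends: Optional[str] = None


def resolve(index_id: str, indexes: dict[str, Index]) -> set[str]:
    """Collect all elements reachable from an index via the three relations."""
    result: set[str] = set()
    visited: set[str] = set()
    pending = [index_id]
    while pending:
        current = pending.pop()
        if current in visited:
            continue
        visited.add(current)
        index = indexes[current]
        result.update(index.elements)
        pending.extend(index.sub_indexes)
        if index.appends is not None:
            pending.append(index.appends)
    return result


indexes = {
    "idx-a": Index(elements=("np1", "np2")),
    "idx-b": Index(elements=("np3",), appends="idx-a"),
    "idx-c": Index(elements=("np4",), sub_indexes=("idx-b",)),
}
print(sorted(resolve("idx-c", indexes)))
```

With hash-based identifiers, resolving the same top-level index always yields the same set of elements, which is what makes a dataset defined this way citable.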
Using Nanopublication Datasets
Once published in the network, nanopublication indexes can be cited:
[7] Nanopubs converted from OpenBEL’s Small and Large Corpus 20131211.
Nanopublication index, 4 March 2014,
http://np.inn.ac/RAR5dwELYLKGSfrOclnWhjOj-2nGZN_8BW1JjxwFZINHw
Researchers can then fetch and reuse the data in a reliable and
perfectly reproducible manner:
# Download data:
np get -c RAR5dwELYLKGSfrOclnWhjOj-2nGZN_8BW1JjxwFZINHw
# Analyze data:
...
Existing data can be recombined in new indexes; and researchers can
unambiguously refer to the used datasets for new results:
this: prov:wasDerivedFrom nps:RAR5dwELYLKGSfrOclnWhjOj-2nGZN_8BW1Jjx
Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of
Data. arXiv:1411.2749.
Could these techniques and infrastructures allow us to
make a step forward in terms of automation in science?
SCIENCE BOTS
Science Bots —
Scientists’ Little Helpers in the Future?
“Science bots” that autonomously publish results in their own name
could cover a wide variety of applications, for example:
[Figure: three example bots. A text mining bot takes PubMed abstracts and produces nanopublications; a sensor bot takes sensor data and produces nanopublications; an inference bot takes nanopublications and produces new nanopublications.]
Kuhn. Science Bots: A Model for the Future of Scientific Computation? SAVE-SD, WWW 2015 Companion
Proceedings.
Quality Control with Reputation
Mechanisms and Network Metrics?
Robust automatic calculation of reputation metrics in a decentralized
and open system:
[Figure: example network with edges labeled "is contributed by" and "gives positive assessment for"; each node is annotated with its eigenvector centrality on a scale from 0 to 100.]
Kuhn. Science Bots: A Model for the Future of Scientific Computation? SAVE-SD, WWW 2015 Companion
Proceedings.
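Eigenvector centrality on such a network can be computed with power iteration. The sketch below is an illustration under assumptions: the small graph, the node names, and the reading of an edge as "source supports target" are invented for this example and are not the network from the slide. The iteration adds each node's previous score to the incoming scores, a standard shift that leaves the dominant eigenvector unchanged and avoids oscillation.

```python
def eigenvector_centrality(nodes, edges, iterations=200):
    """Return scores scaled to 0-100; an edge (source, target) passes score to target."""
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Start from the previous scores (the shift), then add incoming support.
        updated = dict(score)
        for source, target in edges:
            updated[target] += score[source]
        top = max(updated.values())
        score = {node: value / top for node, value in updated.items()}
    return {node: round(100 * value, 1) for node, value in score.items()}


nodes = ["A", "B", "C", "D"]
edges = [
    ("A", "B"),
    ("B", "C"),
    ("A", "C"),
    ("C", "A"),
    ("D", "A"),  # D supports A but receives no support itself
]
scores = eigenvector_centrality(nodes, edges)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, scores[node])
```

A node that nobody supports ends up with a score of zero however many edges it creates itself, which is the property that makes this kind of metric hard to game in an open system.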
Thank you for your attention!
Questions?
Further information:
• Trusty URIs: http://trustyuri.net
• Nanopublications: http://nanopub.org
• Nanopublication Server Network:
http://arxiv.org/abs/1411.2749
• Science Bots: http://arxiv.org/abs/1503.04374
Tobias Kuhn, ETH Zurich A Decentralized Network for Publishing Linked Data 30 / 30

Weitere ähnliche Inhalte

Andere mochten auch

Andere mochten auch (9)

8.Real Poem1
8.Real Poem18.Real Poem1
8.Real Poem1
 
Pajaros picoteando
Pajaros picoteandoPajaros picoteando
Pajaros picoteando
 
Manager in Medical Coding specialization
Manager in Medical Coding specializationManager in Medical Coding specialization
Manager in Medical Coding specialization
 
Halong Paradise Suites Hotel Brochure
Halong Paradise Suites Hotel BrochureHalong Paradise Suites Hotel Brochure
Halong Paradise Suites Hotel Brochure
 
Zaragoza turismo 226
Zaragoza turismo 226Zaragoza turismo 226
Zaragoza turismo 226
 
Resemblance
ResemblanceResemblance
Resemblance
 
Mongodb With Mongoid
Mongodb With MongoidMongodb With Mongoid
Mongodb With Mongoid
 
Mobile app development process
Mobile app development processMobile app development process
Mobile app development process
 
L 11(gdr)(et) ((ee)nptel)
L 11(gdr)(et) ((ee)nptel)L 11(gdr)(et) ((ee)nptel)
L 11(gdr)(et) ((ee)nptel)
 

Ähnlich wie A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty URIs, and Science Bots

Scientific Data Publishing
Scientific Data PublishingScientific Data Publishing
Scientific Data PublishingTobias Kuhn
 
Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingTobias Kuhn
 
Semantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsSemantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsTobias Kuhn
 
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Tobias Kuhn
 
A Decentralized Approach to Dissemination, Retrieval, and Archiving of Data
A Decentralized Approach to Dissemination, Retrieval, and Archiving of DataA Decentralized Approach to Dissemination, Retrieval, and Archiving of Data
A Decentralized Approach to Dissemination, Retrieval, and Archiving of DataTobias Kuhn
 
Data Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsData Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsTobias Kuhn
 
Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Tobias Kuhn
 
Linked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsLinked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsTobias Kuhn
 
Meme Extraction from Corpora of Scientific Literature using Citation Networks
Meme Extraction from Corpora of Scientific Literature using Citation NetworksMeme Extraction from Corpora of Scientific Literature using Citation Networks
Meme Extraction from Corpora of Scientific Literature using Citation NetworksTobias Kuhn
 
nanopub-java: A Java Library for Nanopublications
nanopub-java: A Java Library for Nanopublicationsnanopub-java: A Java Library for Nanopublications
nanopub-java: A Java Library for NanopublicationsTobias Kuhn
 
HathiTrust Research Center Secure Commons
HathiTrust Research Center Secure CommonsHathiTrust Research Center Secure Commons
HathiTrust Research Center Secure CommonsBeth Plale
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishingTobias Kuhn
 
Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Beth Plale
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference LinkingHerbert Van de Sompel
 
20191119_The OpenAIRE Research Graph
20191119_The OpenAIRE Research Graph 20191119_The OpenAIRE Research Graph
20191119_The OpenAIRE Research Graph OpenAIRE
 

Ähnlich wie A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty URIs, and Science Bots (20)

Scientific Data Publishing
Scientific Data PublishingScientific Data Publishing
Scientific Data Publishing
 
Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized Publishing
 
Semantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsSemantic Publishing and Nanopublications
Semantic Publishing and Nanopublications
 
Nanopubs
NanopubsNanopubs
Nanopubs
 
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
 
A Decentralized Approach to Dissemination, Retrieval, and Archiving of Data
A Decentralized Approach to Dissemination, Retrieval, and Archiving of DataA Decentralized Approach to Dissemination, Retrieval, and Archiving of Data
A Decentralized Approach to Dissemination, Retrieval, and Archiving of Data
 
Data Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsData Publishing and Post-Publication Reviews
Data Publishing and Post-Publication Reviews
 
Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?
 
Linked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsLinked Data Publishing with Nanopublications
Linked Data Publishing with Nanopublications
 
Meme Extraction from Corpora of Scientific Literature using Citation Networks
Meme Extraction from Corpora of Scientific Literature using Citation NetworksMeme Extraction from Corpora of Scientific Literature using Citation Networks
Meme Extraction from Corpora of Scientific Literature using Citation Networks
 
nanopub-java: A Java Library for Nanopublications
nanopub-java: A Java Library for Nanopublicationsnanopub-java: A Java Library for Nanopublications
nanopub-java: A Java Library for Nanopublications
 
HathiTrust Research Center Secure Commons
HathiTrust Research Center Secure CommonsHathiTrust Research Center Secure Commons
HathiTrust Research Center Secure Commons
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishing
 
Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014
 
Ld4 dh tutorial
Ld4 dh tutorialLd4 dh tutorial
Ld4 dh tutorial
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference Linking
 
Open data and linked data
Open data and linked dataOpen data and linked data
Open data and linked data
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Keynote csws2013
Keynote csws2013Keynote csws2013
Keynote csws2013
 
20191119_The OpenAIRE Research Graph
20191119_The OpenAIRE Research Graph 20191119_The OpenAIRE Research Graph
20191119_The OpenAIRE Research Graph
 

Mehr von Tobias Kuhn

The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer Tobias Kuhn
 
A Multilingual Semantic Wiki Based on Controlled Natural Language
A Multilingual Semantic Wiki Based on Controlled Natural LanguageA Multilingual Semantic Wiki Based on Controlled Natural Language
A Multilingual Semantic Wiki Based on Controlled Natural LanguageTobias Kuhn
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureTobias Kuhn
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureTobias Kuhn
 
Automatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiAutomatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiTobias Kuhn
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...Tobias Kuhn
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...Tobias Kuhn
 
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Tobias Kuhn
 
AceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageAceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageTobias Kuhn
 
AceWiki: A Natural and Expressive Semantic Wiki
AceWiki: A Natural and Expressive Semantic WikiAceWiki: A Natural and Expressive Semantic Wiki
AceWiki: A Natural and Expressive Semantic WikiTobias Kuhn
 
AceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiAceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiTobias Kuhn
 
How Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisHow Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisTobias Kuhn
 
How to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesHow to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesTobias Kuhn
 
Wissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischWissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischTobias Kuhn
 
An Introduction to AceWiki
An Introduction to AceWikiAn Introduction to AceWiki
An Introduction to AceWikiTobias Kuhn
 
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsCodeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsTobias Kuhn
 

Mehr von Tobias Kuhn (16)

The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer
 
A Multilingual Semantic Wiki Based on Controlled Natural Language
A Multilingual Semantic Wiki Based on Controlled Natural LanguageA Multilingual Semantic Wiki Based on Controlled Natural Language
A Multilingual Semantic Wiki Based on Controlled Natural Language
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific Literature
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific Literature
 
Automatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiAutomatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen Wiki
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
 
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
 
AceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageAceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural Language
 
AceWiki: A Natural and Expressive Semantic Wiki
AceWiki: A Natural and Expressive Semantic WikiAceWiki: A Natural and Expressive Semantic Wiki
AceWiki: A Natural and Expressive Semantic Wiki
 
AceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiAceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic Wiki
 
How Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisHow Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic Wikis
 
How to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesHow to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural Languages
 
Wissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischWissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem Englisch
 
An Introduction to AceWiki
An Introduction to AceWikiAn Introduction to AceWiki
An Introduction to AceWiki
 
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsCodeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
 

A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty URIs, and Science Bots

  • 1. A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty URIs, and Science Bots Tobias Kuhn http://www.tkuhn.ch @txkuhn ETH Zurich CERN Workshop on Innovations in Scholarly Communication (OAI9) Geneva 17 June 2015
  • 2. Increasing Scientific Output: >1.5M New Articles Per Year. Figure: citation network of 30M scientific publications. Image from: Kuhn et al. Inheritance patterns in citation networks reveal scientific memes. Physical Review X 4. 2014.
  • 3. Increasing Importance of Scientific Data. Figure: London Underground staff sorting 4M used tickets to analyse line use in 1939. Image from: http://www.telegraph.co.uk/travel/picturegalleries/9791007/The-history-of-the-Tube-in-pictures-150-years-of-London-Underground.html?frame=2447159
  • 4. Problem: Replication and Re-Use of Research Results. Example situation: Sue publishes a script that should allow everybody to replicate her scientific analysis:
        # Download data:
        wget http://some-third-party.org/dataset/1.4
        # Analyze data:
        ...
      Problems:
      • What if the resource becomes unavailable at this location?
      • What if the third party silently changes that version of the dataset?
      • What if the web site gets hacked and the data manipulated?
  • 5. Data Publishing, Archiving, and Re-Use. Scientific datasets are becoming increasingly important, and these data are increasingly produced and consumed directly by software. Published data should therefore be:
      • Verifiable (Is this really the data I am looking for?)
      • Immutable (Can I be sure that it hasn’t been modified?)
      • Permanent (Will it be available in 1, 5, 20 years from now?)
      • Reliable (Can it be efficiently retrieved whenever needed?)
      • Granular (Can I refer to individual data entries?)
      • Semantic (Can it be automatically interpreted?)
      • Linked (Does it use established identifiers and ontologies?)
      • Trustworthy (Can I trust the source?)
  • 6. Nanopublications: Provenance-Aware Semantic Publishing (based on RDF). Figure: a nanopublication made up of an assertion, its provenance, and publication info. http://nanopub.org / @nanopub_org
  • 7. Vision: Changing Scholarly Communication. Now: narrative articles at the center. Future: nanopublications at the center. Images from Mons et al. The value of data. Nature Genetics, 43(4):281–283, 2011.
  • 8. Nanopublication Example
        sub:assertion {
          sub:_3 a rdf:Statement ;
            rdf:subject schem:Adenosine%20triphosphate ;
            rdf:predicate belv:decreases ;
            rdf:object sub:_1 ;
            occursIn: obo:UBERON_0001134 , species:9606 .
          sub:_1 a go:0003824 ;
            hasAgent: sub:_2 .
          sub:_2 a Protein: ;
            geneProductOf: hgnc:12517 .
        }
        sub:provenance {
          sub:assertion prov:hadPrimarySource pubmed:9703368 ;
            prov:wasDerivedFrom beldoc: , sub:_4 .
          beldoc: dce:description "Approximately 61,000 statements." ;
            dce:rights "Copyright (c) 2011-2012, Selventa. All rights reserved." ;
            dce:title "BEL Framework Large Corpus Document" ;
            pav:authoredBy sub:_5 ;
            pav:version "20131211" .
          sub:_4 prov:value "UCP1 contains six potential transmembrane a-helices (72) and prov:wasQuotedFrom pubmed:9703368 .
          sub:_5 rdfs:label "Selventa" .
        }
        sub:pubinfo {
          this: dct:created "2014-07-03T14:34:13.226+02:00"^^xsd:dateTime ;
            pav:createdBy orcid:0000-0001-6818-334X , orcid:0000-0002-1267-0234 .
        }
  • 9. Identifiability Problem of URIs (Web Links). Given a URI for a digital artifact, such as http://some-third-party.org/dataset/1.4, there is no reliable standard procedure for checking whether a retrieved file really represents the correct and original state of that artifact. Solution: identifiers that include (iterative) cryptographic hash values (as applied, for example, by Git and Bitcoin).
  • 10. Cryptographic Hash Values. A cryptographic hash value is a short random-looking sequence of bytes calculated on a given input:
        This is some text. ⇒ hRUv0M
      The same input always leads to exactly the same value:
        This is some text. ⇒ hRUv0M
      Even a minimally modified input leads to a completely different value:
        This is xome text. ⇒ sCtYbf
      The input is not reconstructible from the hash value (in practice):
        This is some text. ⇍ hRUv0M
      Given an input and a matching hash value, we can therefore be practically certain that it was exactly that input that led to the hash.
  • 11. Iterative Hashing. Hash values can be used as identifiers in an iterative fashion:
        This is some text. ⇒ hRUv0M
        This text is based on hRUv0M. ⇒ LwGqwX
        This depends on LwGqwX. ⇒ civRbq
      From a single identifier (such as civRbq), the entire reference tree can be verified by recomputing each hash in the chain. Any modification is noticed: if the first text is changed to "This is xome text.", it no longer hashes to hRUv0M, so the chain behind LwGqwX and civRbq fails verification.
  • 12. Trusty URIs: Cryptographic Hash Values for Verifiable and Immutable Web Identifiers. Basic idea: use cryptographic hash values together with URIs as identifiers for digital artifacts such as nanopublications. Requirements:
      • To allow for the verification of entire reference trees, the hash should be part of the reference (i.e. the URI)
      • To allow for meta-data, digital artifacts should be allowed to contain self-references (i.e. their own URI)
      • Format-independent hash for different kinds of content (e.g. RDF)
      • The complete approach should be decentralized and open
      • We want to use them right away
      Example: http://example.org/r1.RA5AbXdpz5DcaYXCh9l3eI9ruBosiL5XDU3rxBbBaUO70.trig
      Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
  • 13. Verifiable, Immutable, Permanent. Verifiable: whether or not a given resource is the one a given trusty URI is supposed to represent can be verified with perfect confidence (assuming that the trusty URI for the required artifact is known, e.g. because another artifact contains it as a link). http://trustyuri.net Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
  • 14. Extended Range of Verifiability Through Iterative Hashing. Figure: a graph of resources that link to each other through trusty URIs (e.g. http://...RAcbjcRI..., http://...RAQozo2w..., http://...RABMq4Wc..., http://...RAUx3Pqu..., http://...RARz0AX-...) and through ordinary URIs (e.g. http://.../resource23, http://.../resource55), with the range of verifiability marked. http://trustyuri.net Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
  • 15. Verifiable, Immutable, Permanent. Immutable: trusty URI artifacts are immutable, as any change in the content also changes the URI, thereby making it a new artifact (as soon as your trusty URI has been picked up by third parties, e.g. cached or linked from other resources, every change will be noticed). http://trustyuri.net Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
  • 16. Verifiable, Immutable, Permanent. Permanent: trusty URI artifacts are permanent, as they can be retrieved from the cache of third-party websites if they are no longer available at their original location (provided there are services regularly crawling and caching the artifacts on the web). http://trustyuri.net Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
  • 17. A Multi-Layer Architecture for Reliable Scientific Data Publishing? Components shown: User Interfaces; Services (finding, querying, filtering, and aggregating data).
  • 18. A Multi-Layer Architecture for Reliable Scientific Data Publishing? Components shown: User Interfaces; Services (finding, querying, filtering, and aggregating data); Decentralized Data Publishing Network (hypotheses, facts, measurements, opinions, meta-data, assessments, annotations, ...).
  • 19. A Multi-Layer Architecture for Reliable Scientific Data Publishing? Components shown: User Interfaces; Services (finding, querying, filtering, and aggregating data); Science Bots (automated tasks); Decentralized Data Publishing Network (hypotheses, facts, measurements, opinions, meta-data, assessments, annotations, ...).
  • 20. Decentralized and Reliable Publishing with a Nanopublication Server Network. Figure labels: Nanopublications with Trusty URIs; Publication; Retrieval; Propagation / Archiving. http://npmonitor.inn.ac Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data. arXiv:1411.2749.
  • 21. Decentralized, Open, Real-time
      • No central authority: Everybody can set up a server and join the network
      • No restrictions on publication: Everybody can upload nanopublications
      • No delay between submission and publication: Nanopublications are made public immediately
      • No updates: If a nanopublication is modified, that makes it a new nanopublication (enforced by trusty URIs)
      • No queries: Only simple identifier-based lookup
      Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data. arXiv:1411.2749.
  • 22. Fast Parallel Access. Figure: response time in seconds (logarithmic scale, 0.1 to 100) over the time from start of test in seconds (0 to 300), with the number of clients accessing the service in parallel ranging from 0 to 100, comparing a Virtuoso triple store with SPARQL endpoint against a nanopublication server. Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data. arXiv:1411.2749.
  • 23. Defining Datasets with Nanopublication Indexes (which are themselves Nanopublications). Figure: panels (a) to (f) showing indexes connected by the relations "appends", "has sub-index", and "has element". Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data. arXiv:1411.2749.
  • 24. Using Nanopublication Datasets. Once published in the network, nanopublication indexes can be cited:
        [7] Nanopubs converted from OpenBEL’s Small and Large Corpus 20131211. Nanopublication index, 4 March 2014, http://np.inn.ac/RAR5dwELYLKGSfrOclnWhjOj-2nGZN_8BW1JjxwFZINHw
      Researchers can then fetch and reuse the data in a reliable and perfectly reproducible manner:
        # Download data:
        np get -c RAR5dwELYLKGSfrOclnWhjOj-2nGZN_8BW1JjxwFZINHw
        # Analyze data:
        ...
      Existing data can be recombined in new indexes, and researchers can unambiguously refer to the datasets used for new results:
        this: prov:wasDerivedFrom nps:RAR5dwELYLKGSfrOclnWhjOj-2nGZN_8BW1Jjx
      Kuhn et al. Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data. arXiv:1411.2749.
  • 25. Could these techniques and infrastructures allow us to make a step forward in terms of automation in science? SCIENCE BOTS
  • 26. Science Bots: Scientists’ Little Helpers in the Future? “Science bots” that autonomously publish results in their own name could cover a wide variety of applications, for example: a text mining bot (PubMed abstracts to nanopublications), a sensor bot (sensor data to nanopublications), and an inference bot (nanopublications to nanopublications). Kuhn. Science Bots: A Model for the Future of Scientific Computation? SAVE-SD, WWW 2015 Companion Proceedings.
  • 27. Quality Control with Reputation Mechanisms and Network Metrics? Robust automatic calculation of reputation metrics in a decentralized and open system. Figure: a network with the relation "is contributed by". Kuhn. Science Bots: A Model for the Future of Scientific Computation? SAVE-SD, WWW 2015 Companion Proceedings.
  • 28. Quality Control with Reputation Mechanisms and Network Metrics? Robust automatic calculation of reputation metrics in a decentralized and open system. Figure: a network with the relations "gives positive assessment for" and "is contributed by". Kuhn. Science Bots: A Model for the Future of Scientific Computation? SAVE-SD, WWW 2015 Companion Proceedings.
  • 29. Quality Control with Reputation Mechanisms and Network Metrics? Robust automatic calculation of reputation metrics in a decentralized and open system. Figure: a network with the relations "gives positive assessment for" and "is contributed by", annotated with eigenvector centrality values (0-100): 77, 100, 71, 1, 0, 85, 4, 0.0, 0, 0, 50, 50, 0, 0.4, 0.4, 0. Kuhn. Science Bots: A Model for the Future of Scientific Computation? SAVE-SD, WWW 2015 Companion Proceedings.
  • 30. Thank you for your attention! Questions? Further information:
      • Trusty URIs: http://trustyuri.net
      • Nanopublications: http://nanopub.org
      • Nanopublication Server Network: http://arxiv.org/abs/1411.2749
      • Science Bots: http://arxiv.org/abs/1503.04374
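
The iterative hashing idea from slides 10 and 11 can be sketched in a few lines of Python. This is a simplified illustration and not the actual trusty URI algorithm: it assumes identifiers are unpadded URL-safe Base64 SHA-256 digests of the raw text, and the `ref:` marker, the in-memory `store`, and the function names are hypothetical choices made for this example.

```python
import base64
import hashlib
import re

# Assumption: identifiers are 43-character URL-safe Base64 SHA-256 digests,
# and references are written as "ref:<identifier>" inside the content.
REF_PATTERN = re.compile(r"ref:([A-Za-z0-9_-]{43})")


def content_id(content: str) -> str:
    """Return the unpadded URL-safe Base64 SHA-256 digest of the content."""
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def publish(store: dict, content: str) -> str:
    """Store the content under its own hash and return that identifier."""
    identifier = content_id(content)
    store[identifier] = content
    return identifier


def verify_tree(store: dict, identifier: str) -> bool:
    """Check an artifact and, recursively, every artifact it references."""
    content = store.get(identifier)
    if content is None or content_id(content) != identifier:
        return False
    return all(verify_tree(store, ref) for ref in REF_PATTERN.findall(content))


store = {}
first = publish(store, "This is some text.")
second = publish(store, f"This text is based on ref:{first}.")
root = publish(store, f"This depends on ref:{second}.")
print(verify_tree(store, root))  # the untouched chain verifies

store[first] = "This is xome text."  # simulate a silent modification
print(verify_tree(store, root))  # the change is detected from the root alone
```

Only the root identifier has to be known in advance. Every other identifier is found inside content that has already been verified.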
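
Slides 16, 20, and 21 describe identifier-based lookup in a network where any server may hold a copy. A minimal sketch of why that is safe: a client can ask untrusted servers in turn and accept the first copy whose hash matches the identifier. The server names, the `fetch` callback, and the hashing of raw bytes are assumptions for illustration and do not reflect the actual nanopublication server protocol.

```python
import base64
import hashlib
from typing import Callable, Iterable, Optional


def content_id(data: bytes) -> str:
    """Return the unpadded URL-safe Base64 SHA-256 digest of the data."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def fetch_verified(
    identifier: str,
    servers: Iterable[str],
    fetch: Callable[[str, str], Optional[bytes]],
) -> Optional[bytes]:
    """Try each server in turn; return the first copy whose hash matches."""
    for server in servers:
        try:
            data = fetch(server, identifier)
        except OSError:
            continue  # server unreachable: try the next one
        if data is not None and content_id(data) == identifier:
            return data
    return None


# Hypothetical in-memory "network" standing in for real servers.
original = b"measurement: 42"
identifier = content_id(original)
network = {
    "server-a.example": None,                              # unreachable
    "server-b.example": {identifier: b"measurement: 43"},  # manipulated copy
    "server-c.example": {identifier: original},            # intact copy
}


def fake_fetch(server: str, ident: str) -> Optional[bytes]:
    contents = network[server]
    if contents is None:
        raise ConnectionError(f"{server} is down")
    return contents.get(ident)


print(fetch_verified(identifier, network, fake_fetch))
```

Because the identifier itself carries the hash, the client does not need to trust any particular server. An unavailable or manipulated copy is simply skipped.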
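
Slide 29 reports eigenvector centrality values on a 0-100 scale. The sketch below shows one common way to compute such scores, by power iteration over "gives positive assessment for" edges. The example graph, the names, and the shifted iteration (each node keeps its own score, which avoids oscillation on cycles) are assumptions for illustration, not the calculation used for the slide.

```python
def reputation_scores(assessments, iterations=200):
    """Eigenvector-centrality-style scores scaled to the range 0-100.

    assessments: list of (assessor, assessed) pairs, each meaning
    "assessor gives a positive assessment for assessed".
    """
    nodes = sorted({name for pair in assessments for name in pair})
    if not nodes:
        return {}
    scores = {name: 1.0 for name in nodes}
    for _ in range(iterations):
        # Each node keeps its score and adds the scores of its assessors
        # (power iteration on the shifted matrix A^T + I).
        updated = dict(scores)
        for assessor, assessed in assessments:
            updated[assessed] += scores[assessor]
        top = max(updated.values())
        scores = {name: value / top for name, value in updated.items()}
    return {name: 100.0 * value for name, value in scores.items()}


# Hypothetical example: three participants endorse each other in a cycle,
# while "dave" endorses alice but is endorsed by nobody.
example = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("dave", "alice"),
]
for name, score in sorted(reputation_scores(example).items()):
    print(f"{name}: {score:.1f}")
```

In this toy graph a participant nobody vouches for ends up with a score near zero, however many assessments it hands out. This is the property that makes such metrics hard to inflate from newly created accounts.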