Each interface divides a dataset
into Linked Data Fragments.
Data dumps: 1 huge fragment
SPARQL endpoints: ∞ specific fragm...
Can we find a new interface

with a sustainable balance?
Triple Pattern Fragments:

1 fragment per subject / predicate / o...
Browse a dataset by triple pattern—

no less, no more.
Machines can access

the exact same interface as RDF.
Triple Pattern Fragments extend

Linked Data documents with forms.
That’s even more similar to what we do:

consume inform...
Machines solve complex queries

by breaking them down.
# Other books by the same author

SELECT DISTINCT ?book WHERE {

bo...
Machines solve complex queries

by breaking them down.
# Books about Hamburg

SELECT DISTINCT ?book ?author WHERE {

?book...
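A minimal sketch of how a client might break the "Books about Hamburg" query down into single triple patterns. The `fragment` function stands in for the server side of a triple-pattern interface; the in-memory `DATA` list and the function names are assumptions for illustration, and the sketch leaves out paging, metadata, and hypermedia forms.

```python
DC_CREATOR = "http://purl.org/dc/terms/creator"
DC_SUBJECT = "http://purl.org/dc/terms/subject"
BOOK = "http://bib.org/books/978-1-85604-964-1/"
AUTHOR = "http://bib.org/authors/7356/"
HAMBURG = "http://dbpedia.org/resource/Hamburg"
# Hypothetical second book, added only to have more than one result row.
OTHER_BOOK = "http://bib.org/books/hypothetical-second-book/"

# Hypothetical dataset: a list of (subject, predicate, object) triples.
DATA = [
    (BOOK, DC_CREATOR, AUTHOR),
    (BOOK, DC_SUBJECT, HAMBURG),
    (OTHER_BOOK, DC_CREATOR, AUTHOR),
]


def fragment(dataset, s=None, p=None, o=None):
    """Server side: all triples matching one triple pattern (None = wildcard)."""
    return [
        t for t in dataset
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def books_about_with_authors(dataset, subject):
    """Client side: solve a two-pattern query with one simple request per pattern."""
    results = set()
    # Pattern 1: ?book dc:subject <subject>
    for book, _, _ in fragment(dataset, p=DC_SUBJECT, o=subject):
        # Pattern 2, with ?book already bound: <book> dc:creator ?author
        for _, _, author in fragment(dataset, s=book, p=DC_CREATOR):
            results.add((book, author))
    return results
```

The server only ever answers single patterns, which keeps its work per request small; the join happens on the client, at the price of more requests.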
Promises can be kept, because

the interface is intelligently light.
Publishing Linked Data

that can be queried on the We...
Promises are negotiated contracts
so they always involve trade-offs.
Querying will be slower.
clients send many requests t...
Make your Linked Data

queryable on the Web.
Several open-source implementations:

linkeddatafragments.org/software/
Query...
Changes
Constants
Promises
The Digital Cavemen

of Linked Lascaux
Identify the constants,

separate them from changes.
Satisfy Linked Data needs

with promises you can keep.
Simple enough

to be usable,
complex enough

to be useful.
Sustainability means

promising the simplest

useful complexity.
@RubenVerborgh

ruben.verborgh.org
Keynote at Semantic Web In Libraries 2015 (#SWIB15)


The Digital Cavemen of Linked Lascaux

  1. 1. The Digital Cavemen
 of Linked Lascaux Ruben Verborgh
  2. 2. The Lascaux paintings
 are 17,300 years old. How long will
 your records last?
  3. 3. by Banksy
  4. 4. by Moyan Brenn
  5. 5. SUSTAINABILITY
  6. 6. Lack of a long-term plan for SUSTAINABILITY = a threat to the Semantic Web
  7. 7. SUSTAINABILITY = making promises you can keep
  8. 8. SUSTAINABILITY = a dialog becoming a contract
  9. 9. SUSTAINABILITY = remaining constant under change
  10. 10. How can we promise to remain constant in a changing world?
  11. 11. Changes Constants Promises The Digital Cavemen
 of Linked Lascaux
  12. 12. Changes Constants Promises The Digital Cavemen
 of Linked Lascaux
  13. 13. Changes Data models Technology Interfaces
  14. 14. Changes Data models Technology Interfaces
  15. 15. The oldest data model
 is a simple table. [Figure: a table with header, row, and column] van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  16. 16. Tables do not cope well
 with changes in data or schema.
 Title                | Artist     | Born | Died
 The Thrill is Gone   | B. B. King | 1925 | 2015
 Riding with the King | John Hiatt | 1952 |
 Riding with the King | B. B. King | 1925 |
 …                    | …          | …    | …
  17. 17. Relational databases provide
 a multi-dimensional table model. [Figure: a table/entity with header, row, key column, attributes, and relation] van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  18. 18. Databases cope with data changes
 but schema changes are harder.
 Title                | Artist
 The Thrill is Gone   | 1
 Riding with the King | 2
 Riding with the King | 1
 …                    | …
 ID | Name       | Born | Died
 1  | B. B. King | 1925 | 2015
 2  | John Hiatt | 1952 |
 …  | …          | …    | …
  19. 19. There is no interoperability
 with other databases.
 Title                | Artist
 The Thrill is Gone   | 1
 Riding with the King | 2
 Riding with the King | 1
 …                    | …
 Wikipedia ?
  20. 20. XML allows reuse of schemas
 and identifiers. [Figure: XML document tree with root, parent, child, and siblings] van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  21. 21. XML schema evolution
 remains a tough nut to crack.
 Tabular data: Each data item is structured as a line of field values. Fields are the same for all items; a header line can indicate their name.
 Relational model: Data are structured as tables, each of which has its own set of attributes. Records in one table can relate to others by referencing their key column.
 Meta-markup languages: XML documents have a hierarchical structure, which gives them a tree-like appearance. Each element can …
 RDF: Each fact about a data item is expressed as a triple, which connects a subject to an object through a precise relationship.
 [Figure labels: root, parent, child, siblings; property, subject, object; ?]
  22. 22. The RDF data model is flexible
 for changes in data and schema. [Figure: RDF triple with subject, property, and object] van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  23. 23. RDF involves a trade-off
 between flexibility and reuse. custom
 ontology reuse
 ontologies perfect
 match perfect
 interoperability
  24. 24. So much for change within models…
 what about change between them? [Figure: the four data models side by side: tabular data, relational model, meta-markup languages, RDF]
  25. 25. There’s no ultimate model.
 They co-exist. Change is inherent. [Figure: the four data models side by side: tabular data, relational model, meta-markup languages, RDF]
  26. 26. Changes Data models Technology Interfaces
  27. 27. Even if your data doesn’t change,
 technology does. What happens to your data? new software versions new software manufacturers
  28. 28. Is your software
 holding your data hostage? Is your software the owner of your data? Intentional or unintentional vendor lock-in? Or are you? Can you get your data out at any moment you want?
  29. 29. The Cooper-Hewitt Design Museum had trouble getting their own data.
 Data in The Museum System: flexible, but complex relational design; no export button.
 Website had more flexible demands: complex manual queries to liberate data; parallel CMS to drive website.
  30. 30. Changes Data models Technology Interfaces
  31. 31. The Web has been designed
 with change in mind. Individual links are allowed to break
 so the entire Web does not. —Tim Berners-Lee
  32. 32. The Web evolves rapidly
 but keeps on working. What year is it? Then your users need…
 1995 – HTML 2.0
 2000 – XML
 2008 – JSON
 2012 – HTML 5
 2015 – RDF ?
 2017 – … ?
  33. 33. At least HTML seems constant,
 so the human Web is safe. http://bib.org/books/978-1-85604-964-1/ around 2005: made in HTML 4 around 2015: made in HTML 5 Markup changes, the identifier does not. Tim Berners-Lee called these “Cool URIs”.
  34. 34. Web APIs for machines suffer
 from changes on many levels. http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP How does this identifier cope with change? How long does this identifier work unchanged?
  35. 35. Web APIs for machines suffer
 from changes on many levels. http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
 dependency on server technology
 dependency on API version
 dependency on representation
 dependency on API key
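As a rough illustration, the four dependencies can be read off the example URL mechanically. The sketch below assumes they correspond to the `.php` extension, the `v2` path segment, the `format` parameter, and the `apikey` parameter; the function name `volatile_parts` and its heuristics are hypothetical and only cover this example.

```python
from urllib.parse import urlsplit, parse_qs

API_URL = ("http://api.bib.org/v2/viewBookDetails.php"
           "?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP")
COOL_URI = "http://bib.org/books/978-1-85604-964-1/"


def volatile_parts(url):
    """List the parts of a URL that tie it to changeable technology."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    segments = [s for s in parts.path.split("/") if s]
    found = []
    # A path segment such as "v2" pins the identifier to one API version.
    for segment in segments:
        if len(segment) > 1 and segment[0] == "v" and segment[1:].isdigit():
            found.append(("API version", segment))
    # A file extension such as ".php" exposes the server technology.
    if segments and "." in segments[-1]:
        found.append(("server technology", segments[-1].rsplit(".", 1)[1]))
    # Query parameters that select a format or carry a key are not part
    # of the identity of the book itself.
    if "format" in query:
        found.append(("representation", query["format"][0]))
    if "apikey" in query:
        found.append(("API key", query["apikey"][0]))
    return found
```

Under these assumptions, `volatile_parts(API_URL)` reports all four dependencies, while `volatile_parts(COOL_URI)` returns an empty list: nothing in the book URI would need to change when the server, API version, format, or key does.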
  36. 36. Plenty of excuses exist
 to change machine interfaces. But our new server does it faster! But our new API has different features! But XML is obsolete now so we need JSON!
  37. 37. Even funnier are the excuses
 for requiring API keys. But we need to rate limit! But we need to track automated access! But we need to protect our data!
  38. 38. Once and for all:
 API keys do not help with these. But we need to rate limit! But we need to track automated access! But we need to protect our data!
  39. 39. Once and for all:
 API keys do not help with these. Your HTML interface is still open! JSON is a convenience, not a necessity. Anybody can still do whatever they want
 by scraping HTML pages with the same data. Protect your data, not just one interface.
  40. 40. Yet other possible changes
 still appear to be a concern. Remain constant if your server changes? Remain constant if your API changes? Remain constant if data models change?
  41. 41. Changes Constants Promises The Digital Cavemen
 of Linked Lascaux
  42. 42. Constants URIs Ontologies Resources
  43. 43. Constants URIs Ontologies Resources
  44. 44. The RDF model is driven
 by unique identifiers. S O P
  45. 45. Constants allow clients
 to establish a shared meaning.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  46. 46. Human semantics are in concepts
 and their meaning to the world.
 S: a book
 P: written by
 O: a person
  47. 47. Machine semantics are in symbols
 and their structural interrelations.
 S: http://digybe.wpq/dgjyj-dgu7945
 P: http://yudgy.jdu/DHH8DHBtkixhj
 O: http://aole.wqq/mobd1.tihz
  48. 48. We need to be very careful
 about our choice of symbols.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  49. 49. We need to be very careful
 about our choice of symbols. http://bib.org/books/978-1-85604-964-1/ http://bib.org/authors/7356/ Is this a book
 or a description of a book? :printDate "2014-06-11" :lastModified "2015-11-25" Is this a person
 or a document? :birthDate "1987-02-28" :size "17kB"
  50. 50. Although designed for machines,
 the example only works for humans.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  51. 51. Because, somehow, Web APIs
 make machine access different.
 S: http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
 P: http://purl.org/dc/terms/creator
 O: http://api.bib.org/v2/viewAuthorProfile.php?id=7356&format=json&apikey=WSDGU56VP
  52. 52. That’s why it’s a problem if
 machines need different identifiers.
 S: http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
 P: http://purl.org/dc/terms/creator
 O: http://api.bib.org/v2/viewAuthorProfile.php?id=7356&format=json&apikey=WSDGU56VP
  53. 53. Only this triple is a global constant.
 The other is volatile and local.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  54. 54. Constants URIs Ontologies Resources
  55. 55. Fortunately, we don’t have to
 pick all the constants ourselves. Ontologies provide identifiers of concepts
 that are designed to be reused. They are necessary to make RDF work. They are necessary to create queries,
 especially over multiple datasources.
  56. 56. Of course, we get the benefits
 only if we actually reuse. Why have our own my:writtenBy property
 when dc:creator already exists? Maybe we have a more specific meaning? We can still relate both properties with RDF. But if we all use derivatives of the constants,
 what is the value of these constants?
  57. 57. Authors are not always in control:
 external semantic drift happens. foaf:knows was bidirectional… spec: “some level of reciprocity” An foaf:knows Pete Peter foaf:knows An …until somebody modeled Twitter followers Pete follows Angela Merkel Pete knows Angela Yet Angela doesn’t know Pete…
  58. 58. Getting close to Derrida… but we’re not philosophers. There are only two hard things
 in Computer Science:
 cache invalidation and naming things. —Phil Karlton
  59. 59. Constants URIs Ontologies Resources
  60. 60. The constants you can touch
 are the constants you can trust. No matter how hard technology changes,
 the books we describe remain the same. Any mechanism of identification
 should be based on domain resources,
 not on inevitably changing technology.
  61. 61. The “success” story
 of the Web API community. [Chart: number of indexed Web APIs in ProgrammableWeb, 2005 to 2015; plotted values: 186; 1,263; 2,418; 5,018; 7,182; 10,302; 12,559]
  62. 62. Just imagine we had
 15,000 different data models. [Chart: number of indexed Web APIs in ProgrammableWeb, 2005 to 2015; plotted values: 186; 1,263; 2,418; 5,018; 7,182; 10,302; 12,559]
  63. 63. Find resources in your domain
 and assign them an identifier. http://bib.org/books/978-1-85604-964-1/ http://bib.org/authors/7356/
  64. 64. It’s just like building a web site.
 When a user comes, serve HTML. http://bib.org/books/978-1-85604-964-1/ U GET HTML
  65. 65. It’s just like building a web site.
 When a client comes, serve JSON. http://bib.org/books/978-1-85604-964-1/ C GET JSON
  66. 66. It’s just like building a web site.
 When a client comes, serve RDF. http://bib.org/books/978-1-85604-964-1/ C GET RDF
  67. 67. Content negotiation has existed
 for a long time in HTTP. http://bib.org/books/978-1-85604-964-1/ C GET RDF Resource Representation
  68. 68. This allows constant URIs
 even with future changes. http://bib.org/books/978-1-85604-964-1/ C GET RDF 2.0
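The slides on serving HTML, JSON, or RDF from one URI can be sketched as a small server-side selection step. This is a simplified illustration, not a complete implementation of HTTP content negotiation: it handles q-values and `*/*`, but ignores partial wildcards such as `text/*` and media-type parameters. The `AVAILABLE` list and both function names are assumptions.

```python
# Representations the (hypothetical) server can produce for one resource,
# in the server's own order of preference.
AVAILABLE = ["text/html", "application/json", "text/turtle"]


def parse_accept(header):
    """Parse an Accept header into (media type, q) pairs, best first."""
    preferences = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type:
            preferences.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(preferences, key=lambda pref: -pref[1])


def negotiate(accept, available):
    """Pick the representation to serve for one URI, or None if none fits."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue  # q=0 means "explicitly not acceptable"
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None
```

A browser sending `Accept: text/html` and a machine client sending `Accept: text/turtle` both request http://bib.org/books/978-1-85604-964-1/ and each gets a suitable representation. Supporting a future format would mean adding an entry to `AVAILABLE`, not minting a new URI.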
  69. 69. It enables different users and
 machines to talk about things. http://bib.org/books/978-1-85604-964-1/ C U C
  70. 70. The best API is no API. Your website is already an API. Developers like to build complicated APIs. API keys are especially cool to build. Every feature and change comes with a high cost. If you ask for an API, you’ll get one. Ask for new representations
 of your resources instead.
  71. 71. Changes Constants Promises The Digital Cavemen
 of Linked Lascaux
  72. 72. Promises Web Data Integration Scalability
  73. 73. Promises Web Data Integration Scalability
  74. 74. The Semantic Web promised
 data on the Web. 85,567,007,302 triples from 3,426 datasets LODStats 38,606,408,765 from 657,896 entries LOD Laundromat
  75. 75. How much of this data
 can we readily access? data dumps Linked Data documents SPARQL endpoints
  76. 76. A data dump means downloading everything and querying locally.
  77. 77. A data dump means downloading everything and querying locally. When was the last time
 you downloaded the full Wikipedia
 just because you had one question?
  78. 78. Dumps are not Web querying.
 It’s kind of like giving up. Semantic Web Semantic Basement? What advantage do we have
 compared to Big Data? Still the RDF data model… But the major difference is Web.
  79. 79. Linked Data documents
 allow you to traverse a dataset.
  80. 80. Linked Data documents
 allow you to traverse a dataset. That’s similar to what we also do:
 consume information on Wikipedia
 by following links.
  81. 81. Much Linked Data is available
 using the well-known principles. Servers publish a light-weight interface. Clients follow their nose
 to retrieve information.
  82. 82. Linked Data documents allow
 query evaluation on the Web. # Other books by the same author
 SELECT DISTINCT ?book WHERE {
 books:85604 dc:creator ?author.
 ?book dc:creator ?author.
 }
  83. 83. Some queries are hard
 or impossible to evaluate. # Books about Hamburg
 SELECT DISTINCT ?book ?author WHERE {
 ?book dc:subject dbpedia:Hamburg.
 ?book dc:creator ?author.
 }
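A small sketch of why the first query works by following links while the second is hard. It assumes a toy Web in which each URI dereferences to a document containing a few triples; the `WEB` dictionary, the second book URI, and the assumption that the author document lists the author's books are all hypothetical.

```python
DC_CREATOR = "http://purl.org/dc/terms/creator"
DC_SUBJECT = "http://purl.org/dc/terms/subject"
BOOK = "http://bib.org/books/978-1-85604-964-1/"
OTHER_BOOK = "http://bib.org/books/hypothetical-second-book/"
AUTHOR = "http://bib.org/authors/7356/"
HAMBURG = "http://dbpedia.org/resource/Hamburg"

# Toy Web: URI -> triples found in the document served at that URI.
WEB = {
    BOOK: [(BOOK, DC_CREATOR, AUTHOR), (BOOK, DC_SUBJECT, HAMBURG)],
    OTHER_BOOK: [(OTHER_BOOK, DC_CREATOR, AUTHOR)],
    # Assumed: the author document also lists the books by this author.
    AUTHOR: [(BOOK, DC_CREATOR, AUTHOR), (OTHER_BOOK, DC_CREATOR, AUTHOR)],
    # Assumed: the remote Hamburg document knows nothing about bib.org.
    HAMBURG: [],
}


def dereference(uri):
    """Stand-in for an HTTP GET that returns the triples of one document."""
    return WEB.get(uri, [])


def other_books_by_same_author(book):
    """Follow the link from the book to its author, then to the author's books."""
    authors = {o for s, p, o in dereference(book) if s == book and p == DC_CREATOR}
    books = set()
    for author in authors:
        for s, p, o in dereference(author):
            if p == DC_CREATOR and o == author:
                books.add(s)
    return books


def books_about(subject):
    """Only finds books that the subject's own document links back to."""
    return {s for s, p, o in dereference(subject) if p == DC_SUBJECT and o == subject}
```

In this toy setup, `other_books_by_same_author(BOOK)` finds both books (like the SPARQL query, it includes the starting book), whereas `books_about(HAMBURG)` comes back empty: the only known URI in that query lives on another server, and its document has no link pointing to the books.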
  84. 84. SPARQL endpoints allow you
 to ask any question you want.
  85. 85. SPARQL endpoints allow you
 to ask any question you want. When was the last time
 you expected Wikipedia to answer
 specific questions automatically for you?
  86. 86. A public SPARQL endpoint
 happily answers this query. # Other books by the same author
 SELECT DISTINCT ?book WHERE {
 books:85604 dc:creator ?author.
 ?book dc:creator ?author.
 }
  87. 87. A public SPARQL endpoint also
 happily answers this query. # Books about Hamburg
 SELECT DISTINCT ?book ?author WHERE {
 ?book dc:subject dbpedia:Hamburg.
 ?book dc:creator ?author.
 }
  88. 88. A public SPARQL endpoint also
 happily answers this query…
 SELECT DISTINCT ?drug ?drug1 ?drug2 ?drug3 ?drug4 ?d1 WHERE {
 ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antibiotics> .
 ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antiviralAgents> .
 ?drug3 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antihypertensiveAgents> .
 ?drug4 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/anti-bacterialAgents> .
 ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps1 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr1 .
 ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
 ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o2 .
 ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g2 .
 ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l2 .
 ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw2 .
  89. 89. There’s a price to pay for being
 the most expressive HTTP interface.
 The majority of public SPARQL endpoints have less than 95% uptime.
 This means they are unavailable for more than 1.5 days each month.
 This means we cannot rely on them to build Linked Data applications.
 Buil-Aranda – Hogan – Umbrich – Vandenbussche
 SPARQL Web-Querying Infrastructure: Ready for Action?
  90. 90. Promises Web Data Integration Scalability
  91. 91. The main promise of Linked Data
 is integration, preserving semantics.
 [Figure: an RDF triple with subject, property, and object]
  92. 92. Integration is the promise.
 But does it work on the Web?
 data dumps, Linked Data documents, SPARQL endpoints
  93. 93. With data dumps, we just
 build a bigger basement. How far do we go? How do we keep data up to date?
  94. 94. With Linked Data documents,
 we keep on following our nose. There are no dataset boundaries. Some queries will remain hard.
  95. 95. With public SPARQL endpoints,
 problems become worse.
 1 endpoint has 95% availability: 1.5 days down each month.
 2 endpoints have 90% availability: 3 days down each month.
 3 endpoints have 85% availability: 4.5 days down each month.
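 The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, assuming that endpoints fail independently of each other and that a month has 30 days; neither assumption is stated on the slide, and the function names are hypothetical. Under independence the exact products are 90.25% and about 85.7%, which the slide rounds to 90% and 85%.

```python
def combined_availability(single: float, endpoints: int) -> float:
    """Probability that all endpoints are up at the same time.

    Assumes independent failures (an assumption, not a measured fact).
    """
    return single ** endpoints


def downtime_days(availability: float, days_in_month: float = 30.0) -> float:
    """Expected days per month during which a query over all endpoints fails."""
    return (1.0 - availability) * days_in_month


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = combined_availability(0.95, n)
        print(f"{n} endpoint(s): {a:.2%} available, "
              f"{downtime_days(a):.2f} days down per month")
```

 A query that needs every endpoint to respond is only as available as the product of the individual availabilities, so each extra source makes the application less reliable.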
  96. 96. Promises Web Data Integration Scalability
  97. 97. Can we think differently
 about Linked Data on the Web?
 [Figure: data dump, Linked Data documents, and SPARQL endpoint placed along axes of server cost, client cost, availability, bandwidth, and data freshness (out-of-date vs. live)]
  98. 98. Can we think differently
 about Linked Data on the Web?
 [Figure: the same axis with data dump, Linked Data documents, and SPARQL endpoint; question marks in the gaps between them]
  99. 99. Let us combine the lessons on
 changes, constants, and promises.
 An interface that withstands change:
 simple enough that it doesn’t break,
 complex enough to query.
  100. 100. Let us combine the lessons on
 changes, constants, and promises. Data dumps contain too much. SPARQL endpoint results are too specific. Linked Data documents are unidirectional.
  101. 101. Each interface divides a dataset into Linked Data Fragments.
 Data dumps: 1 huge fragment
 SPARQL endpoints: ∞ specific fragments
 Linked Data documents: 1 fragment per subject
  102. 102. Can we find a new interface
 with a sustainable balance? Triple Pattern Fragments:
 1 fragment per subject / predicate / object
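 A triple pattern fragment can be pictured as a simple selection over the dataset. The sketch below is an in-memory illustration only, not how any particular server is implemented; the toy dataset, the `fragment` function, and the use of `None` as a wildcard are all assumptions made for this example.

```python
from typing import Optional

Triple = tuple[str, str, str]
Pattern = tuple[Optional[str], Optional[str], Optional[str]]

# Hypothetical toy dataset; the identifiers are illustrative only.
DATASET: list[Triple] = [
    ("books:1", "dc:creator", "authors:a"),
    ("books:1", "dc:subject", "dbpedia:Hamburg"),
    ("books:2", "dc:creator", "authors:a"),
    ("books:3", "dc:subject", "dbpedia:Hamburg"),
]


def fragment(dataset: list[Triple], pattern: Pattern) -> list[Triple]:
    """Return all triples matching the pattern; None means 'any value'."""
    return [
        triple
        for triple in dataset
        if all(want is None or want == have
               for want, have in zip(pattern, triple))
    ]


if __name__ == "__main__":
    # Everything about Hamburg: fixed predicate and object, any subject.
    print(fragment(DATASET, (None, "dc:subject", "dbpedia:Hamburg")))
    # Fixing only the subject roughly resembles a Linked Data document.
    print(fragment(DATASET, ("books:1", None, None)))
    # Leaving all three positions open roughly resembles a data dump.
    print(fragment(DATASET, (None, None, None)))
```

 The server only ever has to answer this one kind of request, which is what keeps the interface light.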
  103. 103. Browse a dataset by triple pattern—
 no less, no more.
  104. 104. Machines can access
 the exact same interface as RDF.
  105. 105. Triple Pattern Fragments extend
 Linked Data documents with forms. That’s even more similar to what we do:
 consume information on Wikipedia
 by following links and using forms.
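 Filling in such a form amounts to building a URL from the pattern's fixed parts. The sketch below assumes the form fields are called `subject`, `predicate`, and `object`, and uses a hypothetical base URL; both are assumptions for illustration. A real client would read the field names from the form in the server's response rather than hard-code them.

```python
from typing import Optional
from urllib.parse import urlencode


def fragment_url(base: str,
                 subject: Optional[str] = None,
                 predicate: Optional[str] = None,
                 obj: Optional[str] = None) -> str:
    """Build the URL of a fragment by filling in the (assumed) form fields.

    Fields left as None are omitted, which means 'match anything'.
    """
    fields = {"subject": subject, "predicate": predicate, "object": obj}
    filled = {name: value for name, value in fields.items() if value is not None}
    return f"{base}?{urlencode(filled)}" if filled else base


if __name__ == "__main__":
    # http://example.org/dataset is a placeholder, not a real service.
    print(fragment_url("http://example.org/dataset", predicate="dc:subject"))
```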
  106. 106. Machines solve complex queries
 by breaking them down.
 # Other books by the same author
 SELECT DISTINCT ?book WHERE {
 books:85604 dc:creator ?author.
 ?book dc:creator ?author.
 }
  107. 107. Machines solve complex queries
 by breaking them down.
 # Books about Hamburg
 SELECT DISTINCT ?book ?author WHERE {
 ?book dc:subject dbpedia:Hamburg.
 ?book dc:creator ?author.
 }
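 One way a client could break such a query down is to request one triple pattern at a time and substitute the values it finds into the remaining patterns. The following is a minimal sketch under stated assumptions: `fetch_fragment` stands in for an HTTP request to a server and here just filters a hypothetical in-memory dataset, and the simple left-to-right nested loop is an illustrative strategy, not necessarily the one real clients use. It does not deduplicate results the way `DISTINCT` would.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical toy data standing in for a remote dataset.
DATASET: list[Triple] = [
    ("books:1", "dc:creator", "authors:a"),
    ("books:1", "dc:subject", "dbpedia:Hamburg"),
    ("books:2", "dc:creator", "authors:a"),
    ("books:3", "dc:creator", "authors:b"),
    ("books:3", "dc:subject", "dbpedia:Hamburg"),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def fetch_fragment(pattern: tuple[Optional[str], ...]) -> list[Triple]:
    """Placeholder for one request to a server; None is a wildcard."""
    return [t for t in DATASET
            if all(p is None or p == v for p, v in zip(pattern, t))]


def solve(patterns: list[Triple],
          bindings: Optional[dict[str, str]] = None) -> Iterator[dict[str, str]]:
    """Answer a list of triple patterns using only single-pattern requests."""
    bindings = bindings or {}
    if not patterns:
        yield dict(bindings)
        return
    first, rest = patterns[0], patterns[1:]
    # Replace already-bound variables by their values; unbound ones become wildcards.
    concrete = tuple(bindings.get(t) if is_var(t) else t for t in first)
    for triple in fetch_fragment(concrete):
        extended = dict(bindings)
        # Bind each variable; reject the triple if a repeated variable disagrees.
        if all(extended.setdefault(term, value) == value
               for term, value in zip(first, triple) if is_var(term)):
            yield from solve(rest, extended)


if __name__ == "__main__":
    query = [("?book", "dc:subject", "dbpedia:Hamburg"),
             ("?book", "dc:creator", "?author")]
    for solution in solve(query):
        print(solution)
```

 The many small requests this produces are the reason querying is slower for the client, while each individual request stays cheap for the server.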
  108. 108. Promises can be kept, because
 the interface is intelligently light. Publishing Linked Data
 that can be queried on the Web
 is realistic because the workload is divided. The server doesn’t even need a triplestore. Since the client is in charge,
 querying multiple sources is easy.
  109. 109. Promises are negotiated contracts, so they always involve trade-offs.
 Querying will be slower: clients send many requests to answer a query.
 Query times are more consistent:
 0.3 seconds with a SPARQL endpoint… 95% of the time
 3 seconds with Triple Pattern Fragments… 99.9% of the time
 Experiment with more complex interfaces.
  110. 110. Make your Linked Data
 queryable on the Web. Several open-source implementations:
 linkeddatafragments.org/software/ Query one or multiple sources online:
 client.linkeddatafragments.org Example: bit.ly/harvard-hamburg
  111. 111. Changes Constants Promises The Digital Cavemen
 of Linked Lascaux
  112. 112. Identify the constants,
 separate them from changes. Satisfy Linked Data needs
 with promises you can keep.
  113. 113. Simple enough
 to be usable, complex enough
 to be useful.
  114. 114. Sustainability means
 promising the simplest
 useful complexity.
  115. 115. @RubenVerborgh
 ruben.verborgh.org

Keynote at Semantic Web In Libraries 2015 (#SWIB15)
