The Future is Federated

Invited talk at #VIVO16


  1. The future is federated. Ruben Verborgh
  2. I think Big Data is boring.
  3. Big Data thrives on centralization.
  4. Knowledge is inherently distributed.
  5. Knowledge is inherently heterogeneous.
  6. Knowledge on the Web is inherently linked.
  7. Centralization skips the most interesting problems.
  8. Where to find the data you need? How to access them? How to integrate them?
  9. Let’s create smart apps over VIVO and Web data.
  10. You’ll get to see 3 things: a light interface to VIVO data, queries over that interface, and an app built on such queries.
  11. We can integrate multiple data sources on the live Web, but we need to set our expectations right.
  12. The future is federated: Big Data fails at Web scale. Light interfaces rule. Engineer for serendipity.
  13. The future is federated: Big Data fails at Web scale. Light interfaces rule. Engineer for serendipity.
  14. RDF: the data language
  15. A triple: <subject> <predicate> <object>.
  16. SPARQL: the query language
  17. SPARQL: the protocol
  18. A client sends a SPARQL query to a SPARQL endpoint over the SPARQL protocol.
  19. “Hey, SPARQL endpoint…”
      SELECT ?person ?name WHERE {
        ?person a dbo:Scientist.
        ?person rdfs:label ?name.
        ?person dbo:birthPlace dbp:Denver.
      }
      “Sure!”
  20. “Hey, SPARQL endpoint…”
      SELECT DISTINCT ?drug ?drug1 ?drug2 ?drug3 ?drug4 ?d1 WHERE {
        ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugban
        ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugban
        ?drug3 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugban
        ?drug4 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugban
        ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps1 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr1 .
        ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
        ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o2 .
        ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g2 .
        ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l2 .
        ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw2 .
        ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp2 .
        ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn2 .
        ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps2 .
        ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr2 .
        ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o2 .
      “Sure!”
  21. SPARQL endpoints try to be the Web’s Big Data processors, for free.
  22. Can I SPARQL your endpoint? Few endpoints exist, and the average endpoint is down for 1.5 days/month.
  23. Big Data fails at Web scale, because Web scale is much bigger.
  24. The Semantic Web shouldn’t try to compete with Big Data.
  25. I want to put the Web back into Semantic Web. It’s our main differentiator from Big Data.
  26. If it’s not Web, I’m not interested. That’s why I think Big Data is boring.
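
Slides 17 to 19 describe a client handing a query to an endpoint over the SPARQL protocol. As a rough sketch of what that request can look like, the Python below URL-encodes the slide's query into the `query` parameter of an HTTP GET, one of the ways the SPARQL 1.1 Protocol allows a query to be sent. The endpoint URL is hypothetical, and the prefix IRIs are assumptions, since the slides do not declare them; `dbp:` is bound to a resource namespace here only so that `dbp:Denver` names a resource. Nothing is sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "https://example.org/sparql"

# The slide's query. The PREFIX lines are assumptions: the slides omit them.
QUERY = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?name WHERE {
  ?person a dbo:Scientist.
  ?person rdfs:label ?name.
  ?person dbo:birthPlace dbp:Denver.
}
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The whole query travels to the server in one request, so the server carries the full evaluation cost. That is the expense the talk goes on to question.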
  27. The future is federated: Big Data fails at Web scale. Light interfaces rule. Engineer for serendipity.
  28. What would the average human do?
  29. Average human, you can use only Wikipedia:
      SELECT ?person ?name WHERE {
        ?person a dbo:Scientist.
        ?person rdfs:label ?name.
        ?person dbo:birthPlace dbp:Denver.
      }
  30. Average human: Which scientists were born in Denver? You can use only Wikipedia.
  31. Average human, using only Wikipedia: (1) visit the page about Denver; (2) make a list of people born there; (3) read their pages to see if they’re a scientist.
  32. Web linking is unidirectional: a Denver person’s page links to Denver, but Denver doesn’t necessarily link to that person.
  33. Average human, using only Wikipedia: (1) visit the page about Denver; (2) make a list of people born there; (3) read their pages to see if they’re a scientist.
  34. We need to empower the average human, but please not with a SPARQL endpoint, because they’re so expensive to keep up.
  35. What is the simplest complexity?
  36. The essence of RDF: <subject> <predicate> <object>.
  37. The essence of Linked Data: ?subject <predicate> <object>.
  38. The essence of Linked Data: Denver <predicate> <object>.
  39. The essence of TPF: ?subject ?predicate ?object.
  40. The essence of TPF: ?subject ?predicate Denver.
  41. Triple Pattern Fragments (TPF)
  42. Clients can ask the server only for triple patterns.
  43. Average human: Which scientists were born in Denver? You can only use a TPF interface of DBpedia.
  44. Average human, using only a TPF interface of DBpedia: (1) “?person birthPlace Denver.” (2) “?person type Scientist.” (3) “?person fullName ?name.”
  45. Average machine, using only a TPF interface of DBpedia: (1) “?person birthPlace Denver.” (2) “?person type Scientist.” (3) “?person fullName ?name.”
  46. Average machine, you can only use a TPF interface of DBpedia:
      SELECT ?person ?name WHERE {
        ?person a dbo:Scientist.
        ?person rdfs:label ?name.
        ?person dbo:birthPlace dbp:Denver.
      }
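
The three steps on slides 44 and 45 can be sketched in Python as a client that answers the Denver question using nothing but single triple-pattern lookups. This is a minimal in-memory illustration under stated assumptions, not the actual TPF protocol: the dataset is made up, `fragment` stands in for a server request, and a real TPF interface works over HTTP with paged responses and metadata that this sketch leaves out.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]
# None plays the role of a variable such as ?person.
Pattern = tuple[Optional[str], Optional[str], Optional[str]]

# Tiny made-up dataset; the names are illustrative, not real DBpedia data.
TRIPLES: list[Triple] = [
    ("Alice", "birthPlace", "Denver"),
    ("Alice", "type", "Scientist"),
    ("Alice", "fullName", "Alice Example"),
    ("Bob", "birthPlace", "Denver"),
    ("Bob", "type", "Musician"),
    ("Bob", "fullName", "Bob Example"),
    ("Carol", "birthPlace", "Boulder"),
    ("Carol", "type", "Scientist"),
    ("Carol", "fullName", "Carol Example"),
]


def fragment(pattern: Pattern) -> Iterator[Triple]:
    """Server side: yield every triple that matches one triple pattern."""
    for triple in TRIPLES:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple


def scientists_born_in(city: str) -> list[tuple[str, str]]:
    """Client side: combine three triple-pattern lookups into one answer."""
    results = []
    # 1. "?person birthPlace <city>."
    for person, _, _ in fragment((None, "birthPlace", city)):
        # 2. "<person> type Scientist."
        if any(fragment((person, "type", "Scientist"))):
            # 3. "<person> fullName ?name."
            for _, _, name in fragment((person, "fullName", None)):
                results.append((person, name))
    return results


print(scientists_born_in("Denver"))  # [('Alice', 'Alice Example')]
```

The server only ever filters by one pattern, while the joining happens on the client. That split is what keeps the interface light.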
  47. The future is federated: Big Data fails at Web scale. Light interfaces rule. Engineer for serendipity.
  48. Roy T. Fielding: “Engineer for serendipity.”
  49. Federated queries with SPARQL endpoints pose a problem. If 1 endpoint is down for 1.5 days each month, then 2 endpoints might be down for 3 days each month.
  50. Federated queries are native to TPF clients. Just ask each of the questions to different TPF servers.
  51. TPF trades server cost for query performance. But in federated scenarios, performance can be on par with SPARQL endpoints!
  52. Lightweight interfaces are easy to extend and combine with others. TPF is not the final solution (no API will ever be), but an excellent starting point.
  53. The Memento protocol brings time to the Web. Ask for representations at a certain point in the past.
  54. TPF and Memento are a great match. We combined them in collaboration with Herbert Van de Sompel & team at the Los Alamos National Laboratory.
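
The availability estimate on slide 49 can be worked through as arithmetic. A federated query needs every endpoint to be up at the same time, so the "3 days" figure corresponds to two outages that never overlap. The sketch below computes that worst case next to the expected value if outages were statistically independent. The 30-day month and the independence model are assumptions made for illustration; only the 1.5 days/month figure comes from the talk.

```python
DAYS_PER_MONTH = 30   # assumption: a 30-day month
DOWNTIME_DAYS = 1.5   # from the talk: average endpoint downtime per month


def worst_case_downtime(endpoints: int) -> float:
    """Days per month with at least one endpoint down if outages never overlap."""
    return min(endpoints * DOWNTIME_DAYS, DAYS_PER_MONTH)


def expected_downtime_if_independent(endpoints: int) -> float:
    """Expected days per month with at least one endpoint down,
    assuming outages are independent (an illustrative model only)."""
    up_fraction = 1 - DOWNTIME_DAYS / DAYS_PER_MONTH
    return DAYS_PER_MONTH * (1 - up_fraction ** endpoints)


for n in (1, 2, 5):
    print(n, worst_case_downtime(n), round(expected_downtime_if_independent(n), 3))
```

Under these assumptions, two endpoints give 3 days in the worst case and about 2.9 days if outages are independent. Either way, unavailability grows with every endpoint a federated query depends on.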
  55. The future is federated: Big Data fails at Web scale. Light interfaces rule. Engineer for serendipity.
  56. VIVO today (diagram): VIVO, SPARQL, TPF server, client.
  57. VIVO tomorrow? (diagram): VIVO, TPF, client.
  58. Federation is a game changer.
  59. Federation with the TPF interface is a game changer.
  60. With great power comes great responsibility.
  61. We need to be realistic about our expectations.
  62. Some queries will always be hard on an open Web. You might need centralization if you want answers fast.* (*Terms and conditions apply.)
  63. Many more queries than you’d think are pretty fast… and streaming!
  64. Open source: linkeddatafragments.org
  65. The future is federated, and it starts today. @RubenVerborgh
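
Slide 50's idea of asking each question to different TPF servers, which slide 59 calls a game changer, can be sketched as follows. The client sends every triple pattern to all servers and merges the matches, so no server needs to know about the others. This is a hedged in-memory illustration: `FragmentServer`, both datasets, and the property names are hypothetical, and real TPF servers are reached over HTTP.

```python
from typing import Iterable, Iterator, Optional

Triple = tuple[str, str, str]
# None plays the role of a variable.
Pattern = tuple[Optional[str], Optional[str], Optional[str]]


class FragmentServer:
    """Hypothetical stand-in for one TPF server holding its own triples."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self.triples = list(triples)

    def fragment(self, pattern: Pattern) -> Iterator[Triple]:
        for triple in self.triples:
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                yield triple


def federated_fragment(servers: list[FragmentServer], pattern: Pattern) -> Iterator[Triple]:
    """Ask every server for the same pattern and merge the results."""
    for server in servers:
        yield from server.fragment(pattern)


def researchers_born_in(servers: list[FragmentServer], city: str) -> list[str]:
    """Join patterns on the client; each pattern may be answered by any server."""
    names: list[str] = []
    for person, _, _ in federated_fragment(servers, (None, "birthPlace", city)):
        if any(federated_fragment(servers, (person, "type", "Researcher"))):
            names.extend(
                name for _, _, name in federated_fragment(servers, (person, "fullName", None))
            )
    return names


# Made-up data: one source knows researchers, the other knows birthplaces.
vivo_like = FragmentServer([
    ("Alice", "type", "Researcher"),
    ("Alice", "fullName", "Alice Example"),
])
web_like = FragmentServer([
    ("Alice", "birthPlace", "Denver"),
    ("Dave", "birthPlace", "Denver"),
])

print(researchers_born_in([vivo_like, web_like], "Denver"))  # ['Alice Example']
```

Neither source can answer the question alone. The answer only appears once the client combines both, which is the kind of integration the talk argues for.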