Querying data on the Web: client or server?
Ruben Verborgh
Ghent University – iMinds
Slides for my talk at the CrEDIBLE 2014 workshop.
https://credible.i3s.unice.fr/doku.php?id=2014_workshop

  1. Querying data on the Web: client or server? Ruben Verborgh, Ghent University – iMinds
  2. The current Semantic Web has many implicit assumptions. We should be able to answer all queries. Complexity is more important than availability. Data servers need to be expensive.
  3. Those assumptions are not necessarily wrong. They’re also not necessarily the only possible ones.
  4. Some queries are hard to answer. Availability is a top priority. Low-cost data servers have potential. Let’s rethink our assumptions, just to see what’s possible.
  5. Different assumptions lead to a different Semantic Web. Maybe they bring us closer to the Web We Want.
  6. …but what do we want?
  7. Querying data on the Web: client or server? Outline: The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  8. (1) Clients need a different protocol.
  9. The Web for humans offers an HTTP interface to HTML. [Diagram labels: client, data, HTTP, HTML]
  10. The Web for applications offers an HTTP interface to JSON. [Diagram labels: client, data, HTTP, JSON]
  11. The Web for applications offers an HTTP interface to RDF. [Diagram labels: client, data, HTTP, RDF]
  12. The Web for applications offers a SPARQL interface to RDF. [Diagram labels: client, data, HTTP, RDF, SPARQL]
  13. Documents need a new language. Semantic Web clients were perceived as very limited. Querying needs a new protocol. …unlike “simple” JSON clients.
  14. (1) Clients need a different protocol. (2) Live queries require that protocol.
  15. There are 3 common ways to publish Linked Data: public SPARQL endpoints, Linked Data documents, and downloadable data dumps.
  16. Public SPARQL endpoints offer a very powerful interface… and that’s not always a good thing. Clients can ask any query… if the endpoint is available. Hosting an endpoint is costly.
  17. Linked Data documents seem to work like the Web. Low-cost to host. Solve queries by traversing links. Many queries cannot be solved.
  18. Downloadable data dumps have high availability. Set up your own endpoint. Data is not live. You’re not really querying the Web.
  19. (1) Clients need a different protocol. (2) Live queries require that protocol. (3) Clients can request any query.
  20. The query language abstracts away the steps needed to solve a query. In SPARQL, asking a simple query is as easy as asking a difficult one. In contrast to the rest of the Web, clients are in control.
  21. With a JSON interface, the server decides how clients access data. [Diagram labels: client, data, HTTP, JSON]
  22. With a SPARQL interface, clients decide how they access data. [Diagram labels: client, data, HTTP, RDF, SPARQL]
  23. Clients can ask anything, including queries that bring servers down. The majority of public SPARQL endpoints have less than 95% availability. That means the endpoint (and thus your application) doesn’t work 1.5 days each month.
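
The downtime figure on slide 23 follows from simple arithmetic. The snippet below is a minimal sketch of that calculation; the function name and the 30-day month are assumptions made for illustration and do not come from the slides.

```python
def monthly_downtime_days(availability: float, days_in_month: int = 30) -> float:
    """Return the expected number of days per month a service is unavailable."""
    return (1.0 - availability) * days_in_month


# 95% availability over an assumed 30-day month.
downtime = monthly_downtime_days(0.95)
print(f"{downtime:.1f} days of downtime per month")
```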
  24. “If you have operational need for SPARQL accessible data, you must have your own infrastructure. No public endpoints. Public endpoints are for lookups and discovery; sort of a dataset demo.” (Orri Erling, OpenLink, 2014)
  25. SEMANTIC things we happen to have downloaded from the WEB
  26. If you want to study a subject on Wikipedia, do you download all 4,614,000 articles first?
  27. (1) Clients need a different protocol. (2) Live queries require that protocol. (3) Clients can request any query.
  28. Querying data on the Web: client or server? Outline: The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  29. Any fragment of a Linked Data set is called a Linked Data Fragment. [Diagram: an axis from high client effort to high server effort. Interfaces shown with their selectors: data dump (all), dereferencing (subject), SPARQL endpoint (SPARQL query).]
  30. Each type of Linked Data Fragment is defined by three characteristics.
      selector: What data does it contain?
      metadata: What do we know about it?
      controls: What can we do next?
  31. Each type of Linked Data Fragment is defined by three characteristics. SPARQL CONSTRUCT result:
      selector: a SPARQL query
      metadata: (none)
      controls: (none)
  32. Each type of Linked Data Fragment is defined by three characteristics. Linked Data Document:
      selector: a specific entity
      metadata: creator, maintainer, …
      controls: links to other LD documents
  33. Each type of Linked Data Fragment is defined by three characteristics. Data dump:
      selector: everything
      metadata: number of triples, file size
      controls: (none)
  34. Can we query fragments that balance client and server effort? [Diagram: an axis from high client effort to high server effort. Interfaces shown with their selectors: data dump (all), dereferencing (subject), SPARQL endpoint (SPARQL query), and the new triple pattern fragments (triple pattern).]
  35. Triple pattern fragments are cheap yet enable efficient querying.
      selector: triple pattern
      metadata: total number of matches
      controls: access to all other fragments
  36. [Annotated screenshot of a fragment: data (first 100), metadata (total count), controls (other fragments)]
  37. Other APIs exist, but are specific. Triple pattern fragment servers enable clients to execute queries. Triple patterns work on all datasets. Combine data, metadata & controls.
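
To make the three parts of a triple pattern fragment concrete, here is a minimal in-memory sketch. It is not the author's server implementation: the `Fragment` class, the `get_fragment` function, the paging scheme and the toy dataset are all assumptions for illustration. Only the idea of combining data (one page of matches), metadata (total count) and controls (a way to reach further fragments) comes from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
# A pattern position set to None acts as a wildcard (hypothetical convention).
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


@dataclass
class Fragment:
    data: List[Triple]        # data: one page of matching triples
    total_count: int          # metadata: total number of matches
    next_page: Optional[int]  # control: page to request next, if any


def get_fragment(dataset: List[Triple], pattern: Pattern,
                 page: int = 1, page_size: int = 100) -> Fragment:
    """Select all triples matching the pattern and return one page of them."""
    matches = [t for t in dataset
               if all(p is None or p == v for p, v in zip(pattern, t))]
    start = (page - 1) * page_size
    end = start + page_size
    has_more = end < len(matches)
    return Fragment(matches[start:end], len(matches), page + 1 if has_more else None)


# Toy dataset; the identifiers are borrowed from the slides' examples.
DATASET: List[Triple] = [
    ("dbpedia:John_Flaxman", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:John_Flaxman", "dbpedia-owl:birthPlace", "dbpedia:York"),
    ("dbpedia:Joseph_Hansom", "dbpedia-owl:birthPlace", "dbpedia:York"),
    ("dbpedia:York", "foaf:name", '"York"@en'),
]

born_in_york = (None, "dbpedia-owl:birthPlace", "dbpedia:York")
first_page = get_fragment(DATASET, born_in_york, page_size=1)
print(first_page.total_count, first_page.next_page)
```

The server only ever filters by a single pattern and slices a page, which is why such an interface is cheap to host; everything else is left to the client.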
  38. How to answer this query using only triple pattern fragments?
      SELECT ?person ?city WHERE {
        ?person a dbpedia-owl:Artist.
        ?person dbpedia-owl:birthPlace ?city.
        ?city foaf:name "York"@en.
      }
  39. Get the corresponding fragments.
      ?person a dbpedia-owl:Artist.
        dbpedia:Aamir_Zaki a dbpedia-owl:Artist.
        dbpedia:Ahmad_Morid a dbpedia-owl:Artist.
        …
      ?person dbpedia-owl:birthPlace ?city.
        dbpedia:Ganesh_Ghosh …:birthPlace dbpedia:Bengal_Presidency.
        dbpedia:Jacques_L'enfant …:birthPlace dbpedia:Beauce.
        …
      ?city foaf:name "York"@en.
        dbpedia:York foaf:name "York"@en.
        dbpedia:York,_Ontario foaf:name "York"@en.
        …
  40. Get the corresponding fragments and read the count metadata.
      ?person a dbpedia-owl:Artist.   ±61,000
        dbpedia:Aamir_Zaki a dbpedia-owl:Artist.
        dbpedia:Ahmad_Morid a dbpedia-owl:Artist.
        …
      ?person dbpedia-owl:birthPlace ?city.   ±470,000
        dbpedia:Ganesh_Ghosh …:birthPlace dbpedia:Bengal_Presidency.
        dbpedia:Jacques_L'enfant …:birthPlace dbpedia:Beauce.
        …
      ?city foaf:name "York"@en.   12
        dbpedia:York foaf:name "York"@en.
        dbpedia:York,_Ontario foaf:name "York"@en.
        …
  41. Start with the smallest fragment. Start with the first match. [Same three fragments as on slide 40, partly covered on the slide. The smallest is ?city foaf:name "York"@en. with 12 matches; its first match is dbpedia:York foaf:name "York"@en.]
  42. How to answer this query using only triple pattern fragments?
      SELECT ?person WHERE {
        ?person a dbpedia-owl:Artist.
        ?person dbpedia-owl:birthPlace dbpedia:York.
        dbpedia:York foaf:name "York"@en.
      }
  43. Get the corresponding fragments.
      ?person a dbpedia-owl:Artist.
        dbpedia:Aamir_Zaki a dbpedia-owl:Artist.
        dbpedia:Ahmad_Morid a dbpedia-owl:Artist.
        …
      ?person dbpo:birthPlace dbpedia:York.
        dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
        dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York.
        …
  44. Get the corresponding fragments and read the count metadata.
      ?person a dbpedia-owl:Artist.   ±61,000
        dbpedia:Aamir_Zaki a dbpedia-owl:Artist.
        dbpedia:Ahmad_Morid a dbpedia-owl:Artist.
        …
      ?person dbpo:birthPlace dbpedia:York.   75
        dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
        dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York.
        …
  45. Start with the smallest fragment. Start with the first match. [Same two fragments as on slide 44, partly covered on the slide. The smallest is ?person dbpo:birthPlace dbpedia:York. with 75 matches; its first match is dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.]
  46. How to answer this query using only triple pattern fragments?
      ASK {
        dbp:John_Flaxman a dbpo:Artist.
        dbp:John_Flaxman dbpo:birthPlace dbp:York.
        dbp:York foaf:name "York"@en.
      }
  47. Get the corresponding fragment and read the count metadata.
      dbpedia:John_Flaxman a dbpedia-owl:Artist.   1
        dbpedia:John_Flaxman a dbpedia-owl:Artist.
      Output the match: ?person = dbpedia:John_Flaxman, ?city = dbpedia:York
  48. Recursively repeat the process for all bindings.
      ?person dbpo:birthPlace dbpedia:York.
        dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
        dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York.
        …
      ?city foaf:name "York"@en.
        dbpedia:York foaf:name "York"@en.
        dbpedia:York,_Ontario foaf:name "York"@en.
        …
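
The walkthrough on slides 38 to 48 can be sketched as a small recursive procedure: fetch the fragment of every pattern, read the counts, pick the smallest fragment, bind its matches one by one, and recurse on the remaining patterns. The code below is a simplified sketch under stated assumptions, not the author's client. The `fragment` function stands in for an HTTP request to a fragment server, paging is ignored, a variable repeated within one pattern is not handled, and the toy dataset reuses identifiers from the slides with made-up completeness.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]

# Toy dataset standing in for a remote triple pattern fragment server.
DATASET: List[Triple] = [
    ("dbpedia:John_Flaxman", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:Aamir_Zaki", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:Ahmad_Morid", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:John_Flaxman", "dbpedia-owl:birthPlace", "dbpedia:York"),
    ("dbpedia:Joseph_Hansom", "dbpedia-owl:birthPlace", "dbpedia:York"),
    ("dbpedia:Ganesh_Ghosh", "dbpedia-owl:birthPlace", "dbpedia:Bengal_Presidency"),
    ("dbpedia:York", "foaf:name", '"York"@en'),
    ("dbpedia:York,_Ontario", "foaf:name", '"York"@en'),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def substitute(pattern: Triple, bindings: Bindings) -> Triple:
    """Replace already-bound variables in a pattern by their values."""
    s, p, o = (bindings.get(t, t) if is_var(t) else t for t in pattern)
    return (s, p, o)


def fragment(pattern: Triple) -> Tuple[List[Triple], int]:
    """Hypothetical stand-in for one request: (matching triples, total count)."""
    matches = [t for t in DATASET
               if all(is_var(p) or p == v for p, v in zip(pattern, t))]
    return matches, len(matches)


def solve(patterns: List[Triple], bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    bindings = bindings or {}
    if not patterns:
        yield bindings  # every pattern is satisfied: output the match
        return
    bound = [substitute(p, bindings) for p in patterns]
    fragments = [fragment(p) for p in bound]
    counts = [count for _, count in fragments]
    if 0 in counts:
        return  # some pattern has no matches, so this branch yields nothing
    i = counts.index(min(counts))  # start with the smallest fragment
    rest = bound[:i] + bound[i + 1:]
    for triple in fragments[i][0]:  # start with the first match, then the others
        extended = dict(bindings)
        extended.update({t: v for t, v in zip(bound[i], triple) if is_var(t)})
        yield from solve(rest, extended)


query = [
    ("?person", "rdf:type", "dbpedia-owl:Artist"),
    ("?person", "dbpedia-owl:birthPlace", "?city"),
    ("?city", "foaf:name", '"York"@en'),
]
results = list(solve(query))
print(results)
```

Because `solve` is a generator, each solution is handed to the caller as soon as it is found, without waiting for the rest of the search.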
  49. This way of querying changes the usual assumptions. Use the Web’s protocol, HTTP. Don’t be smart; enable intelligence. Some queries will be hard / slow.
  50. Querying semantic datasources means managing expectations. [Diagram: data dump, dereferencing, triple pattern fragments and SPARQL endpoint on an axis from high client effort to high server effort. Further axis labels: low availability / high availability, low freshness / speed, high freshness / speed.]
  51. Querying data on the Web: client or server? Outline: The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  52. Coupling access and processing leads to low availability. [Figure: one SPARQL server serving many clients.] (a) SPARQL endpoints perform all processing on the server, leading to fast query execution with low data bandwidth, and a rapidly overloaded server.
  53. Enabling clients to query leads to high scalability. [Figure: one LDF server serving many clients.] (b) LDF servers only support simple requests and can thus handle far higher loads. Clients perform the querying, so they need more (cacheable) data.
  54. Show a sorted list of molecules that match certain characteristics. [Diagram labels: Molecules, endpoint approach, fragment approach]
  55. Endpoint approach: show a sorted list of molecules that match certain characteristics. [Diagram labels: Molecules, SPARQL endpoint]
  56. Endpoint approach: show a sorted list of molecules that match certain characteristics.
      SELECT DISTINCT(?mol) MIN(?name)
      WHERE {
        ?mol rdfs:label ?name;
        …
        …
      }
      ORDER BY ?name
  57. Endpoint approach: show a sorted list of molecules that match certain characteristics. (Same query as on slide 56.)
  58. Endpoint approach: show a sorted list of molecules that match certain characteristics. Consequences:
      DISTINCT: keep all results in memory
      MIN: keep all results in memory, blocking
      ORDER BY: keep all results in memory, blocking
      Doesn’t matter; we’re waiting anyway.
  59. Fragments approach: show a sorted list of molecules that match certain characteristics. No blocking operators; streaming matters.
      SELECT ?mol ?name WHERE {
        ?mol rdfs:label ?name;
        …
        …
      }
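
The difference between streaming and blocking operators can be illustrated with a generator. This is a sketch only: the molecule identifiers and names are made up, and `molecule_results` is a hypothetical stand-in for results trickling in from fragment requests. Taking the first streamed result touches a single item, while sorting must consume every result before it can return anything.

```python
from typing import Iterator, List, Tuple

# Made-up results in arrival order (assumed data, not from the slides).
ARRIVAL_ORDER: List[Tuple[str, str]] = [
    ("ex:m3", "water"),
    ("ex:m1", "ethanol"),
    ("ex:m2", "benzene"),
]


def molecule_results(log: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield results one by one, recording each fetch in the given log."""
    for mol, name in ARRIVAL_ORDER:
        log.append(f"fetched {name}")
        yield mol, name


# Streaming: the first result is usable after a single fetch.
streamed_log: List[str] = []
first_streamed = next(molecule_results(streamed_log))

# Blocking: sorting cannot emit anything until all results have arrived.
sorted_log: List[str] = []
ordered = sorted(molecule_results(sorted_log), key=lambda row: row[1])

print(first_streamed, len(streamed_log))
print(ordered[0], len(sorted_log))
```

A client that streams can show partial results immediately and sort them locally as they come in, which is one way to read the slide's "streaming matters".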
  60. Fragments approach: show a sorted list of molecules that match certain characteristics. [Diagram: three sources labeled Molecules]
  61. The algorithm remains the same when clients use one or multiple triple pattern fragment servers. Federation also becomes substantially easier. Avoid the unavailability cascade.
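
One way to read slide 61 is that a client can treat several servers as a single source: the fragment for a pattern over a federation is the concatenation of the per-server fragments, and its count is the sum of the per-server counts. The sketch below assumes exactly that; the function names and the two toy servers are hypothetical, and real clients may combine sources differently.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]  # None = wildcard


def fragment(dataset: List[Triple], pattern: Pattern) -> Tuple[List[Triple], int]:
    """Stand-in for one server's response: (matching triples, total count)."""
    matches = [t for t in dataset
               if all(p is None or p == v for p, v in zip(pattern, t))]
    return matches, len(matches)


def federated_fragment(servers: List[List[Triple]],
                       pattern: Pattern) -> Tuple[List[Triple], int]:
    """Ask every server for the same pattern and merge data and counts."""
    data: List[Triple] = []
    total = 0
    for dataset in servers:
        matches, count = fragment(dataset, pattern)
        data.extend(matches)
        total += count
    return data, total


# Two made-up servers, each holding part of the molecule labels.
SERVER_A: List[Triple] = [("ex:m1", "rdfs:label", "ethanol")]
SERVER_B: List[Triple] = [
    ("ex:m2", "rdfs:label", "benzene"),
    ("ex:m2", "ex:weight", "78.11"),
]

labels, label_count = federated_fragment([SERVER_A, SERVER_B], (None, "rdfs:label", None))
print(label_count, labels)
```

Since the merged result has the same shape as a single server's fragment, a query algorithm built on fragments does not need to change when sources are added.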
  62. An optimal solution doesn’t exist. We should look at all APIs. [Diagram labels: data dump, dereferencing, triple pattern fragments, SPARQL endpoint]
  63. Servers indicate what they do, enabling clients to query optimally.
      “This server supports triple patterns and full-text search on objects.”
      “This server supports SPARQL queries with up to 2 joins.”
      “This server supports Linked Data documents.”
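
A client could use such self-descriptions to pick an execution strategy. The sketch below is purely illustrative: the `Capabilities` fields, the `plan` function and the preference order (server-side SPARQL first, then triple patterns, then link traversal) are assumptions, not a description of any existing vocabulary or client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical summary of what a server says it supports."""
    triple_patterns: bool = False
    linked_data_documents: bool = False
    max_sparql_joins: Optional[int] = None  # None means no SPARQL support


def plan(query_joins: int, caps: Capabilities) -> str:
    """Choose where to execute a query with the given number of joins."""
    if caps.max_sparql_joins is not None and query_joins <= caps.max_sparql_joins:
        return "send the whole query to the server"
    if caps.triple_patterns:
        return "execute client-side over triple pattern fragments"
    if caps.linked_data_documents:
        return "execute client-side by traversing links"
    return "no supported strategy"


limited_sparql = Capabilities(triple_patterns=True, max_sparql_joins=2)
print(plan(1, limited_sparql))
print(plan(5, limited_sparql))
```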
  64. Querying data on the Web: client or server? Outline: The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  65. Different assumptions lead to different trade-offs. Live querying of public data is possible at low cost, but at slower speeds… for now :-)
  66. Let your browser solve a SPARQL query: client.linkeddatafragments.org. Ruben Verborgh, Ghent University – iMinds