A practical guide on how to query and visualize Linked Open Data with the eea.daviz Plone add-on.
This presentation introduces Linked Open Data and where it is applied. We will see how to query this large open data cloud over the web with the SPARQL query language. We will then go through real examples and create interactive, live data visualizations with full data traceability using eea.sparql and eea.daviz.
Presented at the PLOG2013 conference http://www.coactivate.org/projects/plog2013
Visualize open data with Plone - eea.daviz PLOG 2013
1. Visualizing Open Data with Plone
a practical guide on how to query and visualize Linked Open Data
with eea.daviz product
Antonio De Marinis
Web Technology Management
European Environment Agency
www.eea.europa.eu
2. Linked Data evolution
2007
as per 2011
keeps growing...
> 1 million datasets
Watch video STRATA conference 2013
3. Open Data - what is it?
Open data is a philosophy and practice requiring
that certain data be freely available to everyone,
without restrictions from copyright, patents or
other mechanisms of control.
Linked Open Data (LOD) or simply Linked
Data is a technique to interlink all open datasets
into a web of data, aka semantic web, using
technologies like RDF and SPARQL.
5. SPARQL query structure
A SPARQL query comprises, in order:
● Prefix declarations, for abbreviating URIs
● A result clause, identifying what information to return from the query
● Dataset definition, stating what RDF graph(s) are being queried
● The query pattern, specifying what to query for in the underlying dataset
● Query modifiers, slicing, ordering, and otherwise rearranging query results
# prefix declarations
PREFIX foo: <http://example.com/resources/>
...
# result clause
SELECT ...
# dataset definition
FROM ...
# query pattern
WHERE {
...
}
# query modifiers
ORDER BY ...
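The clause ordering can be made concrete with a small Python sketch that assembles a query string from its parts. The helper name `build_query` and its parameters are hypothetical, chosen only for illustration; it is not part of eea.sparql or any SPARQL library, and it performs no validation. Note that it emits SELECT before FROM, which is the order the SPARQL grammar expects.

```python
def build_query(prefixes, select, where, dataset=None, modifiers=()):
    """Assemble a SPARQL SELECT query (hypothetical helper, no validation).

    prefixes:  dict mapping a prefix name to its namespace URI
    select:    text of the result clause, e.g. "?subject ?label"
    where:     list of triple patterns / FILTER lines for the query pattern
    dataset:   optional graph URI for the FROM clause
    modifiers: optional lines such as "ORDER BY ?label" or "LIMIT 5"
    """
    # prefix declarations
    lines = [f"PREFIX {name}: <{uri}>" for name, uri in prefixes.items()]
    # result clause
    lines.append(f"SELECT {select}")
    # dataset definition (optional)
    if dataset:
        lines.append(f"FROM <{dataset}>")
    # query pattern
    lines.append("WHERE {")
    lines.extend(f"  {pattern}" for pattern in where)
    lines.append("}")
    # query modifiers
    lines.extend(modifiers)
    return "\n".join(lines)


query = build_query(
    prefixes={"rdfs": "http://www.w3.org/2000/01/rdf-schema#"},
    select="?subject ?label",
    where=["?subject rdfs:label ?label ."],
    modifiers=["ORDER BY ?label", "LIMIT 5"],
)
print(query)
```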
7. Let's dive into a real example
SELECT * WHERE {
?subject rdf:type <http://dbpedia.org/ontology/City>.
?subject rdfs:label ?label.
?subject rdfs:comment ?abstract.
?subject <http://dbpedia.org/ontology/populationTotal> ?populationTotal.
FILTER (lang(?label) = "en" && lang(?abstract) = "en" && (?populationTotal >= "5000000"^^xsd:integer))
} LIMIT 5
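A query like the one above is sent to an endpoint over plain HTTP. The following is a minimal Python sketch of that step, assuming DBpedia's public endpoint at http://dbpedia.org/sparql and the standard SPARQL Protocol `query` parameter. The function name `make_request` is hypothetical. The PREFIX lines are added here so the query is self-contained; the slide omits them, presumably relying on the endpoint predefining them. The request is only built, not sent, so the snippet runs without network access.

```python
import urllib.parse
import urllib.request

# Assumption: DBpedia's public SPARQL endpoint.
ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT * WHERE {
  ?subject rdf:type <http://dbpedia.org/ontology/City> .
  ?subject rdfs:label ?label .
  ?subject rdfs:comment ?abstract .
  ?subject <http://dbpedia.org/ontology/populationTotal> ?populationTotal .
  FILTER (lang(?label) = "en" && lang(?abstract) = "en"
          && ?populationTotal >= "5000000"^^xsd:integer)
} LIMIT 5
"""


def make_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request."""
    # The query travels URL-encoded in the "query" parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = make_request(ENDPOINT, QUERY)
print(request.full_url[:60] + "...")
# To execute for real (needs network access):
#     with urllib.request.urlopen(request) as response:
#         body = response.read().decode("utf-8")
```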
8. Let's dive into a real example
PREFIX o: <http://dbpedia.org/ontology/>
PREFIX p: <http://dbpedia.org/property/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT DISTINCT * WHERE {
?subject a o:City.
?subject rdfs:label ?label.
OPTIONAL {?subject rdfs:comment ?abstract.}
?subject p:populationTotal ?populationTotal.
OPTIONAL {?subject geo:lat ?latitude.}
OPTIONAL {?subject geo:long ?longitude.}
FILTER (lang(?label) = "en" && lang(?abstract) = "en" && (?populationTotal >= "5000000"^^xsd:integer
&& ?populationTotal < "60000000"^^xsd:integer))
}
ORDER BY DESC(?populationTotal)
Find all available properties by exploring DBpedia, e.g. http://dbpedia.org/page/Tokyo
Example without duplicates: http://daviz.eionet.europa.eu/data/local-sparql-queries/most-populated-cities
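Because some variables in the query above sit inside OPTIONAL, a result row may simply lack them. A minimal Python sketch of how a client could flatten the standard SPARQL JSON results format into plain rows, filling unbound variables with `None`, is shown below. The function name `rows_from_results` is hypothetical and the sample response is made-up illustration data, not real DBpedia output.

```python
import json

# Made-up sample in the W3C SPARQL JSON results layout (not real data).
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["subject", "label", "latitude", "longitude"]},
  "results": {"bindings": [
    {"subject": {"type": "uri", "value": "http://example.org/city/A"},
     "label": {"type": "literal", "xml:lang": "en", "value": "City A"},
     "latitude": {"type": "literal", "value": "10.5"},
     "longitude": {"type": "literal", "value": "20.25"}},
    {"subject": {"type": "uri", "value": "http://example.org/city/B"},
     "label": {"type": "literal", "xml:lang": "en", "value": "City B"}}
  ]}
}
"""


def rows_from_results(text):
    """Turn a SPARQL JSON result document into a list of flat dicts."""
    data = json.loads(text)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # Variables left unbound (e.g. by OPTIONAL) are absent from the
        # binding, so they are mapped to None here.
        rows.append({
            var: binding[var]["value"] if var in binding else None
            for var in variables
        })
    return rows


rows = rows_from_results(SAMPLE_RESPONSE)
for row in rows:
    print(row["label"], row["latitude"], row["longitude"])
```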
9. Corresponding data visualisation with Daviz
We have been able to create a data visualisation of
open linked data with filters/facets entirely through
the web in about 10 minutes!
live demo http://www.eea.europa.eu/sandbox/plog2013/most-populated-cities-with-coordinates-plus
10. Removing redundancies
TIP: To get rid of some redundancy, you can use "SAMPLE" or "SELECT DISTINCT"
live example http://daviz.eionet.europa.eu/visualisations/most-populated-cities
PREFIX o: <http://dbpedia.org/ontology/>
PREFIX p: <http://dbpedia.org/property/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?subject (sql:SAMPLE(?subject) as ?city)
(sql:SAMPLE(?label) as ?label)
(sql:SAMPLE(?latitude) as ?latitude)
(sql:SAMPLE(?longitude) as ?longitude)
(max(?populationTotal) as ?maxPopulation)
(max(?rainyDays) as ?rainyDays)
WHERE {
?subject a o:City.
?subject rdfs:label ?label.
OPTIONAL {?subject rdfs:comment ?abstract.}
?subject p:populationTotal ?populationTotal.
OPTIONAL {?subject geo:lat ?latitude.}
OPTIONAL {?subject geo:long ?longitude.}
OPTIONAL {?subject p:yearPrecipitationDays ?rainyDays.}
FILTER (lang(?label) = "en" && lang(?abstract) = "en" && (?populationTotal >= "5000000"^^xsd:integer))
}
GROUP BY ?subject
ORDER BY DESC(?maxPopulation)
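To make the effect of GROUP BY with SAMPLE and MAX easier to see, here is a rough Python equivalent that collapses duplicate rows per subject. It is only an illustration of the idea: the function name `dedupe_by_subject` is hypothetical, the rows are made-up data, and where SPARQL's SAMPLE may return any value of the group, this sketch keeps the first non-missing one.

```python
def dedupe_by_subject(rows):
    """Collapse rows sharing a subject, mimicking GROUP BY ?subject."""
    grouped = {}
    for row in rows:
        entry = grouped.get(row["subject"])
        if entry is None:
            # First row seen for this subject starts the group.
            grouped[row["subject"]] = dict(row)
            continue
        # SAMPLE-like behaviour: keep the first non-missing value.
        for key in ("label", "latitude", "longitude"):
            if entry.get(key) is None:
                entry[key] = row.get(key)
        # MAX-like behaviour for the population figure.
        entry["population"] = max(entry["population"], row["population"])
    # ORDER BY DESC(?maxPopulation)
    return sorted(grouped.values(), key=lambda r: r["population"], reverse=True)


# Made-up rows: "ex:A" appears twice with slightly different values.
raw_rows = [
    {"subject": "ex:A", "label": "City A", "latitude": 10.5,
     "longitude": 20.25, "population": 6000000},
    {"subject": "ex:A", "label": "City A", "latitude": 10.6,
     "longitude": 20.25, "population": 6500000},
    {"subject": "ex:B", "label": "City B", "latitude": None,
     "longitude": None, "population": 9000000},
]

cities = dedupe_by_subject(raw_rows)
for city in cities:
    print(city["subject"], city["population"])
```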
14. More resources
● SPARQL endpoints and their status: http://labs.mondeca.com/sparqlEndpointsStatus/index.html
● SPARQL tutorial by example: http://www.cambridgesemantics.com/semantic-university/sparql-by-example
● The eea.sparql package gives you a SPARQL client and data holder for Plone, available on PyPI
● The eea.daviz bundle includes eea.sparql and the visualisation tools