The document describes an upcoming summer school on Linked Open Data and smart cities to be held from June 7-12 in Cercedilla, Spain. It provides an agenda that will include topics like Linked Open Data guidelines for data generation, discussions, and hands-on sessions. Presenters will include researchers from universities and organizations in Spain. The goal is to discuss how open data can be used to power applications for smart cities and improve areas like transportation, accessibility, and public services.
LD4SC Summer School Guide to Generating Linked Open Data
1. LD4SC Summer School, 7th - 12th June, Cercedilla, Spain
1st Summer School on Smart Cities and Linked Open Data (LD4SC-15)
Linked Data Generation Process
Raúl García-Castro, Filip Radulovic, Oscar Corcho, María Poveda, Víctor Rodríguez-Doncel, Asunción Gómez-Pérez, Daniel Vila-Suero
Presenter: Raúl García-Castro
2. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
3. Data in smart cities
[Figure source: http://br.fiberhomegroup.com/pt/Enterprise/324/2282.aspx]
4. Open data… for what?
• For example, (re)using open transport data
– Provide travel information to persons
– Allow better multimodal route planning
– Facilitate public transport management
– …
– Accessibility
• Which metro accesses are accessible for wheelchair users?
• In which bus stops is it safer and more convenient for a wheelchair user to wait?
• Is there any accessible parking space near a bus stop?
• etc.
5. Legal framework and open data initiatives
• Aarhus Convention (1998)
– Right to participation and access; 41 countries and the EU
• Open Access Initiative (2001)
– Scientific information on the Web; > 510 organisations
• PSI Directive
– PSI Reuse (2003/98/EC)
• Convention for the access to official documents (2009)
– Signed by 12 countries
– Belgium, Finland, Norway, Sweden, Hungary, Estonia, Lithuania, Slovenia, Georgia, Montenegro, Serbia and Macedonia
• Law 37/2007. PSI Reuse
• Law 11/2007. Citizen access to public services and right to the quality of services
• RD 4/2010 National Interoperability Scheme
– Open standards
– Technology neutral
– Open source software
• RD 1495/2011. It develops Law 37/2007
• Norma Técnica de Interoperabilidad (19/02/2013, BOE 4/3/2013)
Adapted from Antonio Rodríguez Pascual (IGN)
6. The problem: lack of interoperability
[Figure: three data providers, each with its own Publish / Extract path towards an application]
• "I use GTFS"
• "I use my own CSV structure"
• "I provide a web service"
• "I want to publish data in an interoperable structure and format"
• "Build an app that is available all over the world"
7. Scenario: open transport data
Is there any open transport data already?
We are surrounded by them
8. Open data and how they are published
1) In notice boards
– For those who have a lot of free time
– Or those who are there at the right moment in time
[Figure label: DATA]
Adapted from Antonio Rodríguez Pascual (IGN)
9. Open data and how they are published
2) In web pages and mobile apps
– For people
[Figure labels: DATA; On the Web, open license]
Adapted from Antonio Rodríguez Pascual (IGN)
10. Open data and how they are published
2) In web pages and mobile apps
– For people
[Figure labels: DATA; On the Web, open license; Machine-readable; Non-proprietary format]
Adapted from Antonio Rodríguez Pascual (IGN)
11. Open data and how they are published
3) As web files
– So that they can be loaded by humans in their information systems (XML, HTML, CSV, etc.)
– Hopefully it is not a scanned PDF
[Figure labels: DATA; On the Web, open license; Machine-readable; Non-proprietary format]
Adapted from Antonio Rodríguez Pascual (IGN)
12. Open data and how they are published
4) Via web services
– For humans and machines
– It allows generating added-value services
– And can be integrated in the application business logic
[Figure labels: DATA; On the Web, open license; Machine-readable; Non-proprietary format]
Adapted from Antonio Rodríguez Pascual (IGN)
13. What is open data?
• Open data are data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.
• The most important aspects to consider:
– Availability and Access: data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the Internet. Data must also be available in a convenient and modifiable form.
– Reuse and Redistribution: data must be provided under terms that permit reuse and redistribution, including the intermixing with other datasets.
– Universal Participation: everyone must be able to use, reuse and redistribute - there should be no discrimination against fields of endeavour or against persons or groups. For example, 'non-commercial' or 'only in education' restrictions are not allowed.
Source: Open Data Handbook
14. Scenario: open transport data
Is there any open transport data already?
Can we do it better?
15. Going into 4 and 5: Linked Data
★ Make your stuff available on the Web (whatever format) under an open license
★★ Make it available as structured data (e.g., Excel instead of image scan of a table)
★★★ Use non-proprietary formats (e.g., CSV instead of Excel)
★★★★ Use URIs to identify things, so that people can point at your stuff
★★★★★ Link your data to other data to provide context
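The five levels above are cumulative, which can be sketched as a small check. This is only an illustration: the `Dataset` fields and the `star_rating` function are hypothetical names introduced here, not part of any tool mentioned in the slides.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags, one per level of the 5-star scheme.
    on_web_open_license: bool
    structured: bool
    non_proprietary_format: bool
    uses_uris: bool
    linked_to_other_data: bool


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a level only counts if all lower levels hold."""
    levels = [
        ds.on_web_open_license,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for satisfied in levels:
        if not satisfied:
            break
        stars += 1
    return stars


scanned_pdf = Dataset(True, False, False, False, False)
csv_file = Dataset(True, True, True, False, False)
linked_data = Dataset(True, True, True, True, True)
print(star_rating(scanned_pdf), star_rating(csv_file), star_rating(linked_data))
```

Under these assumptions a scanned PDF with an open license stays at one star, a CSV file reaches three, and only data with URIs and links reaches four and five.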
16. Use URIs + RDF (RDF standards)
[Figure: four sources, each converted to RDF]
• API (José; mobility impairment; boardgames; Mirasierra) → RDF: José hasImpairment Mobility Impairment; Mobility Impairment requires Wheelchair Accessibility; José likes Boardgame; José address Mirasierra
• CSV (Ventisquero de la Condesa; Yes) → RDF: Ventisquero de la Condesa hasAccessibility Wheelchair Accessibility
• CSV (Mega Games; Ventisquero de la Condesa; Yes) → RDF: Mega Games address Ventisquero de la Condesa; Mega Games hasAccessibility Wheelchair Accessibility
• HTML (Mega Games; Conquer & Smash!; 29,95) → RDF: Mega Games sells Conquer & Smash!; Conquer & Smash! is a Boardgame
17. Link your data: Linked RDF
[Figure: the same four sources (API, CSV, CSV, HTML) and their RDF graphs as in the previous slide, about to be linked]
18. Link your data: Linked RDF
[Figure: the four RDF graphs are connected through their shared nodes: Wheelchair Accessibility, Ventisquero de la Condesa, Boardgame]
19. Make complex queries
• Where can I buy the Conquer & Smash! game?
• Which are the most accessible routes for Christmas shopping?
[Figure: example answers]
– "Expansion pack for Conquer & Smash! Take metro line 9 and in 35 minutes we can demo it to you!"
– "Or better take bus 231 because it is sunny and you can take a glance at the outdoor art exhibition in Plaza de Castilla"
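A query such as "Where can I buy the Conquer & Smash! game?" can be sketched over the linked triples of the running example. This is a minimal in-memory illustration, not SPARQL: the `ex:` identifiers, the `ex:CornerShop` entry, and the helper names are assumptions made here.

```python
# Hypothetical identifiers: the "ex:" names are illustrative only.
TRIPLES = {
    ("ex:Jose", "ex:hasImpairment", "ex:MobilityImpairment"),
    ("ex:MobilityImpairment", "ex:requires", "ex:WheelchairAccessibility"),
    ("ex:Jose", "ex:likes", "ex:Boardgame"),
    ("ex:MegaGames", "ex:sells", "ex:ConquerAndSmash"),
    ("ex:MegaGames", "ex:hasAccessibility", "ex:WheelchairAccessibility"),
    ("ex:ConquerAndSmash", "rdf:type", "ex:Boardgame"),
    # Invented second shop without accessibility data, to make the filter visible.
    ("ex:CornerShop", "ex:sells", "ex:ConquerAndSmash"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def accessible_shops_selling(person, product):
    """Shops that sell the product and meet every accessibility need of the person."""
    impairments = [o for (_, _, o) in match(person, "ex:hasImpairment")]
    needs = {o for imp in impairments for (_, _, o) in match(imp, "ex:requires")}
    shops = [s for (s, _, _) in match(p="ex:sells", o=product)]
    return [s for s in shops if all(match(s, "ex:hasAccessibility", n) for n in needs)]


print(accessible_shops_selling("ex:Jose", "ex:ConquerAndSmash"))
```

The answer needs facts that came from different sources (person profile, shop catalogue, accessibility data), which is the point of linking them.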
20. Using Linked Open Transport Data
• Calculate accessible routes
– Combined with geographical data (IGN)
– Which stop should I use if I have mobility problems?
• Commercial routes by bus
– Combined with Madrid's shop census (from Ayto. Madrid)
• Geomarketing decisions for entrepreneurs
– Where should I open my shop? Based on the combination of the number of travellers per stop, demographic data, data about other businesses and shops around, etc.
• Personalised offers to travellers
– With real-time data and data about consumption patterns (e.g., credit card transactions)
• …
21. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
22. Linked Data life cycle
[Figure: cycle with the phases Specification, Modelling, Generation, Publication, Exploitation, Linking]
23. Requirements (smart cities domain)
1. Tabular formats (i.e., SQL, XLS or CSV)
– Other data structures (e.g., XML) are less important in practice, or are unstructured and would require much more work
2. Changing data (dynamic or streaming data), versioning, (automatic) data quality assurance and reliability
3. Data access through web services, proprietary APIs and data files
4. Legal aspects (e.g., licensing, data ownership)
5. Access rights management or mechanisms for extracting public data (plenty of confidential data)
24. Linked Data generation process
[Figure: process steps and the output of each step]
• Select data source → Data source
• Obtain access to data source → Access, data
• Analyse licensing of the data source → License
• Analyse data source → Schema, data
• Define resource naming strategy → Resource naming strategy
• Develop ontology → Ontology
• Transform data source → RDF data
• Link with other datasets → Linked dataset
Source: F. Radulovic, M. Poveda-Villalón, D. Vila-Suero, V. Rodríguez-Doncel, R. García-Castro and A. Gómez-Pérez, Guidelines for Linked Data generation and publication: An example in building energy consumption, Automation in Construction, Special Issue on Linked Data in Architecture and Construction. Available online April 2015.
25. Linked Data generation process
[Figure: the process diagram of slide 24, with the DATA PREPARATION steps highlighted]
26. Select data source
• Select the data source that will be transformed into Linked Data
• Steps:
– To define the requirements for selection
– To select one or several data sources
• The data set may be:
– Owned by your organization…
– … or not (external data sources)
27. Select data source - Example
• Requirements
– Real-world scenario in the smart city domain
– Available for use
– Available in machine-processable format (the more structured the data are, the better)
– Can be linked with generic entities (e.g., location)
• Leeds City Council - energy consumption
– http://data.gov.uk/dataset/council-energy-consumption
28. Obtain access to data source
• Data access means
– Technical means to retrieve the data
– Legal rights to use the data
• If the data is not accessible:
– To identify the person to contact
– To request the access
– To obtain access and to retrieve the data
• Access alternatives:
– file,
– programming interface,
– database,
– data stream,
– etc.
29. Obtain access to data source - Example
• Data set already available as a CSV file
30. Analysing licensing of the data source
• Licenses specify the legal terms under which a data set can be used and exploited
• There are neither legal prescriptions on how to declare licenses nor common standard practices to do so
• Steps (not automatable):
– To identify the rightsholder and the authoritative publisher
• Rightsholder vs. authorized distributor
– To find the applicable license
• Web page, data set metadata, data themselves
• Contact the publisher
– To read the license and analyse legal terms
• Tips
– Analysis should be performed upon all copies and formats of the data
– Ensure license compatibility when integrating several data sources
31. Linked Data resources can be protected
• Ontologies are intellectual works; they can be protected by copyright
• RDF datasets can be considered as databases, also legally protected in the EU
32. Create, consume, aggregate, derive and publish Linked Data in a lawful environment
0. Always license your data
[Figure: actors exchanging Linked Data: data shops, government, individuals, …]
33. Licensed Linked Data
• Non-licensed Linked Data: unless there is a license that allows it, the resource cannot be copied, modified or published. In practice, non-licensed resources are useless in industrial settings.
• Non-licensed Linked Data + License = Licensed Linked Data
• Licensed Linked Data can be used
34. Licensed Linked Data in practice
• Linked Open Data: published, open license
• (Published) Linked Data: published, no open license
• Linked Data: not published, no open license
35. Guidelines for licensing linked data
1. Add "rights" metadata in the dataset description (e.g., VoID, DCAT)
2. Use standard predicates to declare "rights" statements (e.g., Dublin Core terms: dc:rights, dct:license)
3. Is a standard license available?
– Yes: use the URI of the standard license (e.g., CC0)
– No: use a rights declaration language (e.g., ODRL)
Abbreviations: ODRL = Open Digital Rights Language; DCAT = Data Catalog Vocabulary
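Steps 1 to 3 of the licensing guidelines can be sketched by emitting a small DCAT description that points to a standard license URI with dct:license. The function name, the dataset URI and the title below are hypothetical; only the DCAT and Dublin Core namespaces and the CC0 URI are standard.

```python
CC0 = "http://creativecommons.org/publicdomain/zero/1.0/"


def describe_dataset(dataset_uri: str, title: str, license_uri: str) -> str:
    """Build a Turtle description declaring the dataset license (illustrative sketch)."""
    # Escape backslashes and quotes so the title is a valid Turtle string literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a dcat:Dataset ;",
        f'    dct:title "{safe_title}" ;',
        f"    dct:license <{license_uri}> .",
    ])


# Hypothetical dataset URI and title, used only for illustration.
turtle = describe_dataset("http://example.org/dataset/ds01", "Council energy consumption", CC0)
print(turtle)
```

When no standard license fits, the same dct:license statement could instead point to a custom policy, for example one written in ODRL.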
36. Licensing Linked Data is simple…
[Example: The British National Bibliography (BNB) lists the books and new journal titles published or distributed in the United Kingdom and Ireland since 1950.]
37. … or complex, depending on your needs
• Policies can be expressed with ODRL 2.0 to govern access to Linked Data
• Example of access to Linked Data for a price (15 EUR for the dataset or 0.01 EUR for a triple thereof)
@prefix gr: <http://purl.org/goodrelations/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
# Prefixes below are used in the policy but were not declared on the slide.
@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://salonica.dia.fi.upm.es/ldr/policy/cdaddba4-fc2e-4ee0-a784-e62f1db259bf>
    a odrl:Set ;
    rdfs:label "License Offering Paid Linked Data" ;
    # First permission: reproduce the whole dataset for 15,00 EUR.
    odrl:permission [ a odrl:Permission ;
        odrl:target <http://example.org/dataset/ds01> ;
        odrl:action odrl:reproduce ;
        odrl:duty [ a odrl:Duty ;
            rdfs:label "Pay" ;
            gr:UnitOfMeasurement dcat:Dataset ;
            gr:amountOfThisGood "1" ;
            odrl:action odrl:pay ;
            odrl:target "15,00 EUR"
        ]
    ] , [ a odrl:Permission ;
        # Second permission: reproduce single triples for 0,01 EUR each.
        odrl:action odrl:reproduce ;
        odrl:target <http://example.org/dataset/ds01> ;
        odrl:duty [ a odrl:Duty ;
            rdfs:label "Pay" ;
            gr:UnitOfMeasurement rdf:Statement ;
            gr:amountOfThisGood "1" ;
            odrl:action odrl:pay ;
            odrl:target "0,01 EUR"
        ]
    ] .
The target can be an ontology, a dataset, a SPARQL endpoint, a SPARQL query itself, or a triple pattern: {mysubject, ?p, ?o}
38. And you have support for that
• Conditional access to Linked Data
– http://conditional.linkeddata.es
• Dataset of licenses in RDF
– http://rdflicense.appspot.com
• ODRL Profile for Linked Data
– http://purl.oclc.org/NET/ldr/ns#
– https://www.w3.org/community/odrl/profile/linkeddata/
40. Analyse data source
• Get insight into the data structure and organization
• Steps:
– To analyse the characteristics of the data
• Data values, data ranges, etc.
– To obtain the schema of the data
• Concepts and their relationships
• Data can be available as:
– Structured data
– Unstructured data
• If the schema does not exist:
– Use a standard modeling language for describing the data schema (e.g., UML)
41. Analyse data source - Example
• Metadata not quite descriptive:
– Different types of council sites (mostly buildings)
– Electricity, gas and oil consumptions
– 1-year intervals: 2010/11, 2011/12, 2012/13
• The analysis required contacting people from LCC open data
42. Analyse data source - Example
[Screenshot: tool running at http://localhost:3333/]
43. Analyse data source - Example
44. Analyse data source - Example
• Analyse the characteristics of data using facets
• Obtain the schema of the data
45. Data characteristics and schema - Example
Column | Type | Comments / Range (rounded) | Problems
uprn | String | Not unique, empty values |
Site Name | String | Unique? Site types + name | 4 repeated sites
Address 2 | String | Not unique, empty values |
Address 3 | String | Not unique, empty values | Village? Civil Parish?
Address 4 | String | Not unique, empty values | City? Metropolitan district? "leeds" vs "Leeds"
PostCode | String | Not unique, empty values |
Electricity 10/11 | Decimal | 0 to 2,700,000 |
Electricity 11/12 | Decimal | 0 to 2,300,000 |
Electricity 12/13 | Decimal | 0 to 2,400,000 |
Gas 10/11 | Decimal | -100,000 to 6,100,000 | Negative values
Gas 11/12 | Decimal | -100,000 to 7,800,000 | Negative values
Gas 12/13 | Decimal | -100,000 to 8,300,000 | Negative values
Oil 12/13 | Decimal | -1,000,000 to 13,000,000 | Negative values
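A profile like the one in the table above (uniqueness, empty values, ranges, negative values) can be approximated with a few lines of standard-library code. The sample rows and the `profile` function are invented for illustration; they are not taken from the Leeds dataset.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Invented sample rows that mimic the kinds of problems listed in the table.
SAMPLE = """uprn,Site Name,Address 4,Gas 12/13
1001,Library A,Leeds,1200.5
1001,Museum B,leeds,-100
,Depot C,,300
"""


def profile(rows, column):
    """Summarise one column: empty cells, uniqueness, numeric range, negatives."""
    values = [row[column].strip() for row in rows]
    non_empty = [v for v in values if v]
    numbers = []
    for v in non_empty:
        try:
            numbers.append(Decimal(v))
        except InvalidOperation:
            pass  # not a number, e.g. a site name
    return {
        "empty": len(values) - len(non_empty),
        "unique": len(set(non_empty)) == len(non_empty),
        "min": min(numbers) if numbers else None,
        "max": max(numbers) if numbers else None,
        "negative": sum(1 for n in numbers if n < 0),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for column in rows[0]:
    print(column, profile(rows, column))
```

Facets in an interactive tool give the same information visually; a script is mainly useful when the analysis has to be repeated on new versions of the data.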
46. Linked Data generation process
[Figure: the process diagram of slide 24, with the step DEFINE RESOURCE NAMING STRATEGY highlighted]
47. Hash and slash URIs
• Hash URIs (#)
– http://www.energycompany.com/about#energyCompany
– The fragment part has to be stripped off when the URI is requested from the server (i.e., the resource cannot be retrieved directly)
– Hash URIs can be used to identify non-document resources
• Slash URIs (/)
– http://www.energycompany.com/about/energyCompany
– Imply a 303 redirection to the location of a document that represents the resource (+ content negotiation)
• E.g., http://www.energycompany.com/about/energyCompany.rdf
– Drawbacks: HTTP round-trip, redirects, web server configuration
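The difference between the two URI forms can be seen by looking at what a client actually sends to the server. The sketch below uses `urllib.parse.urldefrag` from the standard library; `request_target` and `describing_document` are hypothetical helpers, and the `.rdf` suffix simply follows the example URL on the slide.

```python
from urllib.parse import urldefrag

HASH_URI = "http://www.energycompany.com/about#energyCompany"
SLASH_URI = "http://www.energycompany.com/about/energyCompany"


def request_target(uri: str):
    """Return (URL sent to the server, fragment kept on the client side)."""
    url, fragment = urldefrag(uri)
    return url, fragment


def describing_document(slash_uri: str, extension: str = ".rdf") -> str:
    """Location a server might answer with in a 303 redirect (assumed convention)."""
    return slash_uri + extension


print(request_target(HASH_URI))
print(request_target(SLASH_URI))
print(describing_document(SLASH_URI))
```

With the hash URI the server only ever sees the document URL, so all resources in that namespace arrive in one response; with the slash URI each resource can be requested, and redirected, on its own.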
48. Hash or slash?
• Depends on the data and on their expected use
• Small data:
– Hash namespace
– Access all the data as a whole
– HTTP GET would return a single information resource with everything
• Large / frequently-updated / modular data:
– Slash namespace
– Access resources individually or in groups
– Resource descriptions may be divided among many information resources or may be managed via a query service (e.g., SPARQL)
– Progressively greater detail about resources may be retrieved through multiple accesses
49. Define resource naming strategy
• Steps:
– To choose a URI form (hash or slash)
– To choose a domain for the URIs
– To choose a path for the URIs
– To choose a pattern for ontology classes and properties in the ontology, as well as for individuals
• Tips:
– One URI must identify only one item (e.g., avoid mixing web pages and real-world objects)
– URIs should be persistent and should not change over time (e.g., state information); PURL may support this
– Use a domain that is under your control (or a service such as PURL)
– Separate the ontology model from its instances
– Define meaningful URIs
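A naming strategy following the steps and tips above can be captured in two small functions, one for ontology terms and one for instances. The base domain, the paths and the patterns below are assumptions chosen for illustration (a hash namespace for the ontology, a slash namespace for resources); they are not the strategy used in the course example.

```python
import re
import unicodedata

# Hypothetical domain and paths; in practice use a domain under your control.
BASE = "http://example.org/energy/"
ONTOLOGY_NS = BASE + "ontology#"   # hash namespace for the (small) ontology
RESOURCE_NS = BASE + "resource/"   # slash namespace for the (large) instance data


def slug(label: str) -> str:
    """Turn a human label into a stable, readable URI segment."""
    ascii_label = unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_label.lower()).strip("-")


def ontology_uri(term: str) -> str:
    """Pattern for classes and properties: <base>/ontology#<Term>."""
    return ONTOLOGY_NS + term


def resource_uri(class_name: str, label: str) -> str:
    """Pattern for individuals: <base>/resource/<Class>/<slug>."""
    return f"{RESOURCE_NS}{class_name}/{slug(label)}"


print(ontology_uri("Site"))
print(resource_uri("Site", "Leeds Town Hall"))
```

Keeping both patterns in one place makes it easier to apply the same strategy later, when the mapping from the source data to RDF is written.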
51. Linked Data generation process
[Figure: the process diagram of slide 24, with the step DEVELOP ONTOLOGY highlighted]
52. Ontology development
1. Requirements definition
2. Terms extraction
3. Ontology conceptualization
3.1 Initial model drafting
3.2 Detailed model definition
4. Ontology search
5. Ontology selection
6. Ontology implementation
6.1 Ontology integration
6.2 Ontology completion
7. Ontology evaluation
Check: can you represent all your data?
Note: you did this yesterday
53. Ontology development - Example
54. Linked Data generation process
[Figure: the process diagram of slide 24, with the step TRANSFORM DATA highlighted]
55. Data transformation
• Steps:
– To select the RDF serialization
• RDF/XML, Turtle, N-Triples, JSON-LD
– To select a tool. Depends on:
• The format of the data (database, spreadsheets, etc.)
• Concrete needs of the transformation process (e.g., dynamicity)
– To transform the data into RDF
• Usually requires a mapping between the data and the ontology
• The mapping implements the resource naming strategy
– To evaluate the obtained RDF data:
• Syntax, Completeness, Accuracy, Conciseness, Modelling, Understandability, Versatility, Usage, Licensing, …
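The mapping step can be illustrated with a hand-written transformation of one CSV row into N-Triples. This is a sketch under stated assumptions: the namespaces, the `Site` class, the `electricity1213` property and the sample row are hypothetical; a real project would more likely use one of the dedicated tools.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
# Hypothetical namespaces standing in for the resource naming strategy.
ONT = "http://example.org/energy/ontology#"
RES = "http://example.org/energy/resource/site/"

# Invented sample row.
SAMPLE = "uprn,Site Name,Electricity 12/13\n1001,Central Library,2400.5\n"


def literal(value: str, datatype: str = "") -> str:
    """Serialise a value as an N-Triples literal, optionally typed."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"^^<{datatype}>" if datatype else "")


def row_to_ntriples(row: dict) -> list:
    """Map one CSV row to triples; the subject URI comes from the uprn column."""
    subject = f"<{RES}{row['uprn']}>"
    return [
        f"{subject} <{RDF_TYPE}> <{ONT}Site> .",
        f"{subject} <{RDFS_LABEL}> {literal(row['Site Name'])} .",
        f"{subject} <{ONT}electricity1213> {literal(row['Electricity 12/13'], XSD_DECIMAL)} .",
    ]


triples = [t for row in csv.DictReader(io.StringIO(SAMPLE)) for t in row_to_ntriples(row)]
print("\n".join(triples))
```

Note that the uprn column would need cleaning first, since a subject URI cannot be built from an empty or repeated identifier.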
56. Data transformation tools
• Database to RDF: morph-RDB, D2R Server, TopBraid Composer
• Data streams to RDF: morph-streams, D2R Server
• Spreadsheets to RDF: TopBraid Composer, Excel2RDF, RDF123, XLWrap, OpenRefine/LODRefine
• XML to RDF: XML2RDF, TopBraid Composer, OpenRefine/LODRefine
57. Data transformation tools
• Database to RDF: morph-RDB, D2R Server, TopBraid Composer
• Data streams to RDF: morph-streams, D2R Server
• Spreadsheets to RDF: TopBraid Composer, Excel2RDF, RDF123, XLWrap, OpenRefine/LODRefine
• XML to RDF: XML2RDF, TopBraid Composer, OpenRefine/LODRefine
Next: overview of OpenRefine
58. OpenRefine basic operations
• Installing
• Creating a new project
• Data analysis
– Exploring data
– Sorting data
– Faceting data
– Filtering data
• Basic data transformation (cleaning/preparing)
– Columns:
• Move
• Rename
• Remove columns
• Collapse and expand
• Common transformations
– Rows:
• Remove rows
• Export whole project
59. Adding derived columns
Edit column → Add column based on this column...
60. Splitting data across columns
Edit column → Split into several columns...
62. Rows and records
Show as: rows | records
[Screenshot labels: Record, Row]
63. Clustering similar cells
Edit cells → Cluster and edit...
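The idea behind clustering similar cells can be sketched with a simplified key-collision ("fingerprint") method: values that normalise to the same key are grouped as candidates for merging. This is an illustration of the general technique, not OpenRefine's actual implementation, and the sample cell values are invented.

```python
import re
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Normalise a cell: trim, lowercase, drop punctuation, sort unique tokens."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group distinct values that share a fingerprint; singletons are dropped."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Invented cell values, echoing the "leeds" vs "Leeds" problem.
cells = ["Leeds", "leeds", "Wakefield", "Town Hall, Leeds", "Leeds Town Hall"]
print(cluster(cells))
```

Each cluster still needs a human decision about which spelling to keep, which is why the tool presents clusters for review rather than merging them silently.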
64. Transposing rows and columns
Transpose → Transpose cells across columns into rows...
Transpose → Columnize by key/value columns...
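Transposing cells across columns into rows turns one wide row per site into one row per site and period, which is usually easier to map to RDF. The sketch below shows the reshaping itself; the function name, its parameters and the sample row are hypothetical.

```python
def transpose_columns_to_rows(rows, id_column, value_columns, key_name, value_name):
    """Emit one output row per (input row, value column) pair."""
    result = []
    for row in rows:
        for column in value_columns:
            result.append({
                id_column: row[id_column],
                key_name: column,
                value_name: row[column],
            })
    return result


# Invented wide-format row with one column per period.
wide = [{"Site Name": "Central Library", "Electricity 10/11": "2700", "Electricity 11/12": "2300"}]
long_rows = transpose_columns_to_rows(
    wide, "Site Name", ["Electricity 10/11", "Electricity 11/12"], "Period", "Electricity"
)
for r in long_rows:
    print(r)
```

The inverse operation, columnizing by key/value columns, rebuilds the wide layout from such key and value columns.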
65. Other useful utilities
• Regular expressions
– Java regular expressions
• Custom transformations
– General Refine Expression Language (GREL)
– Jython (Python implemented in Java)
– Clojure (functional language that resembles Lisp)
66. Using the project history
• Project history:
– Access operation history
– Undo operations
– Extract operations (in JSON)
– Apply operations
• Caution:
– Transformations are registered in the history; filters and facets are not
73. Evaluating the exported data
• Manual inspection
• Syntax evaluation (with syntax validator)
• Consistency with the ontologies (with reasoner)
• Usage evaluation (e.g., by running SPARQL queries)
– Show all electricity consumptions and the related time periods for all council sites related to culture
– Show all energy consumptions and the related time periods of council sites from the Wakefield district
74. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
76. Linked Data are just data
[Figure: three plots of electrical consumption per building (electric1011, electric1112, electric1213); scatter plot "Electricity vs gas consumption 12/13"; scatter plot "Electricity vs oil consumption 12/13"]
77. Benefits of linking data
[Figure: map of total electric consumption, from original data + geolocation]
[Figure: map of total electric consumption in locations with population > 20,000, from original data + geolocation + population]
78. Benefits of reasoning
[Figure: total electric consumption in cultural buildings]
[Figure: class hierarchy with schema:CivicStructure, CulturalSite, Museum and Library]
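The benefit of reasoning can be sketched as follows: if Museum and Library are subclasses of CulturalSite, a query for cultural buildings also returns sites typed only as museums or libraries. The hierarchy below is an assumed reading of the figure (with CulturalSite under schema:CivicStructure), and the sites and consumption values are invented.

```python
# Assumed subclass hierarchy (child -> parent); "ex:" names are hypothetical.
SUBCLASS_OF = {
    "ex:Museum": "ex:CulturalSite",
    "ex:Library": "ex:CulturalSite",
    "ex:CulturalSite": "schema:CivicStructure",
}

# Invented sites: (asserted class, electricity consumption).
SITES = {
    "ex:site1": ("ex:Museum", 120.0),
    "ex:site2": ("ex:Library", 80.0),
    "ex:site3": ("ex:Depot", 300.0),
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def total_consumption(cls):
    """Sum consumption over all sites whose class is subsumed by cls."""
    return sum(value for (site_cls, value) in SITES.values() if is_a(site_cls, cls))


print(total_consumption("ex:CulturalSite"))
```

Without the subclass statements the same query would return nothing, because no site is typed directly as a cultural site.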
79. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
80. What are we going to do?
[Figure: Linked Data life cycle with the phases Specification, Modelling, Generation, Publication, Exploitation, Linking]
81. What are we going to do?
[Figure: the Linked Data generation process diagram of slide 24]
82. Hands-on task 1
• Goal: to get familiar with the first steps in the Linked Data generation process
• The students will have to take their selected dataset(s) and perform the following tasks:
– Analyse Data Set
• Both the data (quantities, value ranges, etc.) and the schema
– Analyse Licensing of the Data Source
• Who is the publisher and the rightsholder?
• What is the license?
• Which license will be used for the generated dataset?
– Define Resource Naming Strategy
• For the ontology and the data (URI form, content negotiation, URIs domain, path, patterns, etc.)
– Finish Ontology Development
• Lightweight ontology (i.e., classes, properties, domains and ranges)
83. Hands-on task 1 - Deliverables
• A document that includes:
– The analyses performed over the data source
– The licensing of the data source and the potential license
– The resource naming strategy defined
• An OWL file with the ontology developed, according to the resource naming strategy defined
84. Hands-on task 2
• Goal: to get familiar with the transformation of CSV data into RDF using LODRefine
• The students will have to take their selected dataset(s) and perform the following tasks:
– Import data into LODRefine
– Analyse and fix data
• Analysis performed in the previous class, but can be updated with new findings
• Fix the data to remove errors
• Transform the data to facilitate RDF generation
– Export data to RDF
• Define an RDF skeleton for the data
• Export the data to RDF (Turtle syntax)
85. Hands-on task 2 - Deliverables
For each dataset:
• An RDF file in the Turtle syntax with the data transformed into RDF
86. 1st Summer School on Smart Cities and Linked Open Data (LD4SC-15)
Thank you for your attention!