Vanessa Lopez: Linked Data and Search
- 1. IBM Research – Ireland
Linked Data and Search
Vanessa Lopez
Smarter Cities Technology Centre
IBM Research Ireland
© 2012 IBM Corporation
- 2. IBM Research – Ireland
Background: Why Linked Data
• Provides explicit semantics
• Extensible
• Interoperability-focused: to enable automatic discovery and ingestion
• Large existing corpora
• Fundamentally incremental (like the Web)
• W3C standard representation and common format
• Government push (e.g. data.gov, data.gov.uk, Linked Government Data)
- 3. IBM Research – Ireland
Yes, yes... Richer structured queries, but...
... Limited usability for both data publishers and consumers
- 4. IBM Research – Ireland
How can we help users in querying and exploring Semantic Web content?
- 5. IBM Research – Ireland
State of the art
• Semantic search over messy, heterogeneous data and mash-ups
• Exploratory and faceted systems
• Query builders and relationship finders
• Question answering over Linked Data sources
• Google Knowledge Graph
http://technologies.kmi.open.ac.uk/poweraqua
- 7. IBM Research – Ireland
Linked Data and Search - Problem domain:
What makes city data so special?
How can we make it more accessible?
- 8. IBM Research – Ireland
Semantic processing of urban data: why is it different?
• How can we go from raw data to insight into the operation of a city with minimal effort?
Return-on-Investment (because data integration is expensive)
Fit-for-all (citizen engagement)
- 9. IBM Research – Ireland
Challenges: Big city data
Volume
• Lots of relevant information
• Not linked to authoritative sources
Velocity
• Streams
• Frequent updates
Variety
• Different models and file formats
• Open domain - unknown schema
Veracity
• Diverse sources
• Difficult to assess quality
- 10. IBM Research – Ireland
Business case: open data as a means to an end
- 11. IBM Research – Ireland
Business case
• Why are ambulances late?
Sources of information
• 100s of datasets from four municipal authorities in Dublin
• Most static, some dynamic
• Social media: Twitter, LiveDrive, Eventful, Eventbrite, …
• Linked Data: DBpedia, ...
• Vocabularies: IPSV, FOAF, VOID, PROV, DCAT, WSG
Domain of information
• Locations of health services
• Ambulance call-outs and response times
• Tweets about traffic congestion
• Geo-located tweets about people movement
• Road network
• Event Web services
• …
- 12. IBM Research – Ireland
Issues
• Linked Data to enrich data and give contextual insight for publishers and consumers:
– Publish (vocabularies, annotation)
– Discovery and Search (metadata / cataloguing, full-text indexing, semantic entities)
– Link (schema alignment, linked data, social media)
– Extract interesting views
– Reason (diagnose traffic problems)
Ubiquitous aspects: Provenance, Governance, Performance, Security, Privacy
- 13. IBM Research – Ireland
Approach: Data model
Pipeline stages: Documents + Metadata, Structure, Entities, Links, Views, Insight
Tabular Graph:
C1 a Cell
C1 inRow r1
C1 value "name"
…
Entity Graph:
e1 a Entity
e1 inRow r1
e1 inCol c2
…
Annotation Graph:
e1 a Entity
e1 rdfs:label "name"
e1 addr "X st"
e1 lat "53.23"
…
Mapping Graph:
e1 a Entity
e1 sameAs e2
…
Pay-as-you-go, Gain-as-you-go
• Structured metadata -> Queries over the metadata
• Files into a standard representation -> Queries over the data
• Partially integrate schemata -> Queries across datasets
• Integrate globally -> Queries across Web data
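The tabular and entity graphs on this slide can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the SPUD or QuerioCity implementation: the function names `table_to_triples` and `entity_triples`, the identifier scheme (`cell_r1_c1`, `e_r2_c1`), the treatment of row `r1` as a header, and the sample CSV are all hypothetical. Only the predicates `a`, `inRow`, `inCol`, `value` and `rdfs:label` come from the slide.

```python
import csv
import io


def table_to_triples(csv_text):
    """Lift every cell of a CSV document into (subject, predicate, object) triples.

    Each cell is typed as a Cell and records its row, column and literal value,
    in the spirit of the slide's Tabular Graph.
    """
    triples = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        for col_no, value in enumerate(row, start=1):
            cell = f"cell_r{row_no}_c{col_no}"  # hypothetical naming scheme
            triples.append((cell, "a", "Cell"))
            triples.append((cell, "inRow", f"r{row_no}"))
            triples.append((cell, "inCol", f"c{col_no}"))
            triples.append((cell, "value", value))
    return triples


def entity_triples(cell_triples, entity_col, header_row="r1"):
    """Promote the cells of one column to entities (an Entity Graph sketch).

    The header row is skipped; treating r1 as the header is an assumption.
    """
    # Group the cell triples by subject so each cell's properties are together.
    cells = {}
    for subj, pred, obj in cell_triples:
        cells.setdefault(subj, {})[pred] = obj

    out = []
    for cell, props in cells.items():
        if props.get("inCol") != entity_col or props.get("inRow") == header_row:
            continue
        entity = "e_" + cell[len("cell_"):]
        out.append((entity, "a", "Entity"))
        out.append((entity, "inRow", props["inRow"]))
        out.append((entity, "inCol", props["inCol"]))
        out.append((entity, "rdfs:label", props["value"]))
    return out


if __name__ == "__main__":
    sample = "name,addr\nClinic A,X st\n"  # made-up sample rows
    for triple in entity_triples(table_to_triples(sample), "c1"):
        print(triple)
```

Keeping the cell-level graph separate from the entity graph matches the pay-as-you-go idea: the file can be queried as soon as it is lifted, and entities are added later only for the columns that justify the effort.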
- 14. IBM Research – Ireland
Discovery: Publishing and Cataloguing
• METADATA
– Many data publishers and disconnected datasets
– Link metadata using domain vocabularies: IPSV
– Convert to simple RDF format
Figure: Vocabulary matching (IPSV)
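As a rough illustration of "link metadata using domain vocabularies" and "convert to simple RDF format", the sketch below turns one catalogue record into N-Triples lines. The vocabulary excerpt, the `example.org` URIs and the function names are hypothetical placeholders, not real IPSV identifiers or the authors' code. The matching is a naive case-insensitive label lookup standing in for whatever vocabulary matching the system actually uses. The `dcterms:title`, `dcterms:subject` and `dcat:keyword` properties are real.

```python
DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"

# Hypothetical excerpt of a subject vocabulary: lowercase label -> term URI.
# The example.org URIs are placeholders, not real IPSV identifiers.
VOCAB = {
    "beaches": "http://example.org/ipsv/beaches",
    "road traffic": "http://example.org/ipsv/road-traffic",
}


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def metadata_to_ntriples(dataset_uri, title, keywords, vocab=VOCAB):
    """Serialize one catalogue record as a list of N-Triples lines.

    Keywords whose label matches a vocabulary term become dcterms:subject
    links; unmatched keywords are kept as plain dcat:keyword literals.
    """
    lines = [f'<{dataset_uri}> <{DCT}title> "{escape_literal(title)}" .']
    for keyword in keywords:
        term = vocab.get(keyword.strip().lower())
        if term is not None:
            lines.append(f"<{dataset_uri}> <{DCT}subject> <{term}> .")
        else:
            lines.append(
                f'<{dataset_uri}> <{DCAT}keyword> "{escape_literal(keyword)}" .'
            )
    return lines


if __name__ == "__main__":
    record = metadata_to_ntriples(
        "http://example.org/dataset/1",  # hypothetical dataset URI
        "Beaches in Fingal",
        ["Beaches", "tides"],
    )
    print("\n".join(record))
```

Once two publishers' records point at the same vocabulary term, their otherwise disconnected datasets can be found through a single subject query.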
- 16. IBM Research – Ireland
Search and linking
• Full text indexing for search over metadata and content
• Entity linking and navigation (keywords, categories, publishing agencies, regions, ...)
• Open metadata and vocabularies (VOID, PROV, etc.) for data discovery and linking
• Mining descriptions (DBpedia Spotlight)
Figure panels: Open metadata, Full text indexing, Entity linking, Mining descriptions
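Full-text indexing over dataset metadata can be sketched as a plain inverted index. This is a toy under stated assumptions, not the search stack used in the project: the function names, the tokenizer, the AND-only query semantics and the sample descriptions are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: token -> set of dataset ids containing it.

    `docs` maps a dataset id to its searchable text (title, description, ...).
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of datasets that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    # Made-up catalogue descriptions, for illustration only.
    catalogue = {
        "d1": "Beaches in Fingal: bathing water quality",
        "d2": "Fingal road traffic counts",
        "d3": "Dublin City beaches",
    }
    print(search(build_index(catalogue), "beaches fingal"))
```

A production index would add stemming, stop words and ranking. The sketch only shows why keyword search works over metadata and content even before any schema integration.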
- 17. IBM Research – Ireland
Faceted search: “beaches in Fingal”
- 19. IBM Research – Ireland
Content integration
• Incrementally lift data content (beyond search to querying across dataset content)
– Extract entities represented in RDF (PAYGO)
– Label extraction and annotation
– Link when we have higher confidence (lat, long)
– Geo-coding and taxonomy of tweets (traffic)
Figure panels: Geocoding, Label extraction, Minimal entry cost, Provenance-based dataset ranking
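"Link when we have higher confidence (lat, long)" can be read as: only assert a link between two entities when their coordinates nearly coincide. The sketch below is one possible reading, not the authors' method. It uses the standard haversine great-circle distance, and the 50 m threshold, the function names and the sample coordinates are assumptions. Emitting `sameAs` on proximity alone is deliberately crude; a real system would likely combine it with label similarity.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def link_by_location(entities_a, entities_b, max_dist_m=50.0):
    """Emit (a, "sameAs", b) for entity pairs closer than max_dist_m.

    Each argument maps an entity id to a (lat, lon) pair in degrees.
    The default threshold of 50 m is an arbitrary, hypothetical choice.
    """
    links = []
    for id_a, (lat_a, lon_a) in entities_a.items():
        for id_b, (lat_b, lon_b) in entities_b.items():
            if haversine_m(lat_a, lon_a, lat_b, lon_b) <= max_dist_m:
                links.append((id_a, "sameAs", id_b))
    return links


if __name__ == "__main__":
    # Made-up coordinates near Dublin, for illustration only.
    dataset_a = {"e1": (53.3498, -6.2603)}
    dataset_b = {"e2": (53.3499, -6.2603), "e3": (53.4000, -6.2603)}
    print(link_by_location(dataset_a, dataset_b))
```

The pairwise loop is quadratic in the number of entities, which is fine for a sketch. A spatial index or grid bucketing would be needed at city scale.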
- 20. IBM Research – Ireland
Views
• Beyond search to guiding the user to create meaningful views:
– Guide the users to annotate data, recommend related datasets and create data views on the fly
– Ranking and context-based recommendations
– Allow semantic-based analysis on multiple views
Callouts: Hidden information discovery, Cross-domain queries, Multiple endpoints, Multiple interpretations
- 21. IBM Research – Ireland
Demo
• Currently: Web services and technology demonstrator
• Next: Open RDF-based data management deployed in Dublin City (read/write). Deployment of traffic diagnoser.
• SPUD: Semantic Processing of Urban Data (2nd prize at the Semantic Web Challenge, ISWC)
• Live demo: www.dublinked.ie/sandbox/SemanticWebChall
Spyros Kotoulas, Vanessa Lopez, Raymond Lloyd, Marco Luca Sbodio, Freddy Lecue, Martin Stephenson, Elizabeth Daly, Veli Bicer, Aris Gkoulalas-Divanis, Giusy Di Lorenzo, Anika Schumann, Denis Patterson, and Pol Mac Aonghusa
- 22. IBM Research – Ireland
Thank you!
Reference Publication:
• QuerioCity: A Linked Data Platform for Urban Information Management. V. Lopez, S. Kotoulas, M. L. Sbodio, M. Stephenson, A. Gkoulalas-Divanis, P. Mac Aonghusa. In Use track at the 11th International Semantic Web Conference (ISWC).
City Fabric Team: