Talk about Exploring the Semantic Web, and particularly Linked Data, and the Rhizomer approach. Presented August 14th 2012 at the SRI AIC Seminar Series, Menlo Park, CA
Exploring the Semantic Web
1. Exploring Semantic Web Data
and particularly Linked Data
Roberto García
AIC Seminar Series
SRI International, Menlo Park, August 14th 2012
Human-Computer Interaction and Data Integration Research Group
Universitat de Lleida, Spain
2. Who
• Associate Professor, Universitat de Lleida, Spain
• Visiting Associate Professor, Stanford University
– Stanford HCI Group
• +12 years Semantic Web research
– 1999 MSc Thesis: Knowledge Management using
RDF plus reasoning (SiLRI)
– 2006 PhD Thesis: A Semantic Web approach to DRM
– 2006- Copyright Ontology
– 2007- Lleida HCI Group, Semantic Web User
Interfaces
3. What is Open Data?
“Open data is data that can be freely used, reused
and redistributed by anyone - subject only, at most,
to the requirement to attribute and sharealike”
Open Knowledge Foundation
• Make your data OPEN
– Available online with open license
• For instance Creative Commons CC-BY
– No more than reproduction cost
– No matter format
4. Open Data Worldwide
• 169 initiatives:
– City (40), Country, Region or State (125),
Supranational (4)
http://datos.fundacionctic.org/sandbox/catalog/faceted/
6. Open Data Formats
• However, encourage formats that facilitate
reuse and interoperability
– Tim Berners-Lee 5 stars classification
http://5stardata.info
7. ★ Open Data
• Make data available on the Web under an
open license
– Data licenses:
• Public Domain Dedication and License (PDDL), Open Data
Commons Attribution License (ODC-by) or Creative Commons
Public Domain Dedication (CC0)
• Whatever format
– Example: PDF
• But… data is locked-up in a document
– Hard to get data out, custom scrapers
http://5stardata.info
8. ★★ Open Data
• Make it available as structured data
– Example: Excel instead of image scan of a table
• But… data still locked-up
– You depend on proprietary software
http://5stardata.info
9. ★★★ Open Data
• Use non-proprietary formats
– Example: CSV instead of Excel
"Temperature forecast for Galway",
"Day","Lowest Temperature (C)"
"Saturday, 13 November 2010",2
"Sunday, 14 November 2010",4
"Monday, 15 November 2010",7
• But… data on the Web and not data in the Web
– What does “Galway” mean? Is it a temperature?
What is the unit? Local time?...
http://5stardata.info
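A minimal sketch of what a consumer sees when reading the three-star CSV above with Python's standard `csv` module. The data is inlined verbatim from the slide; the variable names are illustrative. Every cell arrives as an untyped string, so the questions on the slide (what "Galway" identifies, which unit, which time zone) cannot be answered from the file alone.

```python
import csv
import io

# The CSV from the slide, inlined so the sketch is self-contained.
RAW = '''"Temperature forecast for Galway",
"Day","Lowest Temperature (C)"
"Saturday, 13 November 2010",2
"Sunday, 14 November 2010",4
"Monday, 15 November 2010",7
'''

rows = list(csv.reader(io.StringIO(RAW)))
title = rows[0][0]   # free text; "Galway" is just a substring of it
header = rows[1]     # the unit "(C)" is buried in a column label
data = rows[2:]      # each row is [day_text, temperature_text]

for day, temp in data:
    # Both values are plain strings: no date type, no number type, no URI.
    print(repr(day), repr(temp))
```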
10. Galway (disambiguation)
• Places
– Ireland
• Galway
• County Galway
• Galway Bay
– Sri Lanka
• Galway's Land National Park
– United States
• Galway (town), New York
• Galway (village), New York
• Things
– Galway (sheep), a breed of sheep that originated in Galway, Ireland
– Galway harp, a type of harp
– Galway Hooker, a type of sailing boat
– Galway or Claddagh Ring, a type of wedding ring made in Galway
• …
11. ★★★★ Open Data
• Use URIs to identify things,
so that people can point at your stuff
– Example: RDF1 (but also Atom, OData, JSON-LD,…)
# Vocabularies / ontologies:
@prefix meteo: <http://purl.org/ns/meteo#> .
@prefix galweather: <http://5stardata.info/galweather#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://example.org/Galway> meteo:forecast
    [ meteo:predicted "2010-11-13T12:00:00Z"^^xsd:dateTime ;
      meteo:temperature [ meteo:celsius "2"^^xsd:decimal ] ] .
• But… what if we (humans or computers) don’t
know what http://example.org/Galway means?
1 Resource Description Framework, http://www.w3.org/RDF/
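The prefixed names in the RDF example are shorthand for full URIs, which is what lets anyone "point at your stuff". A small sketch of that expansion, using two of the prefixes declared on the slide; the `expand` helper is a hypothetical name introduced here for illustration, not part of any RDF library.

```python
# Prefix table copied from the @prefix lines on the slide.
PREFIXES = {
    "meteo": "http://purl.org/ns/meteo#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'meteo:forecast' into a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


print(expand("meteo:forecast"))
print(expand("xsd:dateTime"))
```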
12. ★★★★★ Linked Open Data
• Link your data to other data to provide context
(semantics, meaning)
– Example: HTTP GET http://dbpedia.org/resource/Galway
@prefix dbpedia: <http://dbpedia.org/resource/> .
...
dbpedia:Galway a <http://dbpedia.org/ontology/Place>, <http://dbpedia.org/ontology/PopulatedPlace>,
<http://dbpedia.org/ontology/Settlement>;
rdfs:label "Galway"@en;
dbp:populationBlank "Galwegian, Tribesman"@en;
dbp:populationTotal "75529"^^xsd:int;
dbp:populationUrban "76778"^^xsd:int;
dcterms:subject <http://dbpedia.org/resource/Category:Cities_in_the_Republic_of_Ireland>,
<http://dbpedia.org/resource/Category:County_towns_in_the_Republic_of_Ireland>,
<http://dbpedia.org/resource/Category:Port_cities_and_towns_in_the_Republic_of_Ireland>,
<http://dbpedia.org/resource/Category:University_towns>;
rdfs:comment "Galway or City of Galway (Cathair na Gaillimhe) is a city on the west coast of
Ireland. It is located on the River Corrib between Lough Corrib and Galway Bay and is surrounded by
County Galway. It is the third largest city within the state, though if the wider urban area is
included then it falls into fourth place behind Limerick. The population of Galway city at the 2011
census was 75,529, rising to 76,778 across the entire urban area."@en;
geo:lat 53.2719;
geo:long -9.04889;
foaf:homepage <http://www.galwaycity.ie> .
… and also dbp:PopulationTotal, dct:subject,…
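The "HTTP GET" in the example above can be sketched with Python's standard `urllib`. The request asks for Turtle through content negotiation; whether the server answers with Turtle for this media type is an assumption about its configuration, and the `build_request` helper is a hypothetical name. The network call is left commented out so the sketch runs offline.

```python
import urllib.request


def build_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Prepare a GET request that asks for an RDF serialization of the resource."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


req = build_request("http://dbpedia.org/resource/Galway")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually dereference the URI (requires network access):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8")[:500])
```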
14. Fine for computers… but people?
C. Warren (blogger): “I’m writing about ‘Films I Like’. Can I reuse LinkedMDB?”
M. Harper (developer): “I’m developing a bird watching application. Can I reuse DBPedia?”
http://linkeddata.org
15. User Testing
• Typical user questions:
– Where do I start?
– Where do I go now?
– What is this data about?
– How do I find this?
– …
• What do Linked Data user interfaces offer?
16. DBPedia Scenario
• Linked Data version of Wikipedia
– 3.5 million things described
• Ontology: 257 classes and 1276 properties
18. Semantic Query Languages
• SPARQL:
– select ?c (count(?i) as ?n)
where {?i a ?c} group by ?c order by desc(?n)
c                                              n
http://www.w3.org/2002/07/owl#Thing            1668503
http://www.w3.org/2004/02/skos/core#Concept    632607
http://www.opengis.net/gml/_Feature            571764
http://dbpedia.org/ontology/Place              462349
http://dbpedia.org/ontology/Person             363751
http://dbpedia.org/ontology/Work               355100
http://dbpedia.org/ontology/PopulatedPlace     340443
http://xmlns.com/foaf/0.1/Person               296595
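What the class-count query computes can be sketched over an in-memory list of triples: count the `rdf:type` statements per class and sort by descending count. The toy triples and the `class_counts` name are assumptions made for illustration; the real figures above come from the DBPedia endpoint.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical toy graph as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:Galway", RDF_TYPE, "dbo:Place"),
    ("ex:Galway", RDF_TYPE, "dbo:PopulatedPlace"),
    ("ex:Lleida", RDF_TYPE, "dbo:Place"),
    ("ex:Lleida", RDF_TYPE, "dbo:PopulatedPlace"),
    ("ex:GalwayBay", RDF_TYPE, "dbo:Place"),
    ("ex:WoodyAllen", RDF_TYPE, "dbo:Person"),
    ("ex:Galway", "rdfs:label", "Galway"),  # not a type statement, ignored
]


def class_counts(triples):
    """Return (class, instance count) pairs, most populated class first."""
    unique = dict.fromkeys(triples)  # a graph is a set: drop duplicate triples
    counts = Counter(o for _, p, o in unique if p == RDF_TYPE)
    return counts.most_common()


for cls, n in class_counts(TRIPLES):
    print(cls, n)
```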
19. Text Search
• What to type? A URI? A URI label?
• How to take advantage of semantics?
21. Proposal
Ontologies and dataset structure → Automatic UI Generation
Interaction Patterns for Data Analysis [Shneiderman]    Information Architecture Components [Morville]
Overview                                                Menus, Sitemaps,…
Zoom & Filter                                           Facets
Details                                                 Lists, Maps, Timelines…
22. IA Components. Menus
– From dataset ontologies and thesauri
• For each class/topic
– URI, label, # instances/uses, subclasses/subtopics
– Flatten to desired # entries and subentries
• When there is room for more entries or subentries,
split the class/topic with the most instances
• When there are too many, group those with the fewest instances
– “Other” is the generic group
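One way to read the flattening rules above, as a rough sketch rather than Rhizomer's actual implementation: while the menu has room, replace the splittable entry with the most instances by its subclasses; if the menu ends up too long, merge the least populated entries into "Other". The `Node` type, the `flatten` function, and the class counts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    instances: int
    children: list = field(default_factory=list)


def flatten(roots, max_entries):
    """Flatten a class tree into at most max_entries menu entries."""
    entries = list(roots)
    # Room left: split the entry with the most instances that has subclasses.
    while len(entries) < max_entries:
        splittable = [e for e in entries if e.children]
        if not splittable:
            break
        biggest = max(splittable, key=lambda e: e.instances)
        i = entries.index(biggest)
        entries[i:i + 1] = biggest.children
    # Too many: keep the largest entries and group the rest under "Other".
    if len(entries) > max_entries:
        ranked = sorted(entries, key=lambda e: e.instances, reverse=True)
        kept, rest = ranked[:max_entries - 1], ranked[max_entries - 1:]
        entries = kept + [Node("Other", sum(e.instances for e in rest), rest)]
    return entries


# Hypothetical class tree with made-up instance counts.
ROOTS = [
    Node("Place", 460, [Node("PopulatedPlace", 340), Node("NaturalPlace", 80)]),
    Node("Person", 360),
    Node("Work", 350),
    Node("Species", 20),
]

print([e.label for e in flatten(ROOTS, 5)])
print([e.label for e in flatten(ROOTS, 3)])
```

Summing instance counts for "Other" may count an instance twice when it belongs to several grouped classes; the sketch ignores that.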
24. DEMO
http://rhizomik.net/dbpedia/
IA Components. Menus
Provide DBPedia overview…
…but what about 12,334 birds?
25. IA Components. Facets
• Pre-computed list of facets per class or topic
– Ontologies or thesauri + instance data
– Facet metrics:
• frequency, #values, most common value,
cardinality…
• DBPedia Birds class:
– 226 properties
• dbo:kingdom, 100%, 3 values,
6846 (Animalia),…
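The facet metrics listed above can be sketched for a single class as follows. For each property the function reports how many instances use it (frequency), how many distinct values it takes, and its most common value. The data layout, the `facet_metrics` name, and the three toy birds are assumptions for illustration; they are not the DBPedia figures quoted on the slide.

```python
from collections import Counter, defaultdict

# Hypothetical instances of one class: instance -> property -> list of values.
BIRDS = {
    "ex:RockDove": {"dbo:kingdom": ["Animalia"], "dbo:family": ["Columbidae"]},
    "ex:WoodPigeon": {"dbo:kingdom": ["Animalia"], "dbo:family": ["Columbidae"]},
    "ex:Robin": {"dbo:kingdom": ["Animalia"]},
}


def facet_metrics(instances):
    """Compute per-property facet metrics for the instances of a class."""
    total = len(instances)
    usage = Counter()              # property -> number of instances using it
    values = defaultdict(Counter)  # property -> value -> occurrences
    for props in instances.values():
        for prop, vals in props.items():
            if vals:
                usage[prop] += 1
                values[prop].update(vals)
    metrics = {}
    for prop in usage:
        top_value, top_count = values[prop].most_common(1)[0]
        metrics[prop] = {
            "frequency": usage[prop] / total,
            "distinct_values": len(values[prop]),
            "most_common": (top_value, top_count),
        }
    return metrics


for prop, m in facet_metrics(BIRDS).items():
    print(prop, m)
```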
28. Testing LinkedMDB
• Evaluation with lay users as part of RITE1
development process
– Iteration test with 6 users
– LinkedMDB (Linked Data version of IMDb)
User Task:
“Find three films where
Woody Allen is director and
also actor”.
1 Rapid Iterative Testing and Evaluation
30. Evaluation Results
• Seemed easy but…
no user completed task without help
• Really, just 1 issue:
– Users started from “Actor” instead of from
“Film”, and got lost from there
• User interaction is too constrained by
underlying “explicit” data structure
• Lack of context while browsing graph
31. New Features
• Facets for all inverse properties
(explicit or implicit)
– Actor ← actor – Film:
• Actor has facet “is actor of Film”
• Breadcrumbs show “query” built so far
– Click Film, then for facet “Actor”
search “Woody Allen”:
• “Showing Film has actor
where actor name is Woody Allen”
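A sketch of how inverse-property facets could be derived, under simplifying assumptions: each resource has a single class, and a property from a Film to an Actor yields the facet "is actor of Film" on the Actor class. The identifiers, the `inverse_facets` name, and the toy data are hypothetical.

```python
from collections import defaultdict

# Hypothetical resource -> class assignments (one class each, for simplicity).
TYPES = {
    "film:AnnieHall": "Film",
    "film:Manhattan": "Film",
    "actor:WoodyAllen": "Actor",
    "actor:DianeKeaton": "Actor",
    "director:WoodyAllen": "Director",
}

# Hypothetical (subject, property, object) statements, all pointing from Film.
STATEMENTS = [
    ("film:AnnieHall", "actor", "actor:WoodyAllen"),
    ("film:AnnieHall", "actor", "actor:DianeKeaton"),
    ("film:Manhattan", "actor", "actor:WoodyAllen"),
    ("film:AnnieHall", "director", "director:WoodyAllen"),
]


def inverse_facets(statements, types):
    """Map each class to the inverse facets implied by incoming properties."""
    facets = defaultdict(set)
    for subject, prop, obj in statements:
        if subject in types and obj in types:
            facets[types[obj]].add(f"is {prop} of {types[subject]}")
    return dict(facets)


print(inverse_facets(STATEMENTS, TYPES))
```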
32. New Features
• What about getting from Actors to Films to
restrict by director?
• Add Actor facet “directed by”?
– DANGER: facets explosion
• Director → Film → Country → Continent
Director facet:
“continents of countries where films directed”!
33. New Features
• Pivoting: switch from faceted view to
related faceted view (keeping filters)
– E.g.: from Actors facets move to Films facets
through “is Actor of Film” facet
• For each class facet also compute:
– Most specific class for target instances
• Actor “is Actor of” Film and TV Episode →
Audiovisual Work
– Pivot that facet to get:
• Faceted view for target class… + filters so far
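Pivoting can be sketched as set operations over triples: filter the Actors view, follow the inverse "actor" property to reach Films while keeping the filter, then apply a Film facet. This is an illustrative sketch on hypothetical toy data with hypothetical helper names, not Rhizomer's implementation.

```python
# Hypothetical toy data: (film, property, person) statements and display names.
TRIPLES = [
    ("film:AnnieHall", "actor", "actor:WoodyAllen"),
    ("film:AnnieHall", "actor", "actor:DianeKeaton"),
    ("film:AnnieHall", "director", "director:WoodyAllen"),
    ("film:Antz", "actor", "actor:WoodyAllen"),
    ("film:Antz", "director", "director:EricDarnell"),
    ("film:Reds", "actor", "actor:DianeKeaton"),
    ("film:Reds", "director", "director:WarrenBeatty"),
]
NAMES = {
    "actor:WoodyAllen": "Woody Allen",
    "actor:DianeKeaton": "Diane Keaton",
    "director:WoodyAllen": "Woody Allen",
    "director:EricDarnell": "Eric Darnell",
    "director:WarrenBeatty": "Warren Beatty",
}


def filter_by_name(selection, name):
    """Facet filter: keep the resources whose name matches."""
    return {r for r in selection if NAMES.get(r) == name}


def pivot_inverse(selection, prop):
    """Pivot: move to the subjects that point at the current selection."""
    return {s for s, p, o in TRIPLES if p == prop and o in selection}


def filter_by_facet(selection, prop, allowed):
    """Facet filter: keep resources with a prop value among the allowed ones."""
    return {s for s, p, o in TRIPLES if s in selection and p == prop and o in allowed}


actors = {o for _, p, o in TRIPLES if p == "actor"}
directors = {o for _, p, o in TRIPLES if p == "director"}

woody_as_actor = filter_by_name(actors, "Woody Allen")  # filter in Actors view
films = pivot_inverse(woody_as_actor, "actor")          # pivot via "is Actor of Film"
woody_as_director = filter_by_name(directors, "Woody Allen")
result = filter_by_facet(films, "director", woody_as_director)
print(sorted(result))
```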
39. Next Round Evaluation
• Semantic Web Exploration Tools
Quality in Use Model:
– Task success, Task time, Satisfaction,…
– UI Component Efficiency, Task Flexibility, Layout
Flexibility,…
• Task: “Films Woody Allen director and actor”
– Task time:
            Pre-pivot    Pivot    Reduction
Minimum     1.05         0.89     15%
Maximum     5.23         2.23     57%
Mean        2.41         1.69     30%
St. Dev.    1.49         0.57     62%
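The "Reduction" column follows from the other two as (pre-pivot − pivot) / pre-pivot. A short check of that arithmetic, using the figures from the table; the slide does not state the time unit, so none is assumed here.

```python
# (pre-pivot, pivot) task times copied from the table above.
TIMES = {
    "Minimum": (1.05, 0.89),
    "Maximum": (5.23, 2.23),
    "Mean": (2.41, 1.69),
    "St. Dev.": (1.49, 0.57),
}


def reduction_percent(pre: float, post: float) -> int:
    """Relative reduction from pre to post, rounded to a whole percent."""
    return round(100 * (pre - post) / pre)


for label, (pre, post) in TIMES.items():
    print(f"{label}: {reduction_percent(pre, post)}%")
```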
40. Summary
• Menus
– Dataset classes (topics) overview
• Facets
– Filter class using properties and values
• Pivoting
– Switch faceted views, carrying filters
41. DEMO
http://rhizomik.net/linkedmdb/
Conclusions
• Users build queries without SPARQL or
dataset structure knowledge
• Example:
– Who has directed more films in Oceania?
SELECT DISTINCT ?r1 WHERE {
  ?r1 a movie:Director .
  ?r2 movie:director ?r1 .
  ?r2 a movie:Film .
  ?r2 movie:country ?r3 .
  ?r3 movie:country_continent ?r3var0
  FILTER(str(?r3var0)="Oceania") }
42. Work in Progress
• Interaction design
– Explore the best way to make pivoting, and
unpivoting, evident for users
– Improve “breadcrumbs”
• Specialized facets:
– Range dependent: histogram for numbers,
calendar for dates,…
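As a rough illustration of a range-dependent facet for numeric values, the sketch below buckets values into equal-width bins that a histogram widget could render. The `histogram` function and the sample years are hypothetical; how the planned facets bin their values is not stated on the slide.

```python
def histogram(values, bins):
    """Bucket numeric facet values into equal-width bins.

    Returns a list of (bin_start, bin_end, count) tuples.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1  # all values equal: use a unit-width bin
    counts = [0] * bins
    for v in values:
        i = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[i] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


# Hypothetical release years used as facet values.
YEARS = [1990, 1995, 2000, 2005, 2010]
for start, end, count in histogram(YEARS, 2):
    print(f"{start:.0f}-{end:.0f}: {count}")
```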
49. Thanks for your attention
Roberto García
http://rhizomik.net/~roberto
roberto.garcia@udl.cat
Human-Computer Interaction and Data Integration Research Group
Universitat de Lleida, Spain
Editor's notes
Faceted view for Species > Bird. Looking for pigeons (Columbidae) in “Mediterranean Countries”… Filter on direct properties like Familia = Columbidae.