Interlinking Personal Semantic Data on the Semantic Desktop and the Web of Data
1. Digital Enterprise Research Institute deri.ie
Interlinking Personal Semantic Data
on the Desktop and the Web
Laura Drǎgan
2. Outline
// Introduction
Background and motivation
Research questions
// Directions and results
Within the Semantic Desktop
To the Web of Data
A use case to rule them all
// Conclusion
Research answers
Future work
11. Motivation
Use the framework provided by
the Semantic Desktop to build useful
applications and services
12. Research questions
Q1. How to build semantic applications and tools for
the Semantic Desktop to provide the best
experience for the users, while creating reusable
semantic data?
15. Research questions
Q1. How to build semantic applications and tools for the
Semantic Desktop?
Q2. How to expand the scope of the Semantic Desktop
into the realm of the Web of Data, to benefit the
users and enhance their experience?
17. Research questions
Q1. How to build semantic applications and tools for the
Semantic Desktop?
Q2. How to expand the scope of the Semantic Desktop
into the Web of Data?
18. Q1 sub-questions
semantic applications for the Semantic Desktop
Q1.1. How to create semantic data that is complete, correct,
safe, and provides a high degree of interlinking with the
already existing network of semantic data on the desktop?
Q1.2. How to reuse existing Semantic Desktop data in an
application?
Q1.3. How to design the human-computer interaction in an
application for the Semantic Desktop?
Q1.4. How to correctly evaluate a semantic application?
20. Q2 sub-questions
connect the Semantic Desktop with the Web of Data
Q2.1. How to find Web instances representing the same real-world
thing described by a Semantic Desktop resource?
Q2.2. How to use the Web information which is related to a
desktop resource?
Q2.3. How to make desktop data available online safely?
28. Within the Semantic Desktop
29. SemNotes
Challenges described by Q1
create new semantic data
– Data representation
– Data management
reuse existing Semantic Desktop data
– Interlinking
design the human-computer interaction
– Visualisation
correctly evaluate a semantic application
– Task-based comparison to Evernote
36. Data representation
<nepomuk:/a_note>
    a pimo:Note ;
    nao:prefLabel "holiday plans" ;
    nao:created "2010-09-16T21:08:54.29Z"^^xsd:dateTime ;
    nao:lastModified "2010-09-17T10:59:01.58Z"^^xsd:dateTime ;
    nao:numericRating "9"^^xsd:int ;
    nao:description "<html>...</html>"^^xsd:string ;
    nao:hasTag <nepomuk:/res/travel> ;
    pimo:isRelated <nepomuk:/res/Rome> ,
                   <nepomuk:/res/Jane> .
37. Interlinking
Annotation suggestions:
Based on the content of the note.
Certain types preferred.
Preference based on past use and matched length.
"... brian ..."
– Brian Davis
– Brian Wall
"... brian davis ..."
– Brian Davis
38. Interlinking algorithm
Algorithm
scan text; identify possible entities
for each possible entity find a list of desktop resource candidates
– compute score for each possible candidate
– filter list by score
– sort by score
present the candidates to the user
create the relation only if the user chooses a resource
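The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the SemNotes implementation: the `Resource` fields, the preferred type set, the score weights, the `MIN_SCORE` threshold, and the whitespace tokenisation are all hypothetical. The slides state only that certain types are preferred and that preference depends on past use and matched length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    uri: str
    label: str
    rdf_type: str
    use_count: int = 0  # hypothetical "past use" counter


PREFERRED_TYPES = {"pimo:Person"}  # assumption: which types are preferred
MIN_SCORE = 0.3                    # assumption: filter threshold


def _contains(label_words, span):
    """True if span occurs as a contiguous run of words inside label_words."""
    n = len(span)
    return any(label_words[i:i + n] == span
               for i in range(len(label_words) - n + 1))


def score(span, resource):
    """Illustrative score: matched length, plus bonuses for type and past use."""
    label_words = resource.label.lower().split()
    if not _contains(label_words, span):
        return 0.0
    s = len(span) / len(label_words)        # matched length
    if resource.rdf_type in PREFERRED_TYPES:
        s += 0.2                            # preferred type
    s += min(resource.use_count, 10) / 100  # past use, capped
    return s


def suggest(text, resources, max_words=3):
    """Scan the text; return [(mention, candidates sorted by score)].

    Longer mentions are tried first, so "brian davis" wins over "brian".
    """
    words = text.lower().split()  # naive tokenisation, no punctuation handling
    suggestions, i = [], 0
    while i < len(words):
        for n in range(min(max_words, len(words) - i), 0, -1):
            span = words[i:i + n]
            scored = [(score(span, r), r) for r in resources]
            kept = [(s, r) for s, r in scored if s >= MIN_SCORE]  # filter
            if kept:
                kept.sort(key=lambda pair: pair[0], reverse=True)  # sort
                suggestions.append((" ".join(span), [r for _, r in kept]))
                i += n
                break
        else:
            i += 1  # no candidate starts at this word
    return suggestions


def link(note_uri, chosen):
    """Create the relation only if the user chose a resource (else None)."""
    return (note_uri, "pimo:isRelated", chosen.uri) if chosen else None
```

The user stays in the loop: `suggest` only proposes candidates, and `link` produces a `pimo:isRelated` triple solely for a resource the user picked.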
43. Evaluation
The effort of interlinking is lower than the effort spent when searching.
Task-based experiment
Comparison of SemNotes to Evernote
44. Evaluation
Experimental setup
20 participants
– 14 take notes regularly
– 5 use Evernote in their daily activity
Familiar data
– 130 contacts
– 20 scientific papers
– 50 notes
8 tasks
– 2 tasks to familiarise the participants with the dataset
– 6 tasks focused on note-taking, varying the complexity
Measurements
– Time spent
– Mouse clicks
– Keystrokes
45. Evaluation
Tasks
T1. Find notes tagged with “todo”
T2. Find to-dos that are related to DERI
T3. Find a to-do related to a presentation given by John
T4. Take a note about planning a social event for your group
T5. Find a note containing minutes from the last meeting about the NICE project. Change the date of the next planned meeting
T6. Take a note for the action item assigned to you at the last meeting
47. Evaluation
Quantitative results
Time spent note-taking
– no significant differences
Time spent searching
– SemNotes significantly faster for complex queries
– no significant difference for simple queries
Questionnaire results [charts: Faster, Better]
48. Evaluation
Quantitative results
Task   Time: Avg   Time: Med   Time: t   Clicks: Avg   Clicks: Med   Clicks: t
T1     0.5         0           0.152     0.167         0             0.692
T2     -8          -8          -2.94     -0.333        -1            -0.48
T3     -0.125      1           -0.046    0.857         1             1.426
T4     0.063       0.016       0.486     6.067         8             2.026
T5     14.357      13          1.713     4.812         2             1.527
T6     0.249       0.243       1.004     20.8          12            3.08
49. But ...
The desktop is no longer the sole repository of personal information
Social networks
Mobile devices
Cloud services
50. To the Web of Data
Challenges described by Q2 (Q2.1.)
find Web aliases of Semantic Desktop resources
51. Finding Web Aliases
Web alias = a Web resource representing the same real-world entity as the desktop resource
60. Two-step approach
1. Candidate Selection
Query various Web of Data sources
Identify candidate URIs
Retrieve data for each candidate
2. Candidate Filtering
Compute similarity score.
Filter the candidates.
61. Candidate Selection
Determined set of sources
– Specific requirements
– Restricted domain
Semantic search engine
– Generic domain
– Unknown data sources
63. Candidate Filtering
(local, web)
1. Filter by type
2. Compute similarity score
3. Filter by score
return score
64. Matching Module
(local, web)
Type matching?
– No: return 0
– Yes: compute score
score ≥ threshold?
– No: the candidate is filtered out
– Yes: return score
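The flow of the matching module can be sketched as a small Python function. This is an illustration under assumptions, not the thesis code: the `COMPATIBLE_TYPES` mapping, the dictionary layout of a resource, the default threshold, and the choice to return 0 for candidates below the threshold are all hypothetical. The scoring function is passed in as a parameter so that the flow stays independent of how similarity is computed.

```python
# Hypothetical mapping from desktop types to acceptable Web types.
COMPATIBLE_TYPES = {
    "nco:PersonContact": {"foaf:Person"},
    "nmo:MusicAlbum": {"mo:Record"},
}


def types_match(local, web):
    """True if any Web type is compatible with any type of the local resource."""
    allowed = set()
    for t in local["types"]:
        allowed |= COMPATIBLE_TYPES.get(t, set())
    return bool(allowed & set(web["types"]))


def match(local, web, compute_score, threshold=0.5):
    """Return the score of a (local, web) pair, or 0.0 if the pair is rejected."""
    if not types_match(local, web):          # 1. filter by type
        return 0.0
    s = compute_score(local, web)            # 2. compute similarity score
    return s if s >= threshold else 0.0      # 3. filter by score (assumed: 0 = rejected)


def filter_candidates(local, candidates, compute_score, threshold=0.5):
    """Keep the candidates that pass both filters, best score first."""
    scored = [(match(local, c, compute_score, threshold), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s > 0]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept
```

Checking the type first avoids computing a similarity score for pairs that cannot be aliases at all, which matters when each desktop resource has many Web candidates.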
65. Matching Parameters
String matching (SM)
– Exact matching versus approximate string matching
– e.g. Koeln vs. Köln
Weighted properties (WP)
– Weighted participation of properties in the final score
– e.g. an email address identifies a person more precisely than a name
Multi-valued properties (MVP)
– All matching values for a property contribute to the score
– e.g. authors' names for a paper
66. Score Calculation
Driven by the local data
\[ \text{score} = \frac{\text{weighted sum of matching props}}{\text{total sum of all weighted props}} \]
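A minimal Python sketch of this score, with the three matching parameters (SM, WP, MVP) as switches. The assumptions are loud ones: resources are modelled as dictionaries from property name to a list of string values, `difflib.SequenceMatcher` stands in for whichever approximate string metric was actually used, and the `min_ratio` cut-off and the example weights are hypothetical.

```python
from difflib import SequenceMatcher


def similar(a, b, approximate=True, min_ratio=0.8):
    """SM: exact comparison, or approximate comparison when enabled."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True
    # SequenceMatcher is a stand-in metric; min_ratio is an assumed cut-off.
    return approximate and SequenceMatcher(None, a, b).ratio() >= min_ratio


def similarity_score(local, web, weights=None, approximate=True, multi_valued=True):
    """Weighted sum of matching properties over the total sum of weighted properties."""
    weights = weights or {}                      # WP: unlisted properties weigh 1.0
    matched = total = 0.0
    for prop, local_values in local.items():     # driven by the local data
        w = weights.get(prop, 1.0)
        web_values = web.get(prop, [])
        if not multi_valued:                     # MVP off: one value per property
            local_values, web_values = local_values[:1], web_values[:1]
        for value in local_values:
            total += w
            if any(similar(value, x, approximate) for x in web_values):
                matched += w
    return matched / total if total else 0.0
```

With `approximate=False`, equal weights and `multi_valued=False`, the function reduces to an exact-match comparison in which all properties count equally and a single value is considered per property.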
67. Evaluation
Manually constructed gold standard
Data collection
Relevance judgements
IR measures
Effect of parameter settings
Adjust thresholds
68. Data collection
Desktop data
50 people – nco:PersonContact
50 music albums – nmo:MusicAlbum
50 publications – nfo:PaginatedTextDocument
11,917 triples
Web data
20 candidates for each desktop resource -> 3000 URIs
1,530,686 triples
70. Relevance Judgements
3000 pairs × 3 experts
Fleiss' κ = 0.638 ± 0.214
Average pairwise agreement 92.252%
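For reference, Fleiss' κ for a set of judgements can be computed as in the sketch below. It shows the standard formula only; the input layout (one row per pair, one column per relevance category, each cell the number of experts choosing that category) is an assumption, and the slides do not say how the reported figures were derived.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i][j] = number of raters placing item i in category j.

    Assumes every item was judged by the same number of raters (at least 2)
    and that more than one category is actually used (otherwise 1 - P_e is 0).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all judgements that fall into each category.
    p = [sum(row[j] for row in counts) / (n_items * n_raters)
         for j in range(n_categories)]

    # Per-item agreement: fraction of rater pairs that agree on the item.
    per_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                for row in counts]

    observed = sum(per_item) / n_items   # mean observed agreement
    expected = sum(x * x for x in p)     # agreement expected by chance
    return (observed - expected) / (1 - expected)
```

For 3 experts and binary relevance, a row `[3, 0]` means all three judged the pair the same way, and `[2, 1]` means one expert disagreed.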
71. IR Measures
MAP
NDCG
P@k (k=1,2,3,4,5)
Baseline:
exact match
all properties count equally
single value considered for each property
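The measures listed above can be sketched in Python as follows. This uses the textbook definitions with binary relevance; two simplifications are assumptions of the sketch rather than details from the slides: average precision is normalised by the number of relevant items present in the ranked list, and the ideal ordering for NDCG is derived from that same list.

```python
import math


def precision_at_k(rels, k):
    """P@k for rels, a list of 0/1 relevance judgements in ranked order."""
    return sum(rels[:k]) / k


def average_precision(rels):
    """Mean of the precision values at each rank holding a relevant item."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """MAP: average precision averaged over all queries (desktop resources)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def dcg(rels):
    """Discounted cumulative gain with a log2 rank discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))


def ndcg(rels, k=None):
    """DCG divided by the DCG of the ideal (best-first) ordering of the same list."""
    rels = rels[:k] if k else rels
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0
```

Here each ranking would be the list of relevance judgements for the Web candidates of one desktop resource, ordered by the similarity score.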
72. Evaluation Results
Approximate string matching
– improves results for albums and people
– does not help for publications
Weights and multiple values
– when combined, improve results for publications, but not for the other types
73. Merging the two directions
74. A use case
Note → Blog post
[Semantic] note-taking → [Semantic] blogging
[Preserve context]
[Preserve privacy]
76. Steps
(Note-taking & annotation)
(Entity matching)
Transformation
On the local side
Extension to SemNotes
Publication
On the server side
According to Linked Data principles
78. Ontology level
Local – Nepomuk ontologies
Remote – SIOC, FOAF, DC, ...
Classes
pimo:Note → sioc:Post
nao:Tag → sioct:Tag
pimo:Person → foaf:Person
pimo:Project → doap:Project
pimo:Event → ical:Vevent
Properties
nao:prefLabel → rdfs:label
nao:created → dcterms:created
nao:lastModified → dcterms:modified
nao:hasTag → sioc:topic
pimo:isRelated → sioc:related_to
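The ontology-level mapping can be applied mechanically to a note's triples, as in this Python sketch. The two dictionaries copy the mapping from the slide; representing triples as tuples of prefixed names and dropping every triple without a mapping are assumptions of the sketch (the latter chosen here because it errs on the side of privacy).

```python
# Class and property mappings as listed on the slide.
CLASS_MAP = {
    "pimo:Note": "sioc:Post",
    "nao:Tag": "sioct:Tag",
    "pimo:Person": "foaf:Person",
    "pimo:Project": "doap:Project",
    "pimo:Event": "ical:Vevent",
}
PROPERTY_MAP = {
    "nao:prefLabel": "rdfs:label",
    "nao:created": "dcterms:created",
    "nao:lastModified": "dcterms:modified",
    "nao:hasTag": "sioc:topic",
    "pimo:isRelated": "sioc:related_to",
}


def translate(triples):
    """Rewrite (subject, predicate, object) triples from Nepomuk to Web vocabularies."""
    out = []
    for s, p, o in triples:
        if p == "rdf:type":
            if o in CLASS_MAP:
                out.append((s, p, CLASS_MAP[o]))
        elif p in PROPERTY_MAP:
            out.append((s, PROPERTY_MAP[p], o))
        # Assumption: triples with no mapping are not published.
    return out
```

Applied to the example note, the `nao:prefLabel` triple becomes an `rdfs:label` triple, while `nao:numericRating`, which has no entry in the mapping, is left out.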
79. Data level
Local – notes, desktop resources (tags included)
Remote – blog posts, Web resources, tags
http://semnotes.deri.ie/notes/note/id
http://semnotes.deri.ie/notes/resource/id
http://semnotes.deri.ie/notes/tag/label
80. Application level - local
Plugin for SemNotes
Ask server for server URLs for the new note and resources
Replace desktop URIs with the server URLs in the note
Add RDFa to the note
Push the transformed note to the server
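The URI replacement step might look like the Python sketch below. It is an illustration only: `server_url` is a local, hypothetical stand-in for the request to the server, which builds URLs after the note/resource/tag pattern of the data-level slide, and the identifiers in the usage example are made up. Adding RDFa and pushing the note to the server are not covered.

```python
from urllib.parse import quote

BASE = "http://semnotes.deri.ie/notes"  # URL pattern from the data-level slide


def server_url(kind, ident):
    """Hypothetical stand-in for asking the server for a URL."""
    if kind not in {"note", "resource", "tag"}:
        raise ValueError(f"unknown kind: {kind}")
    return f"{BASE}/{kind}/{quote(str(ident), safe='')}"


def replace_uris(note_html, uri_map):
    """Replace each desktop URI in the note with its server URL.

    Longer URIs are replaced first so that a URI which is a prefix of
    another one cannot corrupt the longer match.
    """
    for desktop_uri in sorted(uri_map, key=len, reverse=True):
        note_html = note_html.replace(desktop_uri, uri_map[desktop_uri])
    return note_html


# Usage with made-up identifiers:
uri_map = {
    "nepomuk:/res/Rome": server_url("resource", 42),
    "nepomuk:/res/travel": server_url("tag", "travel"),
}
html = '<a href="nepomuk:/res/Rome">Rome</a> <a href="nepomuk:/res/travel">travel</a>'
published = replace_uris(html, uri_map)
```

After the replacement no `nepomuk:` URI remains in the note, so the published post does not expose local identifiers.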
81. Application level - remote
Web server with MySQL, PHP, ARC2
Create new URLs for resources
Receive and process the note
Publish the data online
85. Research answers
Q1. How to build semantic applications and tools for the
Semantic Desktop?
SemNotes
– Create new data
– Reuse existing data
– HCI
– Evaluation
Q2. How to expand the scope of the Semantic Desktop
into the Web of Data?
Web aliases
Semantic blogging use case
86. Future work
Information Extraction algorithms and methods
– create multiple types of relations based on the text
– extract new entities from text
– extract links between entities mentioned in the notes
Explore visualisations
– personal data browser
Large scale user study of semantic personal information usage and behaviours
And then, of course, the two directions can be, should be, and are combined.
But the Semantic Desktop, however efficient it may become with semantic tools and interconnected data, is no longer the only repository of personal data, nor, some would say, even the main one.