The document discusses Steffen Staab's presentation on "The Web We Want" at the WebSci '17 conference. It covers several topics related to making the web more inclusive, healthy, and useful. For social inclusion, it describes the MAMEM project which aims to measure how accessible the web is for people with disabilities. For a healthy web, it discusses using techniques from social network analysis to identify harmful roles and behaviors. For a useful semantic web, it presents principles for interlinking data sets in ways that meaningfully extend entity descriptions and connectivity. The overall goal is to engineer and measure how well the web achieves important values like inclusion, health, and usefulness.
The Web We Want
1. Steffen Staab The Web We Want @ WebSci ‘17 1
Institute for Web Science and Technologies · University of Koblenz-Landau, Germany &
Web and Internet Science Group · ECS · University of Southampton, UK
The Web We Want
Steffen Staab
@ststaab
http://west.uni-koblenz.de
http://wais.soton.ac.uk
6. Steffen Staab The Web We Want @ WebSci ‘17 6
Web Science
Web Science is the study of the World Wide Web and its impact on both society and technology, positioning the Web as an object of scientific study unto itself.
https://tw.rpi.edu/web/WhatIsWebScience
7. Steffen Staab The Web We Want @ WebSci ‘17 7
Running in cycles
James Hendler, Nigel Shadbolt, Wendy Hall, Tim Berners-Lee, and Daniel Weitzner.
Communications of the ACM 2008
8. Steffen Staab The Web We Want @ WebSci ‘17 8
Where do the issues come from?
9. Steffen Staab The Web We Want @ WebSci ‘17 9
The Web We Want: Some Values
• “Including all people” to use the Web
• “Healthy” Web
• “Useful” Semantic Web
Engineer these values!
Measure these values!
10. Steffen Staab The Web We Want @ WebSci ‘17 10
Social inclusion:
Include all people to use the Web
With
Raphael Menges, Daniel Müller, Chandan Kumar, Korok Sengupta
& all the MAMEM team
http://www.mamem.eu
11. Steffen Staab The Web We Want @ WebSci ‘17 11
MAMEM measurement approach concerning social inclusion
[Figure: digital indicators with social inclusion areas. Figure by Agnes Mariakaki & Sissy Chlomisiou, MDA Hellas]
13. Steffen Staab The Web We Want @ WebSci ‘17 13
MAMEM https://youtu.be/42yGmr3NE0k
14. Steffen Staab The Web We Want @ WebSci ‘17 14
Gaze The Web https://youtu.be/x1ESgaoQR9Y
15. Steffen Staab The Web We Want @ WebSci ‘17 15
GazeTheWeb
What it is
• Flexible browser framework (based on Chromium)
• Open source: https://github.com/MAMEM/GazeTheWeb
How it works
• Observe DOM tree nodes of interest
• Overlays with gaze-controlled interfaces
• Combination of emulation and direct DOM node interaction over JavaScript
• Optional, extensible multi-modal input
  – Physical buttons
  – Touch input
  – Brain-Computer interfaces
  – Voice input
16. Steffen Staab The Web We Want @ WebSci ‘17 16
DOM Node Driven Interface
• Text input fields
  – A gaze-controlled overlay is displayed over the fields
• Link elements
  – Highlighted during click emulation and used to compensate for minor coordinate errors
• Fixed elements
  – Detected in order to prohibit automatic scrolling while the gaze is upon those elements
• Overflowing elements
  – An automatic scrolling approach reveals hidden content
17. Steffen Staab The Web We Want @ WebSci ‘17 17
Challenges for GazeTheWeb
• JavaScript Mutation Observer does not observe single CSS values
• High density of text input fields
• Semantic information of DOM nodes not provided
• Extensive use of JavaScript and CSS
It’s all about semantics of UI elements.
18. Steffen Staab The Web We Want @ WebSci ‘17 18
Challenges for GazeTheWeb
• JavaScript Mutation Observer does not observe single CSS values
• High density of text input fields
• Semantic information of DOM nodes not provided
• Extensive use of JavaScript and CSS, especially in modern Web applications [7]
[7] Bad practice: z-stacking of text inputs on Google search field
The most-used search engine is not accessible because of programming tricks.
19. Steffen Staab The Web We Want @ WebSci ‘17 19
Healthy Web:
Transferring Experiences
With Jun Sun & Jerome Kunegis
and the previous teams of ROBUST and REVEAL
20. Steffen Staab The Web We Want @ WebSci ‘17 20
What do we want: Healthy social networks
• Benefit from Experience with Social Networks
• Early response to
  – trolls
  – attacks
  – spam
• Social networks are easy to ruin!
21. Steffen Staab The Web We Want @ WebSci ‘17 21
https://www.youtube.com/watch?v=xxmS77q4XiM
22. Steffen Staab The Web We Want @ WebSci ‘17 22
Roles in Social Networks
• Role: two nodes belong to the same role if they have similar structural behavior
• Using structural features of nodes for classification
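To make "structural features of nodes" concrete, here is a minimal sketch that derives three simple features from a directed interaction list. The slides do not say which features were used; in-degree, out-degree and reciprocity are assumptions chosen for illustration, and the user names are hypothetical.

```python
from collections import defaultdict

def structural_features(edges):
    """Return {node: (in_degree, out_degree, reciprocity)} for a directed edge list.

    Degrees count distinct neighbours. Reciprocity is the share of a node's
    out-neighbours that also link back to it (0.0 if it has no out-neighbours).
    """
    out_nbrs = defaultdict(set)
    in_nbrs = defaultdict(set)
    for src, dst in edges:
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)

    features = {}
    for node in set(out_nbrs) | set(in_nbrs):
        outs, ins = out_nbrs[node], in_nbrs[node]
        reciprocity = len(outs & ins) / len(outs) if outs else 0.0
        features[node] = (len(ins), len(outs), reciprocity)
    return features

# Hypothetical "who replied to whom" interactions in a forum.
replies = [("ann", "bob"), ("bob", "ann"), ("cat", "ann"),
           ("dan", "ann"), ("ann", "cat")]
features = structural_features(replies)
```

Nodes with similar feature tuples would then be candidates for the same role, whatever their position in the network.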
23. Steffen Staab The Web We Want @ WebSci ‘17 23
Transfer Learning
• Idea: to learn knowledge from a domain (source domain) and apply it to another domain (target domain)
• Challenge: feature distributions differ between the source and target domains
24. Steffen Staab The Web We Want @ WebSci ‘17 24
Degree Distribution vs.
Transformed Degree Distribution
25. Steffen Staab The Web We Want @ WebSci ‘17 25
Transfer Learning Procedure
• Step 1 - Feature Extraction
• Step 2 - Feature Transformation
• Step 3 - Feature Aggregation
• Step 4 - Classification
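The four steps can be sketched end to end on toy data. This is only an illustration under stated assumptions, not the method from the talk: the single feature (degree), the quantile transformation, the neighbour-mean aggregation and the nearest-centroid classifier are all stand-ins, and the node and role names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def degrees(edges):
    # Step 1, feature extraction: here only the undirected degree.
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def to_quantiles(values):
    # Step 2, feature transformation: replace each raw value by its empirical
    # CDF value within its own network, so both networks share the range (0, 1].
    ordered = sorted(values.values())
    return {k: bisect_right(ordered, v) / len(ordered) for k, v in values.items()}

def aggregate(edges, feat):
    # Step 3, feature aggregation: describe a node by its own value and the
    # mean value of its neighbours.
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return {n: (feat[n], sum(feat[m] for m in nbrs[n]) / len(nbrs[n])) for n in feat}

def node_vectors(edges):
    return aggregate(edges, to_quantiles(degrees(edges)))

def fit_centroids(vectors, labels):
    # Step 4a, classification: one centroid per role, learned on the source network.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for node, role in labels.items():
        x, y = vectors[node]
        acc = sums[role]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {role: (sx / n, sy / n) for role, (sx, sy, n) in sums.items()}

def predict(centroids, vector):
    # Step 4b: assign the role whose centroid is closest (squared Euclidean distance).
    def dist(role):
        cx, cy = centroids[role]
        return (cx - vector[0]) ** 2 + (cy - vector[1]) ** 2
    return min(centroids, key=dist)

# Source network with known roles: a small star around "h".
source_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d")]
source_roles = {"h": "moderator", "a": "user", "b": "user", "c": "user", "d": "user"}
centroids = fit_centroids(node_vectors(source_edges), source_roles)

# Target network without labels: a larger star, so raw degrees differ.
target_edges = [("x", f"l{i}") for i in range(9)]
predictions = {n: predict(centroids, v) for n, v in node_vectors(target_edges).items()}
```

The hub of the target network has raw degree 9 while the source hub has degree 4, yet after the transformation both sit at the top of their own distribution, which is why the classifier trained on the source still separates hub from leaves.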
26. Steffen Staab The Web We Want @ WebSci ‘17 26
Application
Source dataset:
• Boards.ie (Irish forum) user interaction
• n=138134, m=6877447
• Roles found:

Role Name         Number of Users   Percentage   Intra-network ROC-AUC
Administrator                   6       0.004%   0.664
Banned                       9420       6.819%   0.651
Moderator                     367       0.266%   0.958
Registered User            128217      92.821%   N/A
Subscriber                    124       0.090%   0.911
Total                      138134         100%
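For readers unfamiliar with the ROC-AUC column, the measure can be computed as the probability that a randomly chosen member of a role receives a higher classifier score than a randomly chosen non-member. The sketch below assumes a one-vs-rest setup per role; the scores and labels are made up and are not taken from the Boards.ie experiment.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

# Hypothetical classifier scores for the role "Moderator" (True = is a moderator).
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
is_moderator = [True, False, True, False, False]
auc = roc_auc(scores, is_moderator)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which helps when reading the table: roles with values near 0.95 are well separated, those near 0.65 much less so.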
27. Steffen Staab The Web We Want @ WebSci ‘17 27
Application
Target dataset:
• Software AG ARIS Community user interaction
• 9566 threads and 20538 comments by 4216 people
28. Steffen Staab The Web We Want @ WebSci ‘17 28
Result
• (Indirectly) evaluate predicted user roles with user trustworthiness
29. Steffen Staab The Web We Want @ WebSci ‘17 29
Useful Semantic Web
With Cristina Sarasua, Matthias Thimm
30. Steffen Staab The Web We Want @ WebSci ‘17 30
Values for a Useful Semantic Web
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
4. Include links to other URIs, so that they can discover more things.
[Berners-Lee, 2006]
“The Semantic Web isn't just about putting data on the Web. It is about making links, so that a person or machine can explore the Web of Data.”
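Principles 2 and 3 amount to an ordinary HTTP lookup in which the client asks for an RDF representation. A minimal sketch using only the Python standard library follows; the URI is a hypothetical placeholder, and the request is built but deliberately not sent.

```python
from urllib.request import Request

def rdf_lookup_request(uri, media_type="text/turtle"):
    """Build (but do not send) an HTTP GET request asking for an RDF serialisation."""
    return Request(uri, headers={"Accept": media_type}, method="GET")

# Hypothetical entity URI; a real data set would publish its own.
request = rdf_lookup_request("http://example.org/resource/nn")
```

Passing the object to `urllib.request.urlopen` would perform the lookup. Whether the server answers with Turtle depends on its content negotiation, which this sketch does not check.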
31. Steffen Staab The Web We Want @ WebSci ‘17 31
Current Link Analyses
Link Properties
• Symmetry: “the symmetry of entity links varies between different pairs of datasets.” [Hu et al., 2015 on Life sciences data sets]
• Transitivity: “the transitivity of entity links is often topic-dependent.” [Hu et al., 2015 on Life sciences data sets]
Descriptive Statistics
• In- and Out-degree: “Bibsonomy has links to 91 external data sets.”
• Overall link predicate usage: “owl:sameAs as the most widely used linking predicate.” [Schmachtenberg et al., 2014]
• Link count: “The Eurostat data set has 149 owl:sameAs links to DBpedia.” [CKAN][Ermilov et al., 2013][Hogan et al., 2012]
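The statistics named on this slide are simple to compute once links are available as triples. The sketch below is illustrative only: it assumes entity names of the form `dataset:local`, and the links themselves are invented rather than drawn from the cited studies.

```python
from collections import Counter

def dataset_of(entity):
    # Assumption: the part before the first colon identifies the data set.
    return entity.split(":", 1)[0]

def link_statistics(links):
    """Return (predicate usage, out-degree per data set, symmetry ratio).

    Out-degree counts the distinct external data sets a data set links to.
    The symmetry ratio is the share of links whose reverse link also exists.
    """
    links = set(links)
    predicate_usage = Counter(p for _, p, _ in links)
    targets = {}
    for s, _, o in links:
        targets.setdefault(dataset_of(s), set()).add(dataset_of(o))
    out_degree = {d: len(t) for d, t in targets.items()}
    symmetric = sum((o, p, s) in links for s, p, o in links)
    return predicate_usage, out_degree, symmetric / len(links)

# Hypothetical links between three data sets.
links = [
    ("bib:p1", "owl:sameAs", "acm:p1"),
    ("acm:p1", "owl:sameAs", "bib:p1"),
    ("bib:p2", "owl:sameAs", "acm:p2"),
    ("bib:a1", "rdfs:seeAlso", "dbp:a1"),
]
usage, out_degree, symmetry = link_statistics(links)
```

Such counts describe how much linking there is, but not what the links add to the linked entities.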
33. Steffen Staab The Web We Want @ WebSci ‘17 33
Current Link Quality Assessment Methods
Semantic Accuracy
• Network measures and link properties count as proxy: “centrality, clustering coefficient, sameAs chains are shown to be partially effective at detecting semantically correct / incorrect links.” [Guéret et al., 2012]
• Crowdsourcing: “hybrid methods can improve semantic accuracy.” [Demartini et al., 2012] [Sarasua et al., 2012] [Acosta et al., 2013]
Completeness
• Entity Connectivity: “50 % of the entities in the source data set contain links to external entities.” [Albertoni et al., 2013]
Availability
• Deadlinks: “we found 302,855,189 unverified links, and 12,430,800 dead links.” [Neto et al., 2016]
To what extent do existing links add “more things” to the source entities and make the Semantic Web useful?
34. Steffen Staab The Web We Want @ WebSci ‘17 34
Why?
Data Publisher (predisposed to improve her links)
Help
• to understand the impact of existing links
• to spot weak points
• with guidance for improving existing links
Encourage iterative maintenance
35. Steffen Staab The Web We Want @ WebSci ‘17 35
Our Task
Given a data set D containing the interlinking I
• compare D and DI and
• analyse the value that I gives to the source data
In terms of the principles for data interlinking in the Web of Data.
[Figure: the Bibsonomy and ACM data sets connected by three owl:sameAs links]
39. Steffen Staab The Web We Want @ WebSci ‘17 39
Principles
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
40. Steffen Staab The Web We Want @ WebSci ‘17 40
Principles
Extend entity description (P1)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natasha Noy”
owl:sameAs link between d1:nn and d2:nn
41. Steffen Staab The Web We Want @ WebSci ‘17 41
Principles
Extend entity description (P1)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natalya Noy”
  dbo:affiliation d2:goog
  dbo:swrc:publication d2:p2015
owl:sameAs link between d1:nn and d2:nn
Better!
42. Steffen Staab The Web We Want @ WebSci ‘17 42
Principles
Extend entity description (P1)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natalya Noy”
  dbo:affiliation d2:goog
  dbo:swrc:publication d2:p2015
  rdf:type vivo:Female
owl:sameAs link between d1:nn and d2:nn
Even Better!
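One way to put a number on P1 is to count the property values an entity gains by following its owl:sameAs links. The sketch below is an assumption about how such a measure could look, not the measure defined by the authors. The triples are transcribed from the slides above, with the link direction (from data set 1 to data set 2) assumed.

```python
def description(triples, entity):
    """All (predicate, object) pairs stated about an entity."""
    return {(p, o) for s, p, o in triples if s == entity}

def description_gain(source, target, links, entity):
    """(predicate, object) pairs the entity gains through its owl:sameAs links."""
    own = description(source, entity)
    gained = set()
    for s, p, o in links:
        if s == entity and p == "owl:sameAs":
            gained |= description(target, o)
    return gained - own

data_set_1 = [
    ("d1:nn", "foaf:name", "Natasha Noy"),
    ("d1:nn", "dbo:affiliation", "d1:stan"),
    ("d1:nn", "dbo:swrc:publication", "d1:p2012"),
]
links = [("d1:nn", "owl:sameAs", "d2:nn")]

# First variant: data set 2 only repeats the name.
redundant_target = [("d2:nn", "foaf:name", "Natasha Noy")]
# "Even better" variant: new name, affiliation, publication and type.
richer_target = [
    ("d2:nn", "foaf:name", "Natalya Noy"),
    ("d2:nn", "dbo:affiliation", "d2:goog"),
    ("d2:nn", "dbo:swrc:publication", "d2:p2015"),
    ("d2:nn", "rdf:type", "vivo:Female"),
]

gain_redundant = description_gain(data_set_1, redundant_target, links, "d1:nn")
gain_richer = description_gain(data_set_1, richer_target, links, "d1:nn")
```

The redundant link contributes nothing new, while the richer target adds four statements, which matches the ordering suggested by the "Better!" labels.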
43. Steffen Staab The Web We Want @ WebSci ‘17 43
Principles
Extend entity connectivity (P2)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d2:stan
owl:sameAs link between d1:nn and d2:nn
44. Steffen Staab The Web We Want @ WebSci ‘17 44
Principles
Extend entity connectivity (P2)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:goog
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natalya Noy”
  dbo:affiliation d2:goog
Data set 3
d3:nn
  foaf:name “Natalya Noy”
  dbo:affiliation d3:goog
Two owl:sameAs links connect the entities across the three data sets
Better!
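For P2, a simple proxy is the set of entities reachable from a source entity over owl:sameAs links. The sketch below treats owl:sameAs as symmetric and transitive, as its OWL semantics allow. The endpoints of the two links are an assumption, since the flattened diagram does not show which entities they connect.

```python
from collections import defaultdict, deque

def same_as_closure(links, entity):
    """Entities reachable from `entity` by following owl:sameAs in either direction."""
    nbrs = defaultdict(set)
    for a, b in links:
        nbrs[a].add(b)
        nbrs[b].add(a)
    seen, queue = {entity}, deque([entity])
    while queue:
        current = queue.popleft()
        for nxt in nbrs[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen - {entity}

one_link = [("d1:nn", "d2:nn")]
two_links = [("d1:nn", "d2:nn"), ("d2:nn", "d3:nn")]  # assumed endpoints

reach_one = same_as_closure(one_link, "d1:nn")
reach_two = same_as_closure(two_links, "d1:nn")
```

With the second link the source entity reaches two external entities instead of one, which is one reading of why the three-data-set variant is labelled "Better!".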
45. Steffen Staab The Web We Want @ WebSci ‘17 45
Principles
Increase number of vocabularies used (P3)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natasha Noy”
  foaf:pastProject d2:protege
  foaf:project d2:bioportal
  rdf:type foaf:Person
owl:sameAs link between d1:nn and d2:nn
46. Steffen Staab The Web We Want @ WebSci ‘17 46
Principles
Increase number of vocabularies used (P3)
Data set 1
d1:nn
  foaf:name “Natasha Noy”
  dbo:affiliation d1:stan
  dbo:swrc:publication d1:p2012
Data set 2
d2:nn
  foaf:name “Natasha Noy”
  sioc:creator d2:post2
  rdf:type foaf:Person
  rdf:type proton:Human
owl:sameAs link between d1:nn and d2:nn
Better!
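P3 can be approximated by collecting the namespace prefixes of the predicates and classes used on each side of a link and checking which ones are new. This is a rough sketch under stated assumptions: the prefix before the first colon stands in for the vocabulary, and the subject of every data set 2 statement is assumed to be d2:nn.

```python
def vocabularies(triples):
    """Prefixes of all predicates, plus prefixes of classes used with rdf:type."""
    used = set()
    for _, predicate, obj in triples:
        used.add(predicate.split(":", 1)[0])
        if predicate == "rdf:type":
            used.add(obj.split(":", 1)[0])
    return used

data_set_1 = [
    ("d1:nn", "foaf:name", "Natasha Noy"),
    ("d1:nn", "dbo:affiliation", "d1:stan"),
    ("d1:nn", "dbo:swrc:publication", "d1:p2012"),
]
# First variant of data set 2: mostly FOAF again.
foaf_target = [
    ("d2:nn", "foaf:name", "Natasha Noy"),
    ("d2:nn", "foaf:pastProject", "d2:protege"),
    ("d2:nn", "foaf:project", "d2:bioportal"),
    ("d2:nn", "rdf:type", "foaf:Person"),
]
# "Better" variant: SIOC and PROTON appear as well.
mixed_target = [
    ("d2:nn", "foaf:name", "Natasha Noy"),
    ("d2:nn", "sioc:creator", "d2:post2"),
    ("d2:nn", "rdf:type", "foaf:Person"),
    ("d2:nn", "rdf:type", "proton:Human"),
]

new_in_foaf_target = vocabularies(foaf_target) - vocabularies(data_set_1)
new_in_mixed_target = vocabularies(mixed_target) - vocabularies(data_set_1)
```

Under this counting the first target adds one new prefix and the second adds three, so the second link exposes the source entity to more vocabularies.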
48. Steffen Staab The Web We Want @ WebSci ‘17 48
The Web We Want: Guided by Values
49. Steffen Staab The Web We Want @ WebSci ‘17 49
The Web We Want: Measured/Engineering
Two sides of one coin:
• Measuring achievement of values inspires engineering
• Engineering requires measurement of extent of achievement of values
[Image: https://commons.wikimedia.org/w/index.php?curid=31105116]
50. Steffen Staab The Web We Want @ WebSci ‘17 50
The Web We Want: Examples
• Include people
  – Key: semantic UI elements!
  – Not enough HTML5 high-level description
  – Beyond HTML5
• Healthy Web
  – Transferring experiences
• Useful Semantic Web
  – Not all links are equal
  – Some links are more powerful than others
51. Steffen Staab The Web We Want @ WebSci ‘17 51
The Web We Want: The Values that Guide Us?
• Include people
  – Key: semantic UI elements!
  – Not enough HTML5 high-level description
  – Beyond HTML5
• Healthy Web
  – Transferring experiences
• Useful Semantic Web
  – Not all links are equal
  – Some links are more powerful than others
What are values that you want to pursue/see pursued?
52. Steffen Staab The Web We Want @ WebSci ‘17 52
Institute for Web Science and Technologies · University of Koblenz-Landau, Germany &
Web and Internet Science Group · ECS · University of Southampton, UK
Thanks to my team members and all the other collaborators:
Raphael Menges, Daniel Müller, Chandan Kumar, Korok Sengupta, Jun Sun, Jerome Kunegis, Cristina Sarasua, Matthias Thimm
Project teams: MAMEM, ROBUST, REVEAL