SSSW 2016 Cognition Tutorial
1. Cognition for the Semantic Web
Involving Humans in
Semantic Data Management
Irene Celino (irene.celino@cefriel.com)
CEFRIEL – Milano, Italy
2. Agenda
Methods to involve people (a.k.a. crowdsourcing and its brothers)
Motivation and incentives (a.k.a. let's have fun with games)
Crowdsourcing and the Semantic Web (a.k.a. this is SSSW after all…)
3. Methods to
involve people
What goals can humans help
machines to achieve? Which
scientific communities "exploit" people? How can we involve a crowd of people?
- Citizen Science
- Crowdsourcing
- Human Computation
4. Wisdom of crowds [1]
"Why the Many Are Smarter Than the Few and
How Collective Wisdom Shapes Business, Economies, Societies and Nations"
Criteria for a wise crowd
Diversity of opinion (importance of interpretation)
Independence (not a "single mind")
Decentralization (importance of local knowledge)
Aggregation (aim to get a collective decision)
There are also failures/risks in crowd decisions:
Homogeneity, centralization, division, imitation, emotionality
5. Citizen Science [2]
Problem: a scientific experiment requires the execution of a lot of simple tasks, but researchers are busy
Solution: engage the general audience in solving those tasks, explaining that they are contributing to science, research and the public good
Example: https://www.zooniverse.org/
6. Crowdsourcing [3]
Problem: a company needs to execute
a lot of simple tasks, but cannot afford
to hire a person to do that job
Solution: pack tasks in bunches
(human intelligence tasks or HITs)
and outsource them to a very cheap
workforce through an online platform
Example: https://www.mturk.com/
7. Human Computation [4]
Problem: an Artificial Intelligence
algorithm is unable to achieve an
adequate result with a satisfactory
level of confidence
Solution: ask people to intervene when
the AI system fails, "masking" the task
within another human process
Example: https://www.google.com/recaptcha/
8. Spot the difference…
Similarities:
- Involvement of people
- No automatic replacement (the task cannot be fully automated)
Variations:
- Motivation
- Reward (glory, money, passion)
Hybrids or parallel combinations of Citizen Science, Crowdsourcing and Human Computation are possible!
9. Motivation and
incentives
Apart from extrinsic rewards
(money, prizes, etc.) what are the
intrinsic incentives we can adopt to
motivate people? How can we
leverage "fun" through games and
game-like applications?
- Gamification
- Games with a Purpose (GWAP)
10. Gamification [5,6]
Problem: motivate people to execute
boring or mandatory tasks (especially
in cooperative environments) that they
are usually not very happy to do
Solution: introduce typical game
elements (e.g. points, badges,
leaderboards) within the more
traditional processes and systems
Example: https://badgeville.com/
11. Games with a Purpose (GWAP) [7,8]
Problem: same as Human Computation (ask humans when AI fails)
Solution: hide the task within a game, so
that users are motivated by game
challenges, often remaining unaware of the hidden purpose; the task solution comes from agreement between players
Example: http://www.gwap.com/
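To make the "agreement between players" mechanism concrete, here is a minimal Python sketch of an ESP-game-style output-agreement round (the function and the example labels are hypothetical, not taken from gwap.com): a label is accepted for an item only when two independently playing users propose it.

```python
# Minimal sketch of an output-agreement GWAP round (hypothetical data):
# a label is accepted only when two independent players propose it.

def play_round(labels_player_a, labels_player_b, taboo_words=()):
    """Return the first label proposed by both players that is not taboo."""
    proposed_b = {label.lower() for label in labels_player_b}
    for label in labels_player_a:
        label = label.lower()
        if label in proposed_b and label not in taboo_words:
            return label  # agreement reached: accept this label for the item
    return None  # no agreement in this round

# Example: two players independently tagging the same (hidden-purpose) image.
print(play_round(["dog", "grass", "ball"], ["ball", "park"]))  # -> "ball"
```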
12. Crowdsourcing
and the
Semantic Web
Can we involve people in Semantic
Web systems? What semantic data
management tasks can we
effectively "outsource" to humans?
13. Why Crowdsourcing in the Semantic Web?
Knowledge-intensive and/or context-specific character of Semantic Web tasks:
e.g., conceptual modelling, multi-language resource labelling, content annotation with ontologies, concept/entity similarity recognition, …
Crowdsourcing can help to engage users and involve them in executing tasks:
e.g., wikis for semantic content authoring, folksonomies to bootstrap formal ontologies, human computation approaches, …
14. Semantic Crowdsourcing [9]
Many tasks in Semantic Web data management/curation can exploit Crowdsourcing
The task space spans two levels (fact level and schema level) and several types of operations: collection, creation, correction, validation, filtering, ranking, linking
Example tasks: conceptual modelling, ontology population, quality assessment, ontology re-engineering, ontology pruning, ontology elicitation, knowledge acquisition, ontology repair, knowledge base update, data search/selection, link generation, ontology alignment, ontology matching
15. Focus on Data Linking
Creation of links in the form of RDF triples (subject, predicate, object)
Within the same dataset (i.e. generating new connections between resources of
the same dataset)
Across different datasets (i.e. creating RDF links, as named in the Linked Data world)
Notes:
Generated links can have an associated score σ ∈ [0,1] expressing a sort of "confidence" in the truth value or in the "relevance" of the triple
In the literature, data linking often means finding equivalent resources (similarly to record linkage in database research), i.e. triples with a correspondence/match predicate (e.g. owl:sameAs) → in the following, data linking is intended in its broader meaning (i.e. links with any predicate)
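As an illustration of such a scored link, here is a minimal sketch with rdflib that attaches the score σ to the reified statement; the ex:confidence property, the example URIs and the 0.83 value are hypothetical, not taken from the slides.

```python
# Sketch: one generated link plus its confidence score sigma, attached to the
# reified statement. ex:confidence and the 0.83 value are hypothetical.
from rdflib import Graph, Literal, Namespace, BNode
from rdflib.namespace import RDF, FOAF, XSD

EX = Namespace("http://example.org/")
g = Graph()

asset, photo = EX["asset/42"], EX["photo/abc"]
g.add((asset, FOAF.depiction, photo))  # the link itself: <asset> foaf:depiction <photo>

stmt = BNode()  # reified statement carrying the score
g.add((stmt, RDF.type, RDF.Statement))
g.add((stmt, RDF.subject, asset))
g.add((stmt, RDF.predicate, FOAF.depiction))
g.add((stmt, RDF.object, photo))
g.add((stmt, EX.confidence, Literal(0.83, datatype=XSD.decimal)))  # sigma in [0,1]

print(g.serialize(format="turtle"))
```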
17. Link ranking [10]
http://bit.ly/indomilando
Set of all links: <asset> foaf:depiction <photo>
Goal: assign a score σ to rank links on their "recognisability"/"representativeness"
Pure GWAP with hidden purpose
Points, badges, leaderboard as intrinsic reward
The score σ is a function of k/n, where k is the no. of successes (= recognitions) and n the no. of trials of the Bernoulli process (guess or not guess) realized by the game
Link ranking is a result of the "agreement" between players
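The slides only say that σ is a function of k/n; as one possible reading, here is a minimal Python sketch that ranks links by a smoothed success rate. The Laplace smoothing and the example counts are assumptions, not the game's published formula.

```python
# Hypothetical sketch: rank links by a score sigma derived from k/n, where
# k = successful recognitions and n = trials of the Bernoulli process.
# The Laplace smoothing (alpha) is an assumption, not the game's actual formula.

def ranking_score(k: int, n: int, alpha: float = 1.0) -> float:
    """Smoothed success rate in [0, 1]."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return (k + alpha) / (n + 2 * alpha)

# Example: (successes, trials) observed for two candidate foaf:depiction links.
links = {"asset42 -> photoA": (18, 20), "asset42 -> photoB": (3, 15)}
ranked = sorted(links, key=lambda link: ranking_score(*links[link]), reverse=True)
print(ranked)  # links recognised more often per trial rank higher
```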
18. Link validation [11]
http://bit.ly/foss4game
Set of links: <land-area> clc:hasLandCover <land-cover>
Automatic classifications: <land-cover-assigned-by-DUSAF> ≠ <land-cover-assigned-by-GL30>
Goal: assign a score σ to each link to discover the "right" land cover class
Pure GWAP with
not-so-hidden purpose (played by "experts")
Points, badges,
leaderboard as
intrinsic reward
https://youtu.be/Q0ru1hhDM9Q
A player scores if he/she guesses one of the two disagreeing classifications
The score σ of each link is updated on the basis of players' choices (incremented if the link is selected, decremented if it is not selected)
When the score of a link exceeds a threshold (σ ≥ t), the link is considered "true"
Link validation is a result of the "agreement" between players
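A minimal sketch of the increment/decrement update and the acceptance threshold described above; the step size, the threshold value and the class labels are placeholders, not values from the game.

```python
# Sketch of the link-validation update: a player's choice raises the score of
# the selected link and lowers the competing one; a link is accepted as "true"
# once sigma >= t. Step size, threshold and class labels are placeholders.

STEP, THRESHOLD = 0.1, 0.7

def update(scores, selected, not_selected):
    scores[selected] = min(1.0, scores[selected] + STEP)
    scores[not_selected] = max(0.0, scores[not_selected] - STEP)

# Two disagreeing automatic classifications for the same land area.
scores = {"DUSAF: forest": 0.5, "GL30: grassland": 0.5}
for choice in ["DUSAF: forest", "DUSAF: forest", "GL30: grassland", "DUSAF: forest"]:
    other = next(link for link in scores if link != choice)
    update(scores, choice, other)

validated = [link for link, sigma in scores.items() if sigma >= THRESHOLD]
print(scores, validated)  # -> the forest link reaches the threshold and is considered "true"
```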
19. Caveat 1: Mice and Men (or: keep it simple)
Crowdsourcing workers behave like mice [12]
Mice prefer to use their motor skills (biologically cheap, e.g. pressing a lever to get
food) rather than their cognitive skills (biologically expensive, e.g. going through a
labyrinth to get food)
Workers prefer/are better at simple tasks (e.g. those that can be solved at first
sight) and discard/are worse at more complex tasks (e.g. those that require logics)
Crowdsourcing tasks should be carefully designed
Tasks as simple as possible for the workers to solve
Complex tasks together with other incentives (e.g. variety/novelty)
20. Suggestion 1: Divide et impera (or: Find-Fix-Verify)
Find-Fix-Verify crowd programming pattern [13]
A long and "expensive" task…
Summarize a text to shorten its total length
…is decomposed into more atomic tasks…
1. find sentences that need to be shortened
2. fix a sentence by shortening it
3. verify which summarized sentence maintains original meaning
…and the complex task is turned into a workflow of simple tasks, and each step is outsourced to a crowd (see the sketch below)
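A schematic sketch of the Find-Fix-Verify pattern as a workflow of crowd microtasks; ask_crowd() is a hypothetical stand-in for posting HITs to a platform and collecting workers' answers (it returns canned answers here so the sketch runs end to end).

```python
# Schematic Find-Fix-Verify workflow (pattern from [13]).
from collections import Counter

def ask_crowd(prompt, options=None, workers=3):
    """Hypothetical stand-in for posting a microtask (HIT) and collecting
    `workers` independent answers; returns canned answers for illustration."""
    if options is None:                      # open-ended task (the FIX step)
        return [f"[shortened] {prompt}"] * workers
    return [options[0]] * workers            # closed task: everyone picks option 0

def find_fix_verify(sentences):
    result = []
    for sentence in sentences:
        # FIND: do independent workers think this sentence needs shortening?
        votes = ask_crowd(f"Can this sentence be shortened? {sentence}", ["yes", "no"])
        if Counter(votes).most_common(1)[0][0] != "yes":
            result.append(sentence)
            continue
        # FIX: several workers independently propose a shorter version.
        candidates = ask_crowd(sentence)
        # VERIFY: other workers vote for the rewrite that keeps the original meaning.
        votes = ask_crowd("Which version keeps the original meaning?", candidates)
        result.append(Counter(votes).most_common(1)[0][0])
    return result

print(find_fix_verify(["Soylent is a word processor with a crowd inside."]))
```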
21. Caveat 2: aggregation and disagreement
Are all contributors "created equal"?
Contributions/results on the same task are usually aggregated across different workers ("wisdom of crowds", "collective intelligence")
The update formula for the σ score should weight contributions differently, by including some evaluation of contributors' reliability (e.g. against a gold standard)
Is there always a "right answer"? Or is there a "crowd truth"? [14]
Not always true/false, because of human subjectivity, ambiguity and uncertainty
Disagreement across contributors is not necessarily bad, but a sign of different opinions, interpretations, contexts, perspectives, …
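A minimal sketch of reliability-weighted aggregation: each contributor's weight is their accuracy on gold-standard questions, and keeping the whole vote distribution (rather than only the winner) preserves the disagreement signal discussed in [14]. The weighting scheme and the data are illustrative assumptions.

```python
# Sketch of reliability-weighted aggregation; the weighting scheme (accuracy on
# gold-standard questions) and the data are illustrative assumptions.
from collections import defaultdict

def reliability(worker_answers, gold):
    """Fraction of gold questions this worker answered correctly."""
    asked = [q for q in worker_answers if q in gold]
    if not asked:
        return 0.5  # neutral weight for workers never seen on gold questions
    return sum(worker_answers[q] == gold[q] for q in asked) / len(asked)

def aggregate(task, answers_by_worker, gold):
    votes = defaultdict(float)
    for answers in answers_by_worker.values():
        if task in answers:
            votes[answers[task]] += reliability(answers, gold)
    winner = max(votes, key=votes.get)
    return winner, dict(votes)  # the distribution exposes residual disagreement

gold = {"g1": "cat", "g2": "dog"}
answers = {
    "w1": {"g1": "cat", "g2": "dog", "t7": "horse"},  # fully reliable on gold
    "w2": {"g1": "cat", "g2": "cat", "t7": "zebra"},  # half right on gold
    "w3": {"g1": "dog", "g2": "cat", "t7": "zebra"},  # unreliable on gold
}
print(aggregate("t7", answers, gold))  # -> ('horse', {'horse': 1.0, 'zebra': 0.5})
```

Note how the reliable worker's single answer outweighs the unweighted majority, while the returned distribution still shows that the crowd disagreed.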
22. Suggestion 2: compare and contrast
From "wisdom of crowds" to "wisdom of the crowdsourcing methods": different
approaches to solve the same problem could be put in parallel to compare results
Which is the best crowdsourcing approach
for a specific use case?
Which is the most suitable crowd?
Is crowdsourcing better/faster/cheaper
than automatic means (e.g. AI)?
Examples:
Detecting quality issues in DBpedia [15]: find-verify strategy
with different crowds (experts and workers) behind the find step
Employing people for ontology alignment [16]: outcomes comparable to results of OAEI systems
[Diagram: the same input task can be dispatched in parallel to Citizen Science, Human Computation, Crowdsourcing or automatic/machine computation, and the output solutions can be compared]
23. Bibliography (1/2)
[1] James Surowiecki. The wisdom of crowds, Anchor, 2005.
[2] Alan Irwin. Citizen science: A study of people, expertise and sustainable development. Psychology
Press, 1995.
[3] Jeff Howe. Crowdsourcing: How the power of the crowd is driving the future of business. Random
House, 2008.
[4] Edith Law and Luis von Ahn. Human computation. Synthesis Lectures on Artificial Intelligence and
Machine Learning, 5(3):1–121, 2011.
[5] Jane McGonigal. Reality is broken: Why games make us better and how they can change the world.
Penguin, 2011.
[6] Kevin Werbach and Dan Hunter. For The Win: How Game Thinking Can Revolutionize Your Business.
Wharton Digital Press, 2012.
[7] Luis Von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006.
[8] Luis Von Ahn and Laura Dabbish. Designing games with a purpose. Communications of the ACM,
51(8):58–67, 2008.
24. Bibliography (2/2)
[9] Cristina Sarasua, Elena Simperl, Natasha Noy, Abraham Bernstein, Jan Marco Leimeister.
Crowdsourcing and the Semantic Web: A Research Manifesto, Human Computation 2 (1), 3-17, 2015.
[10] Irene Celino, Andrea Fiano, and Riccardo Fino. Analysis of a Cultural Heritage Game with a Purpose
with an Educational Incentive, ICWE 2016 Proceedings, pp. 422-430, 2016.
[11] Maria Antonia Brovelli, Irene Celino, Monia Molinari, Vijay Charan Venkatachalam. A crowdsourcing-
based game for land cover validation, ESA Living Planet Symposium 2016 Proceedings, 2016.
[12] Panos Ipeirotis. On Mice and Men: The Role of Biology in Crowdsourcing, Keynote talk at Collective
Intelligence, 2012.
[13] M. Bernstein, G. Little, R. Miller, B. Hartmann, M. Ackerman, D. Karger, D. Crowell, K. Panovich.
Soylent: A Word Processor with a Crowd Inside, UIST Proceedings, 2010.
[14] Lora Aroyo, Chris Welty. Truth is a Lie: 7 Myths about Human Annotation, AI Magazine 2014.
[15] Maribel Acosta, Amrapali Zaveri, Elena Simperl, Dimitris Kontokostas, Fabian Flöck, Jens Lehmann.
Detecting Linked Data Quality Issues via Crowdsourcing: A DBpedia Study, Semantic Web Journal, 2016.
[16] Cristina Sarasua, Elena Simperl, Natasha Noy. Crowdmap: Crowdsourcing ontology alignment with
microtasks, ISWC 2012 Proceedings, pp. 525-541, 2012.