HUMAN COMPUTATION
Irene Celino – irene.celino@cefriel.com
Cefriel, Viale Sarca 226, 20126 Milano
Seminar @ Data Semantics course – April 11th, 2018
1. Introduction
2. Linked Data and Knowledge Graph Refinement
3. Human Computation and Games with a Purpose
4. Examples of GWAP for Data Linking
5. Truth Inference and Open Science
6. Guidelines
7. Indirect People Involvement
copyright © 2018 Cefriel – All rights reserved
from ideation to business value
1. INTRODUCTION
Is the Web a pure technological artefact?
What role can people play on the Web?
WEB AS A SOCIAL ARTEFACT
“The Web isn’t about what you can do with computers.
It’s people and, yes, they are connected by computers.
But computer science, as the study of
what happens in a computer, doesn’t tell
you about what happens on the Web”
– Sir Tim Berners-Lee
Open Source Software
“Given enough eyeballs, all bugs are
shallow.” Eric S. Raymond
(The Cathedral and the Bazaar)
OPEN EVERYTHING
Open Content
“It is easy when you skip the
intermediaries”
original motto of Creative Commons
(EN video) (IT video)
Open Data
“Raw. Data. Now.” Tim Berners-Lee
(The year open data went worldwide –
TED Talk)
COOPERATION ON THE WEB TO PRODUCE OPEN KNOWLEDGE
WISDOM OF CROWDS
• “Why the Many Are Smarter Than the Few and
How Collective Wisdom Shapes Business, Economies, Societies and Nations”
• Criteria for a wise crowd
• Diversity of opinion (importance of interpretation)
• Independence (not a “single mind”)
• Decentralization (importance of local knowledge)
• Aggregation (aim to get a collective decision)
• There are also failures/risks in crowd decisions:
• Homogeneity, centralization, division, imitation, emotionality
James Surowiecki
The wisdom of crowds
Anchor, 2005
2. LINKED DATA & KNOWLEDGE
GRAPH REFINEMENT
Do we need to involve people in Semantic Web systems?
What semantic data management tasks can we effectively “outsource” to humans?
HUMANS IN THE SEMANTIC WEB
• Knowledge-intensive and/or context-specific character of Semantic Web tasks:
• e.g., conceptual modelling, multi-language resource labelling, content annotation with
ontologies, concept/entity similarity recognition, …
• Need to engage users and involve them in executing tasks:
• e.g., wikis for semantic content authoring, folksonomies to bootstrap formal
ontologies, instance creation by data entry, …
SEMANTIC WEB TASKS (ALSO) FOR HUMANS
• Levels (rows): fact level, schema level
• Task types (columns): collection, creation, correction, validation, filtering, ranking, linking
• Example tasks placed in the grid: conceptual modelling, ontology population, quality
assessment, ontology re-engineering, ontology pruning, ontology elicitation, knowledge
acquisition, ontology repair, knowledge base update, data search/selection, link generation,
ontology alignment, ontology matching
AUTOMATIC METHODS IN THE SEMANTIC WEB?
• Knowledge Graph Refinement (and, in general, linked dataset refinement) is an
emerging hot topic that aims to (1) identify and correct errors and (2) add missing knowledge
• e.g., completing type assertions via classification, predicting relations from textual
sources, finding erroneous type assertions, identifying erroneous literal values
through anomaly/outlier detection, …
• Statistical and machine learning approaches require some partial gold standard,
i.e. a “ground truth” dataset to train automatic models
• Ground truth is usually put together manually by experts
• Sourcing gold standard from humans is expensive!
Heiko Paulheim. Knowledge graph refinement: A survey of
approaches and evaluation methods. Semantic Web Journal, 2017
DATA LINKING
• Creation of links in the form of RDF triples (subject, predicate, object)
• Within the same dataset (i.e. generating new connections between resources of the
same dataset or knowledge graph)
• Across different datasets (i.e. creating RDF links, as named in the Linked Data world)
• Note:
• In the literature, data linking often means finding equivalent resources (similarly to record
linkage in database research), i.e. triples with a correspondence/match predicate (e.g.
owl:sameAs) → in the following, data linking is intended in its broader meaning (i.e. links
with any predicate)
DATA LINKING: SOME DEFINITIONS
• Resources R is the set of all resources (and literals), whenever possible also described by the
respective types. More specifically: R = Rs ∪ Ro, where Rs is the set of resources that can take the
role of subject in a triple and Ro is the set of resources that can take the role of object in a triple;
the two sets are not necessarily disjoint, i.e. it can happen that Rs ∩ Ro ≠ ∅.
• Predicates P is the set of all predicates, whenever possible also described by the respective domain
and range.
• Links L is the set of all links; since links are triples created between resources and predicates it is:
L ⊂ Rs × P × Ro; each link is defined as l = (rs,p,ro) ∈ L with rs ∈ Rs, p ∈ P, ro ∈ Ro.
L is usually smaller than the full Cartesian product of Rs, P, Ro, because in each link (rs,p,ro) it
must be true that rs ∈ domain(p) and ro ∈ range(p).
• Link scores σ is the score of a link, i.e. a value indicating the confidence on the truth value of the link;
usually σ ∈ [0,1]; each link l ∈ L can have an associated score.
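The definitions above can be sketched as a small data structure. This is only an illustrative sketch: the class names, the `is_admissible` helper and the example resources are assumptions made here, not part of the original slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Predicate:
    name: str
    domain_type: str  # type a subject must have, i.e. domain(p)
    range_type: str   # type an object must have, i.e. range(p)


@dataclass(frozen=True)
class Link:
    subject: str                   # rs in Rs
    predicate: Predicate           # p in P
    obj: str                       # ro in Ro
    score: Optional[float] = None  # sigma in [0, 1], if available


def is_admissible(link, resource_types):
    """Check that rs is in domain(p) and ro is in range(p)."""
    p = link.predicate
    return (p.domain_type in resource_types.get(link.subject, set())
            and p.range_type in resource_types.get(link.obj, set()))


# Hypothetical resources, each described by its types
resource_types = {
    "track:1": {"Track"},
    "style:jazz": {"Style"},
}
genre = Predicate("genre", domain_type="Track", range_type="Style")

good = Link("track:1", genre, "style:jazz", score=0.8)
bad = Link("style:jazz", genre, "track:1")
print(is_admissible(good, resource_types), is_admissible(bad, resource_types))
```

The domain/range check is what makes L smaller than the full Cartesian product: the swapped link is rejected even though both resources exist.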
CASES OF DATA LINKING
• Link creation: a link l is created: given R = Rs ∪ Ro and P, the link l = (rs,p,ro), with rs ∈ Rs,
p ∈ P, ro ∈ Ro is created and added to L
• e.g., music classification: assign one or more music styles to audio tracks by creating the link
(track,genre,style)
• Link ranking: given the set of links L, a score σ ∈ [0,1] is assigned to each link l. The score
represents the probability that the link is recognized as true. Links can be ordered on the basis of their
score σ, thus obtaining a ranking
• e.g., ranking photos depicting a specific person (an actor, a singer, a politician) to identify the
pictures in which the person is more recognizable or more clearly depicted
• Link validation: given the set of links L, a score σ ∈ [0,1] is assigned to each link l. The score
represents the actual truth value of the link. A threshold t ∈ [0,1] is set so that all links with score
σ ≥ t are considered true
• e.g., assessing the correct music style identification in audio tracks (music classification)
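Link ranking and link validation, as described above, differ only in how the scores are used. A minimal sketch, assuming scores are already available in a plain dictionary (the function names and the example links are hypothetical):

```python
def rank_links(scores):
    """Link ranking: order links by decreasing score sigma."""
    return sorted(scores, key=scores.get, reverse=True)


def validate_links(scores, t):
    """Link validation: keep the links whose score satisfies sigma >= t."""
    return {link for link, sigma in scores.items() if sigma >= t}


# Hypothetical (subject, predicate, object) links with their scores
scores = {
    ("track:1", "genre", "style:jazz"): 0.9,
    ("track:1", "genre", "style:rock"): 0.2,
    ("track:2", "genre", "style:rock"): 0.6,
}

ranking = rank_links(scores)
accepted = validate_links(scores, t=0.6)
print(ranking[0], len(accepted))
```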
3. HUMAN COMPUTATION &
GAMES WITH A PURPOSE
What goals can humans help machines to achieve? How to involve a crowd of persons?
What extrinsic rewards (money, prizes, etc.) or intrinsic incentives can we adopt to
motivate people?
HUMAN COMPUTATION
• Human Computation is a computer science technique in which a computational process
is performed by outsourcing certain steps to humans. Unlike traditional computation,
in which a human delegates a task to a computer, in Human Computation the computer
asks a person or a large group of people to solve a problem; then it collects, interprets
and integrates their solutions
• The original concept of Human Computation by its inventor Luis von Ahn derived from the
common sense observation that people are intrinsically very good at solving some
kinds of tasks which are, on the other hand, very hard to address for a computer;
this is the case of a number of targets of Artificial Intelligence (like image recognition or
natural language understanding) for which research is still open
Edith Law and Luis von Ahn. Human computation.
Synthesis Lectures on Artificial Intelligence and Machine Learning, 2011
HUMAN COMPUTATION
Problem: an Artificial Intelligence
algorithm is unable to achieve an
adequate result with a satisfactory
level of confidence
Solution: ask people to intervene
when the AI system fails, “masking”
the task within another human
process
Example: https://www.google.com/recaptcha/
CROWDSOURCING
• Crowdsourcing is the process of outsourcing tasks to a “crowd” of distributed people.
The possibility of exploiting the Internet as a vehicle to recruit contributors and to assign
tasks led to the rise of micro-work platforms, thus often (but not always) implying a
monetary reward. The term Crowdsourcing, although quite recent, is used to indicate a
wide range of practices; however, the most common meaning of Crowdsourcing implies
that the “crowd” of workers involved in the solution of tasks is different from the traditional
or intended groups of task solvers
Jeff Howe. Crowdsourcing: How the power of the crowd
is driving the future of business. Random House, 2008
CROWDSOURCING
Problem: a company needs to
execute a lot of simple tasks,
but cannot afford hiring a
person to do that job
Solution: pack tasks in
bunches (human intelligence
tasks or HITs) and outsource
them to a very cheap workforce
through an online platform
Example: https://www.mturk.com/
CITIZEN SCIENCE
• Citizen Science is the involvement of volunteers to collect or process data as part of
a scientific or research experiment; those volunteers can be the scientists and
researchers themselves, but more often the name of this discipline “implies a form of
science developed and enacted by citizens” including those “outside of formal scientific
institutions”, thus representing a form of public participation in science. Formally, Citizen
Science has been defined as “the systematic collection and analysis of data; development
of technology; testing of natural phenomena; and the dissemination of these activities by
researchers on a primarily avocational basis”.
Alan Irwin. Citizen science: A study of people, expertise
and sustainable development. Psychology Press, 1995
CITIZEN SCIENCE
Example: https://www.zooniverse.org/
Problem: a scientific
experiment requires the
execution of a lot of simple
tasks, but researchers are busy
Solution: engage the general
audience in solving those tasks,
explaining that they are
contributing to science,
research and the public good
SPOT THE DIFFERENCE…
• Similarities:
• Involvement of people
• No automatic replacement
• Variations:
• Motivation
• Reward (glory, money, passion/need)
• Hybrids or parallel!
Citizen Science
Crowdsourcing
Human
Computation
GAMES WITH A PURPOSE
• A GWAP lets you outsource some steps of a computational process to humans in an
entertaining way
• The application has a “collateral effect”, because players’ actions are exploited to
solve a hidden task
• The application *IS* a fully-fledged game (as opposed to gamification, which is the use
of game-like features in non-gaming environments)
• The players are (usually) unaware of the hidden purpose; they simply meet game
challenges
Luis Von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006
Luis Von Ahn and Laura Dabbish. Designing games with a purpose.
Communications of the ACM, 51(8):58–67, 2008
GAMES WITH A PURPOSE (GWAP)
Problem: the same as in
Human Computation (ask
humans when AI fails)
Solution: hide the
task within a game, so that
users are motivated by game
challenges, often remaining
unaware of the hidden purpose;
the task solution comes from
agreement between players
4. GWAPS FOR DATA LINKING
Can we embed data linking tasks within Games with a Purpose?
• Input: set of all links
<asset>
foaf:depiction
<photo>
• Goal: assign score 𝜎 to
rank links on their
recognisability/representativeness
• The score 𝜎 is a function of
𝑋/𝑁, where 𝑋 is the no. of
successes (=recognitions)
and 𝑁 the no. of trials of
the Bernoulli process
(guess or not guess)
realized by the game
• Cultural heritage assets in Milano and their pictures
LINK RANKING
http://bit.ly/indomilando
Pure GWAP with
hidden purpose
Points, badges,
leaderboard as
intrinsic reward
Link ranking is a result
of the “agreement”
between players
But also an
educational
“collateral effect”
Irene Celino, Andrea Fiano, Riccardo Fino. Analysis of a Cultural Heritage Game with a Purpose
with an Educational Incentive. 16th International Conference on Web Engineering, 2016
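The link ranking score on this slide depends on the ratio of recognitions to trials. The sketch below takes the simplest possible choice, the plain ratio X/N; the slide only says the score is a function of that ratio, so the exact function used in the game is an assumption here, as are the photo identifiers and counts.

```python
def recognition_score(successes, trials):
    """Estimate sigma as X/N; a photo never shown gets score 0."""
    if trials == 0:
        return 0.0
    return successes / trials


# Hypothetical counts per photo: (X = recognitions, N = times shown)
observations = {
    "photo:a": (8, 10),
    "photo:b": (3, 10),
    "photo:c": (0, 0),
}

scores = {photo: recognition_score(x, n) for photo, (x, n) in observations.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With few trials the plain ratio is noisy, so a real deployment would probably also require a minimum number of trials before trusting a score.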
• Input: set of links
<land-area>
clc:hasLandCover
<land-cover>
• Goal: assign score 𝜎 to
each link to discover the
“right” land cover class
• Score 𝜎 of each link is
updated on the basis of
players’ choices
(incremented if link
selected, decremented if
link not selected)
• When the score of a link
reaches the threshold
(𝜎 ≥ 𝑡), the link is considered
“true” (and removed from
the game)
• Two automatic classifications in disagreement:
<land-cover-assigned-by-DUSAF> ≠ <land-cover-assigned-by-GL30>
LINK VALIDATION
https://youtu.be/Q0ru1hhDM9Q
http://bit.ly/foss4game
Pure GWAP with
not-so-hidden purpose
(played by “experts”)
Points, badges,
leaderboard as
intrinsic reward
A player scores if he/she
guesses one of the two
disagreeing classifications
Link validation is a result
of the “agreement”
between players
Maria Antonia Brovelli, Irene Celino, Andrea Fiano, Monia Elisa Molinari, Vijaycharan Venkatachalam.
A crowdsourcing-based game for land cover validation. Applied Geomatics, 2017
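The score update described on this slide (increment when a link is selected, decrement otherwise, stop at the threshold) can be sketched as follows. The step size, the threshold value, the clamping to [0, 1] and the link identifiers are all assumptions chosen for illustration; the slides do not give the actual values.

```python
def play_round(scores, shown, selected, step=0.25, threshold=0.75):
    """Update link scores after one player's choice.

    Returns the links that reached the threshold in this round;
    those links are removed from the game.
    """
    validated = []
    for link in shown:
        delta = step if link == selected else -step
        # Keep sigma within [0, 1]
        scores[link] = min(1.0, max(0.0, scores[link] + delta))
        if scores[link] >= threshold:
            validated.append(link)
    for link in validated:
        del scores[link]
    return validated


# Two disagreeing automatic classifications for the same (hypothetical) area
dusaf = ("area:1", "clc:hasLandCover", "class:from-DUSAF")
gl30 = ("area:1", "clc:hasLandCover", "class:from-GL30")
scores = {dusaf: 0.0, gl30: 0.0}

results = []
for _ in range(3):  # three players all pick the DUSAF class
    results.append(play_round(scores, shown=[dusaf, gl30], selected=dusaf))
print(results[-1])
```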
• Input: set of subject
resources (pictures) and
object resources
(classification categories)
• Goal: create links
<picture> hasCategory
<category> and assign
score 𝜎 to each link
• Score 𝜎 of each link is
updated on the basis of
players’ choices
(incremented if link
selected)
• When the score of a link
reaches the threshold
(𝜎 ≥ 𝑡), the link is considered
“true” (and the picture is
removed from the game)
• Identify pictures of cities from above among those taken on board the ISS (the pictures are
then used in a scientific process in light pollution research)
LINK COLLECTION & VALIDATION
http://nightknights.eu
Pure GWAP with
not-so-hidden purpose
(but played by anybody)
Points, badges,
leaderboard as
intrinsic reward
A player scores if he/she
agrees with another player
“Bonus” intrinsic reward
with NASA pictures!
Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning:
an Experimental Comparison for Image Classification. Human Computation Journal, 2018.
5. TRUTH INFERENCE &
OPEN SCIENCE
How do we aggregate the contributions from the crowd?
Are individual contributions of any value?
AGGREGATION OF CONTRIBUTIONS
• The same task is usually given to multiple human contributors (named workers in crowdsourcing)
• Results on the same task are then aggregated across different contributors (“wisdom of crowds”)
• How to perform the truth inference process?
• Simplistic solution: majority voting across all contributors
• But… are all contributors “created equal”? No! Less simplistic solutions:
• Majority voting across “quality” contributors (filtering out “spammers”)
• Weighted majority voting with estimation of contributors’ “reliability”
• Expectation maximization
• Message passing… and a lot more!
• How to compute contributor reliability?
• Assessment tasks (gold standard) with known solution to measure reliability
• History of contributions/past behaviours to compute a “reputation” value
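To make the difference between the aggregation options above concrete, here is a minimal sketch of plain majority voting, reliability measured on gold-standard assessment tasks, and reliability-weighted voting. The function names, contributors and answers are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """answers: {contributor: label} for a single task."""
    return Counter(answers.values()).most_common(1)[0][0]


def reliability_from_gold(gold_answers, gold_truth):
    """Fraction of assessment tasks each contributor answered correctly."""
    reliability = {}
    for contributor, given in gold_answers.items():
        correct = sum(given[task] == gold_truth[task] for task in given)
        reliability[contributor] = correct / len(given)
    return reliability


def weighted_vote(answers, reliability):
    """Each vote counts as much as the contributor's reliability."""
    totals = defaultdict(float)
    for contributor, label in answers.items():
        totals[label] += reliability.get(contributor, 0.0)
    return max(totals, key=totals.get)


# Assessment tasks with known solutions
gold_truth = {"g1": "yes", "g2": "no"}
gold_answers = {
    "ann": {"g1": "yes", "g2": "no"},
    "bob": {"g1": "no", "g2": "yes"},
    "cat": {"g1": "no", "g2": "no"},
}
reliability = reliability_from_gold(gold_answers, gold_truth)

# A real task with unknown solution
answers = {"ann": "yes", "bob": "no", "cat": "no"}
print(majority_vote(answers), weighted_vote(answers, reliability))
```

In this toy case the two strategies disagree: the majority says "no", but the only contributor who passed both assessment tasks says "yes", and the weighted vote follows her.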
TRUTH INFERENCE GENERIC ALGORITHM
Yudian Zheng, Guoliang Li, Yuanbing Li, Caihua Shan, Reynold Cheng.
Truth Inference in Crowdsourcing: Is the Problem Solved? VLDB 2017
Input: contributions
Step 1: compute an
estimation of the truth
(e.g. majority voting)
Step 2: compute an estimation
of contributor reliability (e.g.
precision on truth estimation)
Iterate until convergence (e.g.
until some difference w.r.t.
previous step is really small)
Output: truth and reliability
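The generic two-step loop can be sketched as below. This is one simple instance under stated assumptions, not the specific algorithms surveyed in the cited paper: the truth is estimated by reliability-weighted voting, reliability is the fraction of a contributor's answers that match the current truth estimate, and convergence means the truth estimate stopped changing. All names and data are hypothetical.

```python
from collections import defaultdict


def infer_truth(contributions, max_iter=20):
    """contributions: list of (contributor, task, label) tuples."""
    contributors = {c for c, _, _ in contributions}
    reliability = {c: 1.0 for c in contributors}  # start with equal trust
    truth = {}
    for _ in range(max_iter):
        # Step 1: estimate the truth with reliability-weighted voting
        # (with equal weights this is plain majority voting)
        votes = defaultdict(lambda: defaultdict(float))
        for c, task, label in contributions:
            votes[task][label] += reliability[c]
        new_truth = {
            task: max(sorted(labels), key=labels.get)  # ties: alphabetical
            for task, labels in votes.items()
        }
        # Step 2: estimate reliability as agreement with the current truth
        hits, totals = defaultdict(int), defaultdict(int)
        for c, task, label in contributions:
            totals[c] += 1
            hits[c] += int(label == new_truth[task])
        reliability = {c: hits[c] / totals[c] for c in contributors}
        # Stop when the truth estimate no longer changes
        if new_truth == truth:
            break
        truth = new_truth
    return truth, reliability


contributions = [
    ("ann", "t1", "A"), ("ann", "t2", "B"), ("ann", "t3", "A"),
    ("bob", "t1", "A"), ("bob", "t2", "B"), ("bob", "t3", "B"),
    ("spam", "t1", "B"), ("spam", "t2", "A"), ("spam", "t3", "B"),
]
truth, reliability = infer_truth(contributions)
print(truth, reliability)
```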
OPEN SCIENCE: ENABLING COMPARE & CONTRAST
• Open Science has the aim to make scientific research and data accessible to all levels of society
• Repeatability and reproducibility are among the foundational principles of open science
• Human Computation aims at involving people in some step of the scientific process
• Human contributors generate data to solve assigned tasks
• Algorithms aggregate contributions in the truth inference process
• Can we compare different truth inference algorithms?
• Yes, if we make available the data of the Human Computation process!
• What can we share, e.g. in the case of data linking tasks?
• “True” and “false” links
• Confidence scores of the links
• Individual contributions and aggregation process
PROV-O AND HUMAN COMPUTATION ONTOLOGY
• Provenance is information about entities, activities, and people involved in producing
a piece of data or thing (used to assess its quality, reliability or trustworthiness)
• W3C defined the PROV-O ontology to capture provenance information
https://www.w3.org/TR/prov-o/
• The Human Computation ontology extends PROV-O
to describe the data shared within a Human Computation Process
http://swa.cefriel.it/ontologies/hc
• Data linking process information can be published
according to linked data principles described with the HC ontology
(e.g. data from the Urbanopoly GWAP at http://swa.cefriel.it/linkeddata/)
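[Diagram: the Human Computation ontology as an extension of PROV-O. Classes shown:
Contributor, Contribution, Human Computation Task, Consolidated Information and
Human Computation Algorithm, related to provo:Agent, provo:Entity and provo:Activity.
Properties shown: aggregatedFrom, aggregatedBy, contributionFrom, solutionTo, solvedBy, enabledBy.]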
Irene Celino. Human Computation VGI Provenance: Semantic Web-based Representation and Publishing.
IEEE Transactions on Geoscience and Remote Sensing, 2013
6. GUIDELINES
Is it that easy to involve people on the Web?
What should we take care of when designing a human computation system?
MICE AND MEN (OR: KEEP IT SIMPLE)
• Crowdsourcing workers behave like mice
• Mice prefer to use their motor skills (biologically cheap, e.g. pressing a lever to get food) rather
than their cognitive skills (biologically expensive, e.g. going through a labyrinth to get food)
• Workers prefer/are better at simple tasks (e.g. those that can be solved at first sight) and
discard/are worse at more complex tasks (e.g. those that require logical reasoning)
• Crowdsourcing tasks should be carefully designed
• Tasks as simple as possible for the workers to solve
• Complex tasks together with other incentives (e.g. variety/novelty)
Panos Ipeirotis. On Mice and Men: The Role of Biology in Crowdsourcing,
Keynote talk at Collective Intelligence, 2012.
DIVIDE ET IMPERA (OR: FIND-FIX-VERIFY)
• Find-Fix-Verify crowd programming pattern
• A long and “expensive” task…
• Summarize a text to shorten its total length
• …is decomposed into more atomic tasks…
1. find sentences that need to be shortened
2. fix a sentence by shortening it
3. verify which summarized sentence maintains original meaning
• …and the complex task is turned into a workflow of simple
tasks, and each step is outsourced to a crowd
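The decomposition above can be read as a three-stage pipeline. In the sketch below each stage is a plain callable standing in for a crowd; the stub functions are simulated placeholders invented for illustration and do not reflect how Soylent actually recruits or queries workers.

```python
from collections import Counter


def find_fix_verify(sentences, find, fix, verify):
    """Run the three stages over a list of sentences."""
    result = []
    for sentence in sentences:
        # Find: does this sentence need to be shortened?
        if not find(sentence):
            result.append(sentence)
            continue
        # Fix: several workers propose shortened versions
        candidates = fix(sentence)
        # Verify: other workers vote for the version that keeps the meaning
        votes = Counter(verify(sentence, candidates))
        result.append(votes.most_common(1)[0][0])
    return result


# Simulated crowds (hypothetical stand-ins for real workers)
def find_stub(sentence):
    return len(sentence.split()) > 5


def fix_stub(sentence):
    words = sentence.split()
    return [" ".join(words[:5]), " ".join(words[:3])]


def verify_stub(sentence, candidates):
    best = max(candidates, key=len)
    return [best, best, candidates[-1]]  # two of three voters agree


text = ["Short one.", "This sentence is really far too long to keep."]
shortened = find_fix_verify(text, find_stub, fix_stub, verify_stub)
print(shortened)
```

Keeping the stages as separate callables mirrors the point of the pattern: each step is small enough to hand to a different group of workers.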
M. Bernstein, G. Little, R. Miller, B. Hartmann, M. Ackerman, D. Karger, D. Crowell, K. Panovich.
Soylent: A Word Processor with a Crowd Inside, UIST Proceedings, 2010.
COMPARE AND CONTRAST
• A sort of “wisdom of the crowd(sourcing methods)”:
(1) apply different approaches to solve the same problem
and (2) compare results
• Which is the best approach
for a specific use case?
• Which is the most suitable crowd?
• Is human computation better/faster/cheaper
than machine computation?
• Knowledge Graph Refinement: use Human Computation
to “crowdsource” a gold standard and then use it to train
some statistical/machine learning algorithm
[Diagram: four workflows, each leading from an input task to an output solution, that
combine Human Computation and Machine Computation steps in different arrangements.]
Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning:
an Experimental Comparison for Image Classification. Human Computation Journal, 2018.
FINAL NOTE ON DISAGREEMENT
• Is there always a “right answer”? Or is there a “crowd truth”?
• Not always true/false, because of human subjectivity,
ambiguity and uncertainty
• Disagreement across contributors is not necessarily bad,
but a sign of: different opinions, interpretations, contexts,
perspectives, …
• Remember the long tail theory…
• …and ask yourself who your users are
and whom you want to involve
Lora Aroyo, Chris Welty. Truth is a Lie: 7 Myths about Human Annotation. AI Magazine 2014.
7. INDIRECT PEOPLE INVOLVEMENT
Are there indirect ways to involve humans in data processing?
HUMANS AS A SOURCE OF INFORMATION
• People are not only task executors, they are also information providers!
• Opportunistic sensing
• Voluntary or involuntary digital traces of human-related activities
• e.g., phone call logs, GPS traces, social media activities
• Open content and cooperative knowledge
• Data explicitly provided by people can “hide” further information
• e.g., logs of wiki editing, statistical distribution of contributions
FROM POI INFORMATION AND PHONE CALL LOGS TO LAND USE
• General topic: exploit “low-cost” information about a geographic area as features to
train a predictive model that outputs “expensive” information about the same area
• “Inexpensive” input information:
• Geo-information about points of interest
• Mobile traffic data processed using different time series techniques –
smoothing, decomposition, filtering, time-windowing
• “Expensive” output information:
• Land use characterization (usually collected through long and expensive
workflows that mix machine processing and costly human labour)
Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Extracting Urban Land Use from Linked Open Geospatial Data. IJGI, 2015
Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Filtering and Windowing Mobile Traffic Time Series for Territorial Land Use Classification. COMCOM, 2016
FROM SPATIAL ANALYTICS TO GEO-ONTOLOGY ENGINEERING
• OpenStreetMap collects information about points of interest (POI)
• Spatial distribution and conglomeration of specific POIs can give hints
about the geographical space
• Re-engineering of spatial features through comparison between areas:
same POI type shows different distribution → evidence for different
semantics (e.g. what is a pub in Milano vs. London)
• Semantic specification of spatial neighbourhoods:
• Emerging neighbourhoods from spatial clustering of POIs (opposed
to administrative divisions)
• Spatial version of tf-idf to compare between different areas (e.g.
central or peripheral areas in different cities) and to characterise
neighbourhoods (e.g. shopping district)
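One way to read the "spatial version of tf-idf" is to treat each area as a document and each POI type as a term. The slides do not give the formulation used in the cited paper, so the sketch below simply applies textbook tf-idf under that assumption; the area names and POI lists are made up.

```python
import math
from collections import Counter


def spatial_tf_idf(poi_by_area):
    """poi_by_area: {area: [poi_type, ...]} -> {area: {poi_type: weight}}."""
    n_areas = len(poi_by_area)
    # In how many areas does each POI type occur at least once?
    areas_with_type = Counter()
    for pois in poi_by_area.values():
        areas_with_type.update(set(pois))
    weights = {}
    for area, pois in poi_by_area.items():
        counts = Counter(pois)
        weights[area] = {
            poi_type: (count / len(pois))
            * math.log(n_areas / areas_with_type[poi_type])
            for poi_type, count in counts.items()
        }
    return weights


poi_by_area = {
    "centre": ["shop", "shop", "shop", "pub"],
    "suburb": ["pub", "school"],
    "campus": ["school", "pub", "library"],
}
weights = spatial_tf_idf(poi_by_area)
print(max(weights["centre"], key=weights["centre"].get))
```

A POI type found everywhere (here, pubs) gets weight zero, while a type concentrated in one area (shops in the centre) stands out, which is the kind of signal that could characterise a shopping district.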
Gloria Re Calegari, Emanuela Carlino, Irene Celino, Diego Peroni. Supporting Geo-Ontology
Engineering through Spatial Data Analytics. 13th Extended Semantic Web Conference, 2016
MILANO
viale Sarca 226,
20126,
Milano - Italy
LONDON
4th floor
57 Rathbone Place
London W1T 1JU – UK
NEW YORK
One Liberty Plaza,
165 Broadway, 23rd Floor,
New York City, New York, 10006 USA
Cefriel.com
Thanks for your attention!
Any questions?
Irene Celino
Knowledge Technologies
Digital Interaction Division
irene.celino@cefriel.com

Thwart Fraud Using Graph-Enhanced Machine Learning and AIThwart Fraud Using Graph-Enhanced Machine Learning and AI
Thwart Fraud Using Graph-Enhanced Machine Learning and AINeo4j
 
The Unreasonable Effectiveness of Metadata
The Unreasonable Effectiveness of MetadataThe Unreasonable Effectiveness of Metadata
The Unreasonable Effectiveness of MetadataJames Hendler
 

Human Computation

  • 1. HUMAN COMPUTATION. Irene Celino (irene.celino@cefriel.com), Cefriel, Viale Sarca 226, 20126 Milano. Seminar @ Data Semantics course, April 11th, 2018.
  • 2. 1. Introduction; 2. Linked Data and Knowledge Graph Refinement; 3. Human Computation and Games with a Purpose; 4. Examples of GWAP for Data Linking; 5. Truth Inference and Open Science; 6. Guidelines; 7. Indirect People Involvement. Copyright © 2018 Cefriel – All rights reserved.
  • 3. 1. INTRODUCTION. Is the Web a purely technological artefact? What role can people play on the Web?
  • 4. WEB AS A SOCIAL ARTEFACT. “The Web isn’t about what you can do with computers. It’s people and, yes, they are connected by computers. But computer science, as the study of what happens in a computer, doesn’t tell you about what happens on the Web” (Sir Tim Berners-Lee)
  • 5. OPEN EVERYTHING
      • Open Source Software: “Given enough eyeballs, all bugs are shallow.” Eric S. Raymond (The Cathedral and the Bazaar)
      • Open Content: “It is easy when you skip the intermediaries”, original motto of Creative Commons
      • Open Data: “Raw. Data. Now.” Tim Berners-Lee (The year open data went worldwide, TED Talk)
  • 6. COOPERATION ON THE WEB TO PRODUCE OPEN KNOWLEDGE
  • 7. WISDOM OF CROWDS
      • “Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations”
      • Criteria for a wise crowd:
          • Diversity of opinion (importance of interpretation)
          • Independence (not a “single mind”)
          • Decentralization (importance of local knowledge)
          • Aggregation (aim to get a collective decision)
      • There are also failures/risks in crowd decisions: homogeneity, centralization, division, imitation, emotionality
      • Reference: James Surowiecki. The Wisdom of Crowds. Anchor, 2005
  • 8. 2. LINKED DATA & KNOWLEDGE GRAPH REFINEMENT. Do we need to involve people in Semantic Web systems? What semantic data management tasks can we effectively “outsource” to humans?
  • 9. HUMANS IN THE SEMANTIC WEB
      • Semantic Web tasks are knowledge-intensive and/or context-specific: e.g., conceptual modelling, multi-language resource labelling, content annotation with ontologies, concept/entity similarity recognition, …
      • Users need to be engaged and involved in executing tasks: e.g., wikis for semantic content authoring, folksonomies to bootstrap formal ontologies, instance creation by data entry, …
  • 10. SEMANTIC WEB TASKS (ALSO) FOR HUMANS
      • [Figure. Levels: fact level, schema level. Activities: collection, creation, correction, validation, filtering, ranking, linking. Tasks: conceptual modelling, ontology population, quality assessment, ontology re-engineering, ontology pruning, ontology elicitation, knowledge acquisition, ontology repair, knowledge base update, data search/selection, link generation, ontology alignment, ontology matching.]
  • 11. AUTOMATIC METHODS IN THE SEMANTIC WEB?
      • Knowledge Graph Refinement (and, in general, linked dataset refinement) is an emerging and hot topic; it aims to (1) identify and correct errors and (2) add missing knowledge
          • e.g., completing type assertions via classification, predicting relations from textual sources, finding erroneous type assertions, identifying erroneous literal values through anomaly/outlier detection, …
      • Statistical and machine learning approaches require some partial gold standard, i.e. a “ground truth” dataset to train automatic models
      • Ground truth is usually put together manually by experts
      • Sourcing a gold standard from humans is expensive!
      • Reference: Heiko Paulheim. Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web Journal, 2017
  • 12. DATA LINKING
      • Creation of links in the form of RDF triples (subject, predicate, object):
          • Within the same dataset (i.e. generating new connections between resources of the same dataset or knowledge graph)
          • Across different datasets (i.e. creating RDF links, as they are named in the Linked Data world)
      • Note: in the literature, data linking often means finding equivalent resources (similarly to record linkage in database research), i.e. triples with a correspondence/match predicate (e.g. owl:sameAs). In the following, data linking is intended in its broader meaning (i.e. links with any predicate)
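The two kinds of link on this slide can be illustrated with a minimal Python sketch. It is not part of the original slides: the URIs are hypothetical, and treating the URI host as a stand-in for "the dataset" is a simplifying assumption made only for this example.

```python
from typing import NamedTuple
from urllib.parse import urlparse

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class Link(NamedTuple):
    """An RDF-style triple (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example URIs, invented for illustration.
links = [
    # Link within the same dataset, with an arbitrary predicate.
    Link("http://example.org/music/track/42",
         "http://example.org/music/genre",
         "http://example.org/music/style/jazz"),
    # Link across datasets, using the equivalence predicate owl:sameAs.
    Link("http://example.org/music/artist/7",
         OWL_SAME_AS,
         "http://other.example.net/people/miles"),
]


def dataset_of(uri: str) -> str:
    """Assumption: the URI host identifies the dataset."""
    return urlparse(uri).netloc


def crosses_datasets(link: Link) -> bool:
    """True when subject and object belong to different datasets."""
    return dataset_of(link.subject) != dataset_of(link.obj)


def is_equivalence(link: Link) -> bool:
    """True for the narrower, record-linkage-like notion of data linking."""
    return link.predicate == OWL_SAME_AS
```

In the broader meaning adopted by the slides, both entries of `links` count as data linking; only the second one matches the narrower equivalence-based definition.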
  • 13. DATA LINKING: SOME DEFINITIONS
      • Resources: R is the set of all resources (and literals), whenever possible also described by the respective types. More specifically, R = Rs ∪ Ro, where Rs is the set of resources that can take the role of subject in a triple and Ro is the set of resources that can take the role of object in a triple; the two sets are not necessarily disjoint, i.e. it can happen that Rs ∩ Ro ≠ ∅
      • Predicates: P is the set of all predicates, whenever possible also described by the respective domain and range
      • Links: L is the set of all links; since links are triples created between resources and predicates, L ⊂ Rs × P × Ro; each link is defined as l = (rs,p,ro) ∈ L with rs ∈ Rs, p ∈ P, ro ∈ Ro. L is usually smaller than the full Cartesian product of Rs, P, Ro, because in each link (rs,p,ro) it must be true that rs ∈ domain(p) and ro ∈ range(p)
      • Link scores: σ is the score of a link, i.e. a value indicating the confidence in the truth value of the link; usually σ ∈ [0,1]; each link l ∈ L can have an associated score
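The definitions above can be sketched in Python as follows. This is an illustrative toy model, not code from the slides: the resource names, types and the two predicates are hypothetical, and domain/range membership is approximated by comparing a single type per resource.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Predicate:
    name: str
    domain: str   # type required for the subject
    range_: str   # type required for the object


# Hypothetical resources with their types (the set R).
resource_types = {
    "track1": "Track", "track2": "Track",
    "jazz": "Style", "rock": "Style",
    "miles": "Artist",
}

# Hypothetical predicates with domain and range (the set P).
P = [
    Predicate("genre", domain="Track", range_="Style"),
    Predicate("performedBy", domain="Track", range_="Artist"),
]

# Rs / Ro: resources that can act as subject / object of some predicate.
Rs = {r for r, t in resource_types.items() if any(p.domain == t for p in P)}
Ro = {r for r, t in resource_types.items() if any(p.range_ == t for p in P)}


def admissible(rs: str, p: Predicate, ro: str) -> bool:
    """Check rs ∈ domain(p) and ro ∈ range(p)."""
    return resource_types[rs] == p.domain and resource_types[ro] == p.range_


# Full Cartesian product Rs × P × Ro versus the links that respect domain/range.
full_product = list(product(Rs, P, Ro))
admissible_links = [(rs, p, ro) for rs, p, ro in full_product if admissible(rs, p, ro)]

# Link scores σ ∈ [0,1]; the values are made up for illustration.
sigma = {("track1", P[0], "jazz"): 0.9, ("track2", P[0], "rock"): 0.4}
```

With these toy sets the full product has 12 triples, while only 6 satisfy the domain and range constraints, which shows why L is usually a strict subset of Rs × P × Ro.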
  • 14. CASES OF DATA LINKING • Link creation: a link l is created: given R = Rs ∪ Ro and P, the link l = (rs,p,ro), with rs ∈ Rs, p ∈ P, ro ∈ Ro is created and added to L • e.g., music classification: assign one or more music styles to audio tracks by creating the link (track,genre,style) • Link ranking: given the set of links L, a score σ ∈ [0,1] is assigned to each link l. The score represents the probability of the link to be recognized as true. Links can be ordered on the basis of their score σ, thus obtaining a ranking • e.g., ranking photos depicting a specific person (an actor, a singer, a politician) to identify the pictures in which the person is more recognizable or more clearly depicted • Link validation: given the set of links L, a score σ ∈ [0,1] is assigned to each link l. The score represents the actual truth value of the link. A threshold t ∈ [0,1] is set so that all links with score σ ≥ t are considered true • e.g., assessing the correct music style identification in audio tracks (music classification)
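Link ranking and link validation can be sketched in a few lines once each link has a score. The dictionary layout, the function names and the example scores below are assumptions made for illustration only.

```python
def rank_links(scores):
    """Link ranking: order links by decreasing score sigma."""
    return sorted(scores, key=scores.get, reverse=True)


def validate_links(scores, t):
    """Link validation: keep the links whose score reaches the threshold t."""
    return {link for link, sigma in scores.items() if sigma >= t}


# Hypothetical scores for (track, genre, style) links.
scores = {
    ("track1", "genre", "rock"): 0.9,
    ("track1", "genre", "jazz"): 0.2,
    ("track2", "genre", "pop"): 0.6,
}

ranking = rank_links(scores)
true_links = validate_links(scores, t=0.6)
```

With these example values the ranking is rock, pop, jazz, and the links considered true at t = 0.6 are the rock and pop ones.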
  • 15. 3. HUMAN COMPUTATION & GAMES WITH A PURPOSE • What goals can humans help machines to achieve? • How to involve a crowd of persons? • What extrinsic rewards (money, prizes, etc.) or intrinsic incentives can we adopt to motivate people?
  • 16. HUMAN COMPUTATION • Human Computation is a computer science technique in which a computational process is performed by outsourcing certain steps to humans. Unlike traditional computation, in which a human delegates a task to a computer, in Human Computation the computer asks a person or a large group of people to solve a problem; then it collects, interprets and integrates their solutions • The original concept of Human Computation by its inventor Luis von Ahn derived from the common sense observation that people are intrinsically very good at solving some kinds of tasks which are, on the other hand, very hard to address for a computer; this is the case of a number of targets of Artificial Intelligence (like image recognition or natural language understanding) for which research is still open Edith Law and Luis von Ahn. Human computation. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2011
  • 17. HUMAN COMPUTATION • Problem: an Artificial Intelligence algorithm is unable to achieve an adequate result with a satisfactory level of confidence • Solution: ask people to intervene when the AI system fails, “masking” the task within another human process • Example: https://www.google.com/recaptcha/
  • 18. CROWDSOURCING • Crowdsourcing is the process of outsourcing tasks to a “crowd” of distributed people. The possibility to exploit the Internet as a vehicle to recruit contributors and to assign tasks led to the rise of micro-work platforms, thus often (but not always) implying a monetary reward. The term Crowdsourcing, although quite recent, is used to indicate a wide range of practices; however, the most common meaning of Crowdsourcing implies that the “crowd” of workers involved in the solution of tasks is different from the traditional or intended groups of task solvers Jeff Howe. Crowdsourcing: How the power of the crowd is driving the future of business. Random House, 2008
  • 19. CROWDSOURCING • Problem: a company needs to execute a lot of simple tasks, but cannot afford hiring a person to do that job • Solution: pack tasks in bunches (human intelligence tasks or HITs) and outsource them to a very cheap workforce through an online platform • Example: https://www.mturk.com/
  • 20. CITIZEN SCIENCE • Citizen Science is the involvement of volunteers to collect or process data as part of a scientific or research experiment; those volunteers can be the scientists and researchers themselves, but more often the name of this discipline “implies a form of science developed and enacted by citizens” including those “outside of formal scientific institutions”, thus representing a form of public participation in science. Formally, Citizen Science has been defined as “the systematic collection and analysis of data; development of technology; testing of natural phenomena; and the dissemination of these activities by researchers on a primarily avocational basis”. Alan Irwin. Citizen science: A study of people, expertise and sustainable development. Psychology Press, 1995
  • 21. CITIZEN SCIENCE • Problem: a scientific experiment requires the execution of a lot of simple tasks, but researchers are busy • Solution: engage the general audience in solving those tasks, explaining that they are contributing to science, research and the public good • Example: https://www.zooniverse.org/
  • 22. SPOT THE DIFFERENCE… • Similarities: • Involvement of people • No automatic replacement • Variations: • Motivation • Reward (glory, money, passion/need) • Hybrids or parallel! [Figure labels: Citizen Science, Crowdsourcing, Human Computation]
  • 23. GAMES WITH A PURPOSE • A GWAP makes it possible to outsource to humans some steps of a computational process in an entertaining way • The application has a “collateral effect”, because players’ actions are exploited to solve a hidden task • The application *IS* a fully-fledged game (as opposed to gamification, which is the use of game-like features in non-gaming environments) • The players are (usually) unaware of the hidden purpose; they simply meet game challenges Luis Von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006 Luis Von Ahn and Laura Dabbish. Designing games with a purpose. Communications of the ACM, 51(8):58–67, 2008
  • 24. GAMES WITH A PURPOSE (GWAP) • Problem: the same as in Human Computation (ask humans when AI fails) • Solution: hide the task within a game, so that users are motivated by game challenges, often remaining unaware of the hidden purpose; the task solution comes from agreement between players
  • 25. 4. GWAPS FOR DATA LINKING • Can we embed data linking tasks within Games with a Purpose?
  • 26. LINK RANKING • Cultural heritage assets in Milano and their pictures • Input: set of all links <asset> foaf:depiction <photo> • Goal: assign score 𝜎 to rank links on their recognisability/representativeness • The score 𝜎 is a function of 𝑋/𝑁, where 𝑋 is the no. of successes (= recognitions) and 𝑁 the no. of trials of the Bernoulli process (guess or not guess) realized by the game • Pure GWAP with hidden purpose • Points, badges, leaderboard as intrinsic reward • Link ranking is a result of the “agreement” between players • But also an educational “collateral effect” • http://bit.ly/indomilando Irene Celino, Andrea Fiano, Riccardo Fino. Analysis of a Cultural Heritage Game with a Purpose with an Educational Incentive. 16th International Conference on Web Engineering, 2016
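The slide only states that 𝜎 is a function of 𝑋/𝑁, without giving the function. A minimal sketch, assuming two possible choices: the plain recognition rate, and the lower bound of the Wilson score interval, a standard statistical estimate that is swapped in here (it is not taken from the slides or the cited paper) to penalise links that were shown only a few times.

```python
import math


def recognition_rate(x, n):
    """Plain estimate of the Bernoulli success probability: X / N."""
    return x / n if n > 0 else 0.0


def wilson_lower_bound(x, n, z=1.96):
    """Lower bound of the Wilson score interval for X successes in N trials.

    z = 1.96 corresponds to a 95% confidence level (an assumed default).
    """
    if n == 0:
        return 0.0
    p = x / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


# Same recognition rate (0.8), different amount of evidence.
few_trials = wilson_lower_bound(8, 10)
many_trials = wilson_lower_bound(80, 100)
```

Both photos have the same ratio 𝑋/𝑁, but the one with more trials gets the higher conservative score, so it would be ranked first.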
  • 27. LINK VALIDATION • Two automatic classifications in disagreement: <land-cover-assigned-by-DUSAF> ≠ <land-cover-assigned-by-GL30> • Input: set of links <land-area> clc:hasLandCover <land-cover> • Goal: assign score 𝜎 to each link to discover the “right” land cover class • Score 𝜎 of each link is updated on the basis of players’ choices (incremented if link selected, decremented if link not selected) • When the score of a link reaches the threshold (𝜎 ≥ 𝑡), the link is considered “true” (and removed from the game) • Pure GWAP with not-so-hidden purpose (played by “experts”) • Points, badges, leaderboard as intrinsic reward • A player scores if he/she guesses one of the two disagreeing classifications • Link validation is a result of the “agreement” between players • https://youtu.be/Q0ru1hhDM9Q • http://bit.ly/foss4game Maria Antonia Brovelli, Irene Celino, Andrea Fiano, Monia Elisa Molinari, Vijaycharan Venkatachalam. A crowdsourcing-based game for land cover validation. Applied Geomatics, 2017
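The score update described above can be sketched as follows. The step size, the initial score, the threshold value and the function name `play_round` are assumptions for illustration; the slide only says that scores are incremented or decremented and compared with a threshold 𝑡.

```python
def play_round(scores, shown, selected, step=0.1, t=0.8):
    """Update link scores after one player choice.

    scores:   dict mapping each link still in the game to its score sigma
    shown:    links presented to the player in this round
    selected: the link the player picked among those shown
    Returns the links that reached the threshold and left the game.
    """
    validated = []
    for link in shown:
        delta = step if link == selected else -step
        # Clamp to [0, 1] and round to avoid floating-point drift.
        scores[link] = round(min(1.0, max(0.0, scores[link] + delta)), 6)
        if scores[link] >= t:
            validated.append(link)
    for link in validated:
        del scores[link]  # validated links are removed from the game
    return validated


# Hypothetical land area with two candidate land cover classes.
urban = ("area42", "clc:hasLandCover", "urban")
forest = ("area42", "clc:hasLandCover", "forest")
scores = {urban: 0.5, forest: 0.5}

validated = play_round(scores, [urban, forest], selected=urban, step=0.25, t=0.75)
```

After this single round the selected link reaches the threshold and is removed, while the competing link stays in the game with a lower score.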
  • 28. LINK COLLECTION & VALIDATION • Identify pictures of cities from above among those taken on board the ISS (the pictures are then used in a scientific process in light pollution research) • Input: set of subject resources (pictures) and object resources (classification categories) • Goal: create links <picture> hasCategory <category> and assign score 𝜎 to each link • Score 𝜎 of each link is updated on the basis of players’ choices (incremented if link selected) • When the score of a link reaches the threshold (𝜎 ≥ 𝑡), the link is considered “true” (and the picture is removed from the game) • Pure GWAP with not-so-hidden purpose (but played by anybody) • Points, badges, leaderboard as intrinsic reward • A player scores if he/she agrees with another player • “Bonus” intrinsic reward with NASA pictures! • http://nightknights.eu Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning: an Experimental Comparison for Image Classification. Human Computation Journal, 2018.
  • 29. 5. TRUTH INFERENCE & OPEN SCIENCE • How do we aggregate the contributions from the crowd? • Are individual contributions of any value?
  • 30. AGGREGATION OF CONTRIBUTIONS • The same task is usually given to multiple human contributors (named workers in crowdsourcing) • Results on the same task are then aggregated across different contributors (“wisdom of crowds”) • How to perform the truth inference process? • Simplistic solution: majority voting across all contributors • But… are all contributors “created equal”? No! Less simplistic solutions: • Majority voting across “quality” contributors (filtering out “spammers”) • Weighted majority voting with estimation of contributors’ “reliability” • Expectation maximization • Message passing… and a lot more! • How to compute contributor reliability? • Assessment tasks (gold standard) with known solution to measure reliability • History of contributions/past behaviours to compute a “reputation” value
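A minimal sketch of the first aggregation options listed above: plain majority voting, weighted majority voting, and reliability measured on gold-standard assessment tasks. Function names, data layout, the default reliability of 0.5 for unknown contributors and the example values are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """answers: list of (contributor, label) pairs for one task."""
    counts = Counter(label for _, label in answers)
    # Ties are resolved in favour of the label seen first.
    return counts.most_common(1)[0][0]


def weighted_majority_vote(answers, reliability, default=0.5):
    """Each vote counts as much as the contributor's estimated reliability."""
    totals = defaultdict(float)
    for contributor, label in answers:
        totals[label] += reliability.get(contributor, default)
    return max(totals, key=totals.get)


def reliability_from_gold(gold_answers, gold_truth):
    """Fraction of assessment tasks each contributor answered correctly.

    gold_answers: {contributor: {task: label}}
    gold_truth:   {task: label}
    """
    return {
        contributor: sum(gold_truth[t] == label for t, label in given.items()) / len(given)
        for contributor, given in gold_answers.items()
        if given
    }


answers = [("ann", "urban"), ("bob", "forest"), ("cat", "forest")]
reliability = {"ann": 0.95, "bob": 0.3, "cat": 0.4}

plain = majority_vote(answers)
weighted = weighted_majority_vote(answers, reliability)
```

In this example the two strategies disagree: two low-reliability contributors outvote one high-reliability contributor under plain majority voting, but not once the votes are weighted.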
  • 31. TRUTH INFERENCE GENERIC ALGORITHM • Input: contributions • Step 1: compute an estimation of the truth (e.g. majority voting) • Step 2: compute an estimation of contributor reliability (e.g. precision on truth estimation) • Iterate until convergence (e.g. until some difference w.r.t. previous step is really small) • Output: truth and reliability Yudian Zheng, Guoliang Li, Yuanbing Li, Caihua Shan, Reynold Cheng. Truth Inference in Crowdsourcing: Is the Problem Solved? VLDB 2017
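The two-step loop can be sketched as below. This is one simple instance of the generic scheme, not any specific algorithm from the cited survey: step 1 is assumed to be a reliability-weighted vote (equal to majority voting in the first iteration), step 2 is assumed to be the fraction of a contributor's answers that match the current truth estimate, and the names and data are hypothetical.

```python
from collections import defaultdict


def infer_truth(contributions, max_iter=100, eps=1e-6):
    """contributions: list of (contributor, task, label) triples.

    Returns (truth, reliability) as two dictionaries.
    """
    # Start with equal reliability, so the first step 1 is plain majority voting.
    reliability = {c: 1.0 for c, _, _ in contributions}
    truth = {}
    for _ in range(max_iter):
        # Step 1: estimate the truth with a reliability-weighted vote.
        votes = defaultdict(lambda: defaultdict(float))
        for c, task, label in contributions:
            votes[task][label] += reliability[c]
        truth = {task: max(labels, key=labels.get) for task, labels in votes.items()}

        # Step 2: estimate reliability as agreement with the current truth.
        hits, total = defaultdict(int), defaultdict(int)
        for c, task, label in contributions:
            total[c] += 1
            hits[c] += int(truth[task] == label)
        updated = {c: hits[c] / total[c] for c in total}

        # Iterate until the reliability estimates stop changing.
        change = max(abs(updated[c] - reliability[c]) for c in updated)
        reliability = updated
        if change < eps:
            break
    return truth, reliability


contributions = [
    ("ann", "t1", "a"), ("ann", "t2", "b"), ("ann", "t3", "c"),
    ("bob", "t1", "a"), ("bob", "t2", "b"), ("bob", "t3", "c"),
    ("spam", "t1", "x"), ("spam", "t2", "x"), ("spam", "t3", "c"),
]
truth, reliability = infer_truth(contributions)
```

On this toy input the loop converges after two iterations, with the contributor who answers "x" almost everywhere ending up with a low reliability.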
  • 32. OPEN SCIENCE: ENABLING COMPARE & CONTRAST • Open Science aims to make scientific research and data accessible to all levels of society • Repeatability and reproducibility are among the foundational principles of open science • Human Computation aims at involving people in some steps of the scientific process • Human contributors generate data to solve assigned tasks • Algorithms aggregate contributions in the truth inference process • Can we compare different truth inference algorithms? • Yes, if we make available the data of the Human Computation process! • What can we share, e.g. in the case of data linking tasks? • “True” and “false” links • Confidence scores of the links • Individual contributions and aggregation process
  • 33. PROV-O AND HUMAN COMPUTATION ONTOLOGY • Provenance is information about entities, activities, and people involved in producing a piece of data or thing (used to assess its quality, reliability or trustworthiness) • W3C defined the PROV-O ontology to capture provenance information https://www.w3.org/TR/prov-o/ • The Human Computation ontology extends PROV-O to describe the data shared within a Human Computation Process http://swa.cefriel.it/ontologies/hc • Data linking process information can be published according to linked data principles described with the HC ontology (e.g. data from the Urbanopoly GWAP at http://swa.cefriel.it/linkeddata/) [Diagram of the HC ontology. Classes: Contributor, Contribution, Human Computation Task, Consolidated Information, Human Computation Algorithm, related to provo:Agent, provo:Entity, provo:Activity. Properties: solvedBy, enabledBy, contributionFrom, solutionTo, aggregatedFrom, aggregatedBy] Irene Celino. Human Computation VGI Provenance: Semantic Web-based Representation and Publishing. IEEE Transactions on Geoscience and Remote Sensing, 2013
  • 34. 6. GUIDELINES • Is it that easy to involve people on the Web? • What should we take care of when designing a human computation system?
  • 35. MICE AND MEN (OR: KEEP IT SIMPLE) • Crowdsourcing workers behave like mice • Mice prefer to use their motor skills (biologically cheap, e.g. pressing a lever to get food) rather than their cognitive skills (biologically expensive, e.g. going through a labyrinth to get food) • Workers prefer/are better at simple tasks (e.g. those that can be solved at first sight) and discard/are worse at more complex tasks (e.g. those that require logical reasoning) • Crowdsourcing tasks should be carefully designed • Tasks as simple as possible for the workers to solve • Complex tasks together with other incentives (e.g. variety/novelty) Panos Ipeirotis. On Mice and Men: The Role of Biology in Crowdsourcing, Keynote talk at Collective Intelligence, 2012.
  • 36. DIVIDE ET IMPERA (OR: FIND-FIX-VERIFY) • Find-Fix-Verify crowd programming pattern • A long and “expensive” task… • Summarize a text to shorten its total length • …is decomposed into more atomic tasks… 1. find sentences that need to be shortened 2. fix a sentence by shortening it 3. verify which summarized sentence maintains the original meaning • …and the complex task is turned into a workflow of simple tasks, and each step is outsourced to a crowd M. Bernstein, G. Little, R. Miller, B. Hartmann, M. Ackerman, D. Karger, D. Crowell, K. Panovich. Soylent: A Word Processor with a Crowd Inside, UIST Proceedings, 2010.
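The three-stage workflow above can be sketched with pluggable crowd functions. Everything here is hypothetical: the callables `find`, `fix` and `verify` stand in for tasks posted to a crowd, the number of workers and the "at least two workers agree" rule are assumptions, and the stub workers at the bottom only simulate deterministic answers so that the sketch can run.

```python
from collections import Counter


def find_fix_verify(sentences, find, fix, verify, n_workers=3, min_agree=2):
    """Run a Find-Fix-Verify workflow over a list of sentences.

    find(worker, sentences)              -> set of indices to shorten
    fix(worker, sentence)                -> a shortened rewrite
    verify(worker, original, candidates) -> the preferred candidate
    """
    # Find: keep only the sentences flagged by enough independent workers.
    find_votes = Counter(i for w in range(n_workers) for i in find(w, sentences))
    flagged = [i for i, votes in find_votes.items() if votes >= min_agree]

    result = list(sentences)
    for i in flagged:
        # Fix: collect alternative rewrites from different workers.
        candidates = sorted({fix(w, sentences[i]) for w in range(n_workers)})
        # Verify: another round of workers votes on the best rewrite.
        verify_votes = Counter(
            verify(w, sentences[i], candidates) for w in range(n_workers)
        )
        result[i] = verify_votes.most_common(1)[0][0]
    return result


# Stub "workers" used only to make the sketch runnable.
def stub_find(worker, sentences):
    return {i for i, s in enumerate(sentences) if len(s.split()) > 5}


def stub_fix(worker, sentence):
    return " ".join(sentence.split()[:5])


def stub_verify(worker, original, candidates):
    return min(candidates, key=len)


text = ["Short one.", "This sentence is really far too long indeed."]
shortened = find_fix_verify(text, stub_find, stub_fix, stub_verify)
```

Separating the stages means each worker only ever sees a small, simple task, which is the point of the pattern.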
  • 37. COMPARE AND CONTRAST • A sort of “wisdom of the crowd(sourcing methods)”: (1) apply different approaches to solve the same problem and (2) compare results • Which is the best approach for a specific use case? • Which is the most suitable crowd? • Is human computation better/faster/cheaper than machine computation? • Knowledge Graph Refinement: use Human Computation to “crowdsource” a gold standard and then use it to train some statistical/machine learning algorithm [Figure: alternative workflows from input task to output solution, combining Human Computation and Machine Computation in different arrangements] Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning: an Experimental Comparison for Image Classification. Human Computation Journal, 2018.
  • 38. FINAL NOTE ON DISAGREEMENT • Is there always a “right answer”? Or is there a “crowd truth”? • Not always true/false, because of human subjectivity, ambiguity and uncertainty • Disagreement across contributors is not necessarily bad, but a sign of: different opinions, interpretations, contexts, perspectives, … • Remember the long tail theory… • …and ask yourself who are your users and who you want to involve Lora Aroyo, Chris Welty. Truth is a Lie: 7 Myths about Human Annotation. AI Magazine 2014.
  • 39. 7. INDIRECT PEOPLE INVOLVEMENT • Are there indirect ways to involve humans in data processing?
  • 40. HUMANS AS A SOURCE OF INFORMATION • People are not only task executors, they are also information providers! • Opportunistic sensing • Voluntary or involuntary digital traces of human-related activities • e.g., phone call logs, GPS traces, social media activities • Open content and cooperative knowledge • Data explicitly provided by people can “hide” further information • e.g., logs of wiki editing, statistical distribution of contributions
  • 41. FROM POI INFORMATION AND PHONE CALL LOGS TO LAND USE • General topic: exploit “low-cost” information about a geographic area as features to train a predictive model that outputs “expensive” information about the same area • “Inexpensive” input information: • Geo-information about points of interest • Mobile traffic data processed using different time series techniques: smoothing, decomposition, filtering, time-windowing • “Expensive” output information: • Land use characterization (usually collected through long and expensive workflows that mix machine processing and costly human labour) Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Extracting Urban Land Use from Linked Open Geospatial Data. IJGI, 2015 Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Filtering and Windowing Mobile Traffic Time Series for Territorial Land Use Classification. COMCOM, 2016
  • 42. FROM SPATIAL ANALYTICS TO GEO-ONTOLOGY ENGINEERING • OpenStreetMap collects information about points of interest (POI) • Spatial distribution and conglomeration of specific POIs can give hints about the geographical space • Re-engineering of spatial features through comparison between areas: same POI type shows different distribution → evidence for different semantics (e.g. what is a pub in Milano vs. London) • Semantic specification of spatial neighbourhoods: • Emerging neighbourhoods from spatial clustering of POIs (as opposed to administrative divisions) • Spatial version of tf-idf to compare different areas (e.g. central or peripheral areas in different cities) and to characterise neighbourhoods (e.g. shopping district) Gloria Re Calegari, Emanuela Carlino, Irene Celino, Diego Peroni. Supporting Geo-Ontology Engineering through Spatial Data Analytics. 13th Extended Semantic Web Conference, 2016
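The slide does not give the formula of the spatial tf-idf. One possible reading, sketched here purely as an assumption, treats each area as a "document" and each POI type as a "term": the term frequency is the share of the area's POIs that have a given type, and the inverse document frequency is the logarithm of the number of areas divided by the number of areas containing that type. Area names and POI lists are invented for the example.

```python
import math
from collections import Counter


def spatial_tf_idf(pois_by_area):
    """pois_by_area: {area: [poi_type, ...]} -> {area: {poi_type: score}}."""
    n_areas = len(pois_by_area)

    # In how many areas does each POI type occur at least once?
    area_freq = Counter()
    for types in pois_by_area.values():
        area_freq.update(set(types))

    scores = {}
    for area, types in pois_by_area.items():
        counts = Counter(types)
        total = len(types)
        scores[area] = {
            poi_type: (count / total) * math.log(n_areas / area_freq[poi_type])
            for poi_type, count in counts.items()
        }
    return scores


# Hypothetical POI types observed in three areas.
pois = {
    "centre": ["shop", "shop", "pub"],
    "suburb": ["pub", "school"],
    "park": ["pub"],
}
scores = spatial_tf_idf(pois)
most_typical = max(scores["centre"], key=scores["centre"].get)
```

Under this reading a POI type found in every area (here "pub") scores zero everywhere, while a type concentrated in one area (here "shop" in the centre) characterises that area.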
  • 43. MILANO viale Sarca 226, 20126, Milano - Italy LONDON 4th floor 57 Rathbone Place London W1T 1JU – UK NEW YORK One Liberty Plaza, 165 Broadway, 23rd Floor, New York City, New York, 10006 USA Cefriel.com Thanks for your attention! Any questions? Irene Celino Knowledge Technologies Digital Interaction Division irene.celino@cefriel.com