Crowdsourcing satellite imagery (Talk at Giscience2012)

Crowdsourcing satellite imagery:
study of iterative vs. parallel models
Nicolas Maisonneuve, Bastien Chopard

Twitter: nmaisonneuve

Friday, September 21, 2012

Damage assessment after a humanitarian crisis


Port-au-Prince: 300K buildings assessed
in 3 months by 8 UNOSAT experts


Organizational challenges:
How to organize untrained volunteers,
especially to ensure quality?



Investigated scope:
• Qualitative + quantitative study of 2 collaborative models inspired by
computer science: iterative vs. parallel information processing

• Controlled experiment to isolate quality = F(organisation), removing
other parameters, e.g. training, task difficulty

• This research != a study of real-world collaborative practices; it looks at
more extreme/symbolic cases to guide collaborative system designers



Tested Collaborative Models (1/2)
iterative model

e.g. Wikipedia, OpenStreetMap, assembly lines

parallel model

aggregation

e.g. voting systems in society, distributed computing

parallel model

older version (17th to mid-20th century): when computers were humans/women
(Mathematical Tables Project, 1938-1948)

Qualitative comparison

                       Iterative                    Parallel
Problem divisibility   No need to divide a          A complex problem needs to be
                       complex problem              divided into easier pieces
Optimization tradeoff  Copying, emphasizing         Isolation, emphasizing
                       exploitation                 exploration
Quality mechanism      Sequential improvement       Redundancy + diversity of opinions
Side effect            Path dependency effect +     Useless redundancy for obvious
                       sensitivity to vandalism     decisions + problem of aggregation



Controlled Experiment: web platform

Interface/instruction for the Parallel model


on 3 maps with different topologies
(annotated by 1 UNITAR expert)


Participants used for the experiments:
Mechanical Turk as simulator


Data Quality Metrics

Quality of the collective output
• type I errors = p(wrong annotation)
• type II errors = p(missing a building)
• Consistency

Analogy with the information retrieval field:
• Precision = p(an annotation is a building)
• Recall = p(a building is annotated)
• F-measure = score mixing recall + precision
• (metrics adjusted with tolerance distance)
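
A minimal sketch of how these metrics could be computed with a tolerance distance. The talk does not say how annotations are paired with gold-standard buildings, so the greedy one-to-one matching, the function names, and the use of the harmonic mean (F1) for the F-measure are all assumptions made for illustration. With this reading, type I error = 1 - precision and type II error = 1 - recall.

```python
import math


def count_matches(annotations, buildings, tolerance):
    """Greedy one-to-one matching (assumed rule): closest pairs are matched
    first, and only pairs at most `tolerance` apart can match."""
    pairs = sorted(
        (math.dist(a, b), i, j)
        for i, a in enumerate(annotations)
        for j, b in enumerate(buildings)
    )
    used_annotations, used_buildings = set(), set()
    for distance, i, j in pairs:
        if distance > tolerance:
            break  # pairs are sorted, so every remaining pair is too far apart
        if i not in used_annotations and j not in used_buildings:
            used_annotations.add(i)
            used_buildings.add(j)
    return len(used_annotations)


def quality(annotations, buildings, tolerance):
    """Return (precision, recall, F-measure) against the gold-standard buildings."""
    matched = count_matches(annotations, buildings, tolerance)
    precision = matched / len(annotations) if annotations else 0.0
    recall = matched / len(buildings) if buildings else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Assumption: F-measure taken as the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

The one-to-one constraint means that two annotations on the same building count as one correct annotation and one wrong annotation, rather than two correct ones.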


Methodology for parallel model
Step 1 - collecting independent contributions:
N = (121, 120, 113) for (map1, map2, map3)


Step 2 - for each map,
generating the set of groups of m=[1 to N] participants

(illustration: example groups for m=1, m=2, m=3)
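
Step 2 could be sketched as follows. The number of possible groups grows very quickly (for N = 121 there are already 287,980 distinct groups of m = 3), so this sketch draws a random sample of groups for each m instead of enumerating them all. Whether the study enumerated or sampled groups is not stated in the slides; the sampling, the function names and the `n_groups` parameter are assumptions.

```python
import math
import random


def count_groups(n_participants, m):
    """Number of distinct groups of m participants out of n_participants."""
    return math.comb(n_participants, m)


def sample_groups(participant_ids, m, n_groups, seed=0):
    """Draw n_groups random groups of m distinct participants (assumed procedure)."""
    rng = random.Random(seed)
    return [rng.sample(participant_ids, m) for _ in range(n_groups)]
```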


Step 3 - for each group: aggregating + computing quality

groups of m = 2 -> spatial clustering of points + quorum -> data quality computed against the gold standard

(plots: Precision, Recall, F-measure)
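
A rough sketch of the aggregation in Step 3, assuming a simple greedy distance-based clustering and a quorum expressed as a minimum number of distinct participants per cluster. The slides only name "spatial clustering of points + quorum", so the clustering rule, the `radius` and `quorum` parameters, and the function names are hypothetical.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y) points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def aggregate(contributions, radius, quorum):
    """contributions: dict mapping participant id -> list of (x, y) annotations.

    Each point joins the first cluster whose centroid lies within `radius`,
    otherwise it starts a new cluster. A cluster is kept as a consensual
    annotation only if at least `quorum` distinct participants contributed to it.
    """
    clusters = []
    for participant, points in contributions.items():
        for point in points:
            for cluster in clusters:
                if math.dist(point, centroid(cluster["points"])) <= radius:
                    cluster["points"].append(point)
                    cluster["voters"].add(participant)
                    break
            else:  # no cluster was close enough
                clusters.append({"points": [point], "voters": {participant}})
    return [centroid(c["points"]) for c in clusters if len(c["voters"]) >= quorum]
```

The aggregated points returned for each group would then be scored against the gold standard with precision, recall and F-measure.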


The more = the better?
(parallel model)

(plot: avg. F-measure)

Yes, but only up to a point:
• Adding more people won't change the consensus panel
• Limitation of Linus’ law (compared to an iterative model, e.g.
OpenStreetMap)
• Wisdom != skill: we can’t replace training with more people

Methodology for Iterative model

(figure: sample of an iterative process for map3; n instances of about m iterations)

Step 1 - collected data: (13, 21, 25) instances of about 10 iterations
each for (map1, map2, map3)

Step 2 - for each iteration, computing the precision,
recall and F-measure of all the instances

(plots: Precision, Recall, F-measure)
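
One possible reading of Step 2, sketched below: a quality score is computed for every instance at every iteration, then averaged across instances for each iteration. The averaging, the handling of instances with fewer iterations, and the function name are assumptions; the slides do not specify them.

```python
from statistics import mean


def mean_per_iteration(instances):
    """instances: one list of scores per instance, with one score per iteration.

    Instances may have different lengths ("about 10 iterations"), so each
    iteration is averaged over the instances that actually reached it.
    """
    longest = max(len(scores) for scores in instances)
    return [
        mean(scores[t] for scores in instances if len(scores) > t)
        for t in range(longest)
    ]
```

The same function could be applied separately to precision, recall and F-measure to obtain one curve per metric.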


Interpretation of results / Comparison
on data quality

                              Parallel                 Iterative
Accuracy - wrong annotations  consensual results (*)   error propagation
Accuracy - missing buildings  useless redundancy on    accumulation of knowledge driving
                              obvious buildings        attention to uncovered areas
Consistency                   redundancy               naive last = best

(*) but parallel < iterative in difficult cases (map 2) (lack of consensus)


Side-objective: Measuring how the crowd spatially agrees
Method: randomly pick 2 participants, measure their
spatial inter-agreement (e.g. ratio of matching points), and repeat
the process N times


A way to measure the intrinsic difficulty of a task
(map 1 = easy, map 2 = quite hard)
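
A minimal sketch of the pairwise agreement measure described above. The "ratio of points matching" is taken here as 2 x matched points / (points of A + points of B), with greedy one-to-one matching within a tolerance distance. That definition, the `tolerance` parameter and the function names are assumptions, not the talk's stated formula.

```python
import math
import random


def count_matches(points_a, points_b, tolerance):
    """Greedy one-to-one matching of points at most `tolerance` apart (assumed rule)."""
    pairs = sorted(
        (math.dist(a, b), i, j)
        for i, a in enumerate(points_a)
        for j, b in enumerate(points_b)
    )
    used_a, used_b = set(), set()
    for distance, i, j in pairs:
        if distance > tolerance:
            break
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
    return len(used_a)


def pair_agreement(points_a, points_b, tolerance):
    """Share of the two participants' points that have a counterpart (assumed ratio)."""
    total = len(points_a) + len(points_b)
    if total == 0:
        return 1.0  # two empty annotation sets agree trivially
    return 2 * count_matches(points_a, points_b, tolerance) / total


def crowd_agreement(contributions, tolerance, n_rounds, seed=0):
    """Average agreement over n_rounds randomly drawn pairs of participants."""
    rng = random.Random(seed)
    ids = list(contributions)
    scores = []
    for _ in range(n_rounds):
        first, second = rng.sample(ids, 2)
        scores.append(pair_agreement(contributions[first], contributions[second], tolerance))
    return sum(scores) / len(scores)
```

A low average agreement would point to a harder map, in line with the map 1 vs. map 2 contrast on this slide.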

Future tracks
Impact of the organization beyond data quality:
• Energy / footprint to collectively solve a problem
• Participation sustainability
• Individual behavior (skill learning & enjoyment)
Skill complementarity:
Is the best group of 3 people made of the 3 best people at the
individual level? The data says no!
Other symbolic organisations / mechanisms:
• Human cellular automata (cell = 1 person, who resubmits a task at
time t after being influenced by peers' results generated at time
t-1)
• Integration of game design / gamification
