Crowdsourcing satellite imagery:
          study of iterative vs. parallel models
                           Nicolas Maisonneuve, Bastien Chopard




                                                          Twitter: nmaisonneuve




Friday, September 21, 12                                                          1
Damage assessment after a humanitarian crisis




Port-au-Prince: 300K buildings assessed
                           in 3 months by 8 UNOSAT experts




Organizational challenges:
                           How to organize non-trained volunteers,
                                especially to enforce quality?



        Investigated scope:
        • Qualitative + quantitative study of 2 collaborative models inspired by
        computer science: iterative vs. parallel information processing

        • Controlled experiment to isolate quality = F(organisation), removing
        other parameters, e.g. training, task difficulty

        • This research does not study real-world collaborative practices but rather
        extreme/symbolic cases, to guide designers of collaborative systems



Tested Collaborative Models (1/2)
                                  iterative model




                       e.g. Wikipedia, OpenStreetMap, assembly lines
Tested Collaborative Models (2/2)
                                   parallel model




                                                 aggregation




             e.g. voting systems in society, distributed computing
Tested Collaborative Models (2/2)
                                   parallel model




            old version (17th to mid-20th century): when computers were humans/women
            (e.g. Mathematical Tables Project, 1938-1948)
Qualitative comparison
                              Iterative                      Parallel

               problem        No need to divide a complex    A complex problem needs to be
               divisibility   problem                        divided into easier pieces

               optimization   copying, emphasizing           isolation, emphasizing
               tradeoff       exploitation                   exploration

               quality        sequential improvement         redundancy + diversity of
               mechanism                                     opinions

               side effect    path dependency effect +       useless redundancy for obvious
                              sensitivity to vandalism       decisions + problem of aggregation

Controlled Experiment: web platform




                           Interface/instruction for the Parallel model

on 3 maps with different topologies
                    (annotated by 1 UNITAR expert)




Participants used for the experiments:
              Mechanical Turk as simulator




Data Quality Metrics

                 Quality of the collective output
                 • type I errors = p(wrong annotation)
                 • type II errors = p(missing a building)
                 • Consistency

                 Analogy with the information retrieval field:
                 • Precision = p(an annotation is a building)
                 • Recall = p(a building is annotated)
                 • F-measure = score mixing recall + precision
                 • (metrics adjusted with tolerance distance)
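
A minimal sketch of how these metrics could be computed for point annotations. The slides do not give the matching rule or the exact F-measure formula, so the greedy nearest-neighbour matching, the harmonic-mean F-measure (F1) and all names below (`count_matches`, `quality`, `tolerance`) are assumptions made for illustration. With these definitions, precision = 1 - p(wrong annotation) and recall = 1 - p(missing a building).

```python
import math

def count_matches(annotations, buildings, tolerance):
    """Greedily pair each annotation with the nearest still-unmatched
    gold-standard building lying within `tolerance` (assumed matching rule)."""
    unmatched = set(range(len(buildings)))
    matches = 0
    for ax, ay in annotations:
        best, best_d = None, tolerance
        for j in sorted(unmatched):
            bx, by = buildings[j]
            d = math.hypot(ax - bx, ay - by)
            if d <= best_d:
                best, best_d = j, d
        if best is not None:
            unmatched.remove(best)
            matches += 1
    return matches

def quality(annotations, buildings, tolerance):
    """Return (precision, recall, F-measure) of a set of annotations."""
    tp = count_matches(annotations, buildings, tolerance)
    precision = tp / len(annotations) if annotations else 0.0
    recall = tp / len(buildings) if buildings else 0.0
    total = precision + recall
    # Assumption: F-measure taken as the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Hypothetical coordinates, for illustration only.
annotations = [(0, 0), (10, 10), (50, 50)]
buildings = [(1, 0), (10, 11), (30, 30), (80, 80)]
precision, recall, f_measure = quality(annotations, buildings, tolerance=2)
```

Here two of the three annotations fall within the tolerance of a building, and two of the four buildings are found.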



Methodology for parallel model
                     Step 1 - collecting independent contributions:
                     N for (map1, map2, map3) = (121, 120, 113)




Methodology for parallel model
                       Step 2 - for each map,
       generating the set of groups of m=[1 to N] participants


       [Figure: example groups for m=1, m=2, m=3]
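
A minimal sketch of the group-generation step. Enumerating every group of m participants out of N = 121 is not tractable for most m, so this sketch draws a fixed number of random groups per size. That sampling strategy, and the names `sample_groups`, `groups_by_size` and `n_groups`, are assumptions; the slides do not say how the set of groups was built.

```python
import random

def sample_groups(participants, m, n_groups, seed=0):
    """Draw `n_groups` random groups of `m` distinct participants."""
    rng = random.Random(seed)
    return [rng.sample(participants, m) for _ in range(n_groups)]

def groups_by_size(participants, sizes, n_groups=100, seed=0):
    """Map each group size m to a list of randomly sampled groups."""
    return {m: sample_groups(participants, m, n_groups, seed + m) for m in sizes}

# Hypothetical participant ids for one map with N = 121 contributions.
participants = list(range(121))
groups = groups_by_size(participants, sizes=range(1, 4), n_groups=5)
```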


Methodology for parallel model
         Step 3 - for each group: aggregating + computing quality

         For each group (example: groups of m = 2):
           1. Spatial clustering of points + quorum
           2. Compute data quality with the gold standard
              (precision, recall, F-measure)
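
A minimal sketch of the aggregation step. The slides name "spatial clustering of points + quorum" without specifying the algorithm, so the greedy centroid clustering below, the `radius` and `quorum` parameters, and the function names are assumptions. A cluster is kept only if at least `quorum` distinct participants contributed a point to it.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y) points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def aggregate(contributions, radius, quorum):
    """Cluster the points of all group members and keep the centroids of
    clusters supported by at least `quorum` distinct participants.

    contributions: one list of (x, y) points per participant.
    """
    clusters = []  # each cluster: {"points": [...], "voters": {participant ids}}
    for pid, points in enumerate(contributions):
        for x, y in points:
            for c in clusters:
                cx, cy = centroid(c["points"])
                if math.hypot(x - cx, y - cy) <= radius:
                    c["points"].append((x, y))
                    c["voters"].add(pid)
                    break
            else:
                # No existing cluster is close enough: start a new one.
                clusters.append({"points": [(x, y)], "voters": {pid}})
    return [centroid(c["points"]) for c in clusters if len(c["voters"]) >= quorum]

# Hypothetical annotations from a group of 3 participants.
contributions = [
    [(0, 0), (10, 10)],
    [(1, 0), (30, 30)],
    [(0, 1), (10, 11)],
]
consensus = aggregate(contributions, radius=2, quorum=2)
```

The aggregated points can then be scored against the gold standard like any single participant's annotations. Raising `quorum` makes the output more conservative: the isolated point at (30, 30) is already dropped with a quorum of 2.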

The more = the better?
                              (parallel model)
      [Plot: avg. F-measure]

    Yes, but only up to a point:
    • Adding more people won't change the consensus panel
    • Limitation of Linus' law (compared to an iterative model, e.g.
    OpenStreetMap)
    • Wisdom != skill: we can't replace training with more people
Methodology for Iterative model




                           sample of an iterative process for map3




Methodology for Iterative model




      [Figure: n instances of about m iterations]

      Collected data for (map1, map2, map3) = (13, 21, 25)
              instances of about 10 iterations each
Methodology for Iterative model
            Step 2 - for each iteration, we compute the precision,
                     recall and F-measure over all the instances




                           Precision   Recall       F-measure
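
A minimal sketch of this per-iteration computation, assuming each instance is stored as a sequence of scores (one per iteration) and that instances are averaged at each iteration index. Because instances have "about" 10 iterations, shorter ones simply drop out of later averages. The function name and the sample scores are hypothetical.

```python
from statistics import mean

def mean_by_iteration(instances):
    """Average a metric over all instances, iteration by iteration.

    instances: list of per-instance score sequences (one score per iteration).
    Instances that stopped earlier are ignored for later iterations.
    """
    longest = max(len(scores) for scores in instances)
    return [
        mean(scores[i] for scores in instances if len(scores) > i)
        for i in range(longest)
    ]

# Hypothetical F-measure trajectories for 3 instances of one map.
f_measures = [
    [0.2, 0.4, 0.6],
    [0.4, 0.6],
    [0.6, 0.8, 0.9],
]
curve = mean_by_iteration(f_measures)
```

The same function applies unchanged to precision and recall trajectories.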

Interpretation of results / Comparison
               on data quality

                           Parallel                    Iterative

   Accuracy -              consensual results (*)      error propagation
   wrong annotations

   Accuracy -              useless redundancy on       accumulation of knowledge
   missing buildings       obvious buildings           driving attention to
                                                       uncovered areas

   Consistency             redundancy                  naive "last = best"

  (*) but parallel < iterative in difficult cases (map 2) (lack of consensus)

Side-objective: Measuring how the crowd spatially agrees
          Method: randomly take 2 participants, measure their spatial
        inter-agreement (e.g. ratio of matching points), and repeat
                              the process N times

                           A way to measure the intrinsic difficulty of a task
                                  (map 1 = easy, map 2 = quite hard)
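
A minimal sketch of this agreement measure. The slides only say "ratio of points matching", so the exact ratio used here (twice the number of matched points divided by the total number of points of both participants), the greedy matching within a `tolerance` distance, and all function names are assumptions.

```python
import math
import random

def count_matches(a, b, tolerance):
    """Count points of `a` that can be paired with a distinct point of `b`
    lying within `tolerance` (greedy first-fit, an assumed matching rule)."""
    free = list(b)
    matches = 0
    for ax, ay in a:
        for p in free:
            if math.hypot(ax - p[0], ay - p[1]) <= tolerance:
                free.remove(p)
                matches += 1
                break
    return matches

def pair_agreement(a, b, tolerance):
    """Assumed agreement ratio: 2 * matches / (len(a) + len(b))."""
    total = len(a) + len(b)
    return 2 * count_matches(a, b, tolerance) / total if total else 1.0

def crowd_agreement(contributions, tolerance, n_pairs=1000, seed=0):
    """Average agreement over `n_pairs` randomly drawn pairs of participants."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_pairs):
        a, b = rng.sample(contributions, 2)
        scores.append(pair_agreement(a, b, tolerance))
    return sum(scores) / len(scores)
```

Under this definition a low average agreement on a map would suggest a harder task, since participants working independently rarely mark the same locations.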
Future tracks
                     Impact of the organization beyond data
                     quality:
                     • Energy / footprint needed to collectively solve a problem
                     • Participation sustainability
                     • Individual behavior (skill learning & enjoyment)
                     Skill complementarity:
                     Is the best group of 3 people made of the 3 best people at the
                     individual level? The data says no!
                     Other symbolic organisations / mechanisms:
                     • Human cellular automata (cell = 1 person, who resubmits a task at
                     time t after being influenced by peers' results generated at time
                     t-1)
                     • Integration of game design / gamification
Crowdsourcing satellite imagery (Talk at Giscience2012)
