Linking Smart Cities Datasets with Human Computation:
the case of UrbanMatch

Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Thorsten Krüger
CEFRIEL – Politecnico di Milano – SIEMENS
UrbanMatch - ISWC 2012

Citizen Computation

• Human Computation: exploiting human capabilities to solve computational tasks difficult for machines
• Citizen Science: exploiting volunteers to collect scientific data or to conduct experiments "in the world"
• Citizen Computation: exploiting human capabilities to contribute to a mixed computational system by living "in the world"

  UrbanMatch - ISWC 2012                      2                  Boston, 13th November 2012
UrbanMatch: Citizen Computation GWAP

• Places & POIs from OpenStreetMap (e.g. Duomo, equestrian statue)
• Photos from social media and Wikimedia Commons
• Which photos actually represent the urban POIs?
• Can we link POIs to their most representative photos?

The Record Linkage Problem [Fellegi&Sunter,1969]

• Comparison space: $\Gamma = A \times B$
• Purpose is to split $\Gamma$ into M (set of matches) and U (set of non-matches)
• Each link $\gamma$ has a score $s_\gamma$:
  $$s_\gamma = \frac{P(\gamma \in \Gamma \mid M)}{P(\gamma \in \Gamma \mid U)}$$
• Partitioning based on lower/upper thresholds:
  - $s_\gamma > \mathit{UPPER} \Rightarrow \gamma \in M$
  - $s_\gamma < \mathit{LOWER} \Rightarrow \gamma \in U$
  - $\mathit{LOWER} \le s_\gamma \le \mathit{UPPER} \Rightarrow \gamma \in C$ (candidate links to be assessed by experts)

[Figure: links between sets A and B, marked as ∈ M, ∈ U or ∈ C]

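The threshold rule above can be sketched in a few lines of Python. This is only an illustration: the link identifiers and the example scores are hypothetical, and the thresholds 0.20 and 0.70 are borrowed from the values reported in the evaluation slide.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (POI id, photo id); identifiers are hypothetical


def partition_links(
    scores: Dict[Link, float], lower: float, upper: float
) -> Dict[str, List[Link]]:
    """Split scored links into M (matches), U (non-matches) and C (candidates)."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    parts: Dict[str, List[Link]] = {"M": [], "U": [], "C": []}
    for link, score in scores.items():
        if score > upper:
            parts["M"].append(link)
        elif score < lower:
            parts["U"].append(link)
        else:
            # LOWER <= score <= UPPER: still to be assessed
            parts["C"].append(link)
    return parts


# Hypothetical scores for three POI-photo links
example_scores = {
    ("duomo", "photo_d"): 0.92,
    ("duomo", "photo_b"): 0.05,
    ("duomo", "photo_a"): 0.50,
}
example_parts = partition_links(example_scores, lower=0.20, upper=0.70)
```

Links that land in C are the ones that still need human judgement.
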
UrbanMatch Linkage Problem

• A contains POIs
• B contains photos
• $\Gamma$ contains POI-photo links
• "Playable places" group POIs → partitioning of $\Gamma$ (problem space reduction)
• Input data in UrbanMatch:
  - 34 POIs in 14 playable places
  - 11,089 photos
  - 196 links in M
  - 382 links in U
  - 37,413 links in C

[Figure: UrbanMatch links between A (POIs) and B (photos), grouped by playable place, e.g. Piazza del Duomo and Piazza della Scala]

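A rough sketch of how grouping by playable place shrinks the comparison space. The toy POIs and photos, and the idea that every photo carries a place label, are assumptions made for illustration. Only the figures in the last two lines come from the slide, and reading their ratio as the effect of the partitioning is an interpretation, not a stated result.

```python
from itertools import product

# Hypothetical toy data: each POI and each photo is tagged with a playable place.
pois = {
    "duomo": "Piazza del Duomo",
    "statue": "Piazza del Duomo",
    "scala": "Piazza della Scala",
}
photos = {
    "p1": "Piazza del Duomo",
    "p2": "Piazza del Duomo",
    "p3": "Piazza della Scala",
}

# Full comparison space: every POI paired with every photo.
full_space = list(product(pois, photos))

# Reduced space: keep only pairs that share a playable place.
reduced_space = [
    (poi, photo) for poi, photo in full_space if pois[poi] == photos[photo]
]

# Figures from the slide: size of A x B versus the links actually listed.
full_size = 34 * 11_089
listed_links = 196 + 382 + 37_413
```

In the toy data the space drops from 9 pairs to 5. With the slide's figures, the 37,991 listed links are about a tenth of the 377,026 pairs in the full cross product.
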
Showing links to UrbanMatch players

• The POI-photo links to be evaluated could be shown directly to players (e.g. a photo labelled "Santa Maria delle Grazie": correct?), but the player may not know the name of the POI
• On the other hand, the player can easily "see" whether two photos show the same POI (e.g. a photo of Santa Maria delle Grazie next to a photo of Sant'Ambrogio: correct? NO!)
• Thus we use a "sure" photo as a proxy for the POI, and the UrbanMatch game is designed as a photo-pairing challenge

UrbanMatch level definition

[Figure: set A of POIs (POI 1: Duomo; POI 2: statue) and set B of photos a to h]

• Photos d and g depict POI 1 (and do not depict POI 2)
• Photos c and h depict POI 2 (and do not depict POI 1)
• Photos b and f depict neither POI 1 nor POI 2 (distractors)
• Photos a and e may depict POI 1 and/or POI 2

Feedback to UrbanMatch players

• If the selected pair is correct (two sure photos) or plausible, UrbanMatch gives positive feedback
• If the selected pair is wrong (at least one selected photo is a "distractor" option), UrbanMatch gives negative feedback (and counts the player's mistake)

[Figure: example pairs drawn from photos d, e, b and f]

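The feedback rule can be sketched as follows, reusing the photo letters from the level-definition example. The label encoding is hypothetical. The slide also does not say what happens when two sure photos of different POIs are paired; treating that case as negative is an assumption of this sketch.

```python
from typing import Dict, Optional

# Label per photo: a POI id for "sure" photos, "distractor" for distractors,
# None for uncertain photos (the candidate links under assessment).
LEVEL: Dict[str, Optional[str]] = {
    "d": "poi1", "g": "poi1",
    "c": "poi2", "h": "poi2",
    "b": "distractor", "f": "distractor",
    "a": None, "e": None,
}


def feedback(first: str, second: str,
             level: Dict[str, Optional[str]] = LEVEL) -> str:
    """Return "positive" or "negative" for a selected pair of photos."""
    labels = (level[first], level[second])
    if "distractor" in labels:
        return "negative"  # wrong pair: the player's mistake is counted
    if None in labels:
        return "positive"  # plausible pair: an uncertain photo is involved
    # Two sure photos: correct only if both depict the same POI.
    # Assumption: sure photos of different POIs count as a wrong pair.
    return "positive" if labels[0] == labels[1] else "negative"
```

For example, `feedback("d", "g")` and `feedback("d", "e")` are positive, while `feedback("d", "b")` is negative.
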
UrbanMatch: gameplay

                          Video at: http://bit.ly/um-video




Link score in UrbanMatch

• Effect of players' evidence on the score $s_\gamma$:
  - If there is positive evidence (an UrbanMatch player says that the link $\gamma$ is correct), $s_\gamma$ is increased at most by $K_{pos}$
  - If there is negative evidence (an UrbanMatch player says that the link $\gamma$ is incorrect), $s_\gamma$ is decreased at most by $K_{neg}$
  - Each piece of evidence is weighted with a factor that represents the UrbanMatch player's reliability:
    $$r_p = e^{-\varepsilon_p / 2}$$
    where $\varepsilon_p$ is the number of errors made by the player, thus $r_p \in [0, 1]$ (reliability is 1 if the player makes no mistakes, about 0.6 with 1 error, almost zero with 6 mistakes or more)
• For each link $\gamma$ in C, its score varies as follows:
  - Positive evidence: $s'_\gamma = s_\gamma + K_{pos} \cdot r_p$
  - Negative evidence: $s'_\gamma = s_\gamma - K_{neg} \cdot r_p$

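A minimal Python sketch of the reliability weighting and the score update. The slides do not give values for K_pos and K_neg, so the 0.1 used in the example is purely hypothetical, as are the function names.

```python
import math


def reliability(errors: int) -> float:
    """Player reliability r_p = exp(-errors / 2)."""
    if errors < 0:
        raise ValueError("number of errors cannot be negative")
    return math.exp(-errors / 2)


def update_score(score: float, positive: bool, errors: int,
                 k_pos: float, k_neg: float) -> float:
    """Apply one piece of player evidence to a link score."""
    r_p = reliability(errors)
    if positive:
        return score + k_pos * r_p
    return score - k_neg * r_p


# Hypothetical constants: K_pos = K_neg = 0.1
after_reliable_yes = update_score(0.5, True, errors=0, k_pos=0.1, k_neg=0.1)
after_unreliable_no = update_score(0.5, False, errors=6, k_pos=0.1, k_neg=0.1)
```

A player with no errors moves the score by the full constant, while a player with six errors moves it by less than 5% of the constant.
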
Evaluating UrbanMatch precision

• Accuracy definition (metric specific to Data Quality): the game's ability to correctly assess candidate links

  $$\mathit{Accuracy} = \frac{(CM - FP) + (CU - FN)}{CM + CU}$$

  - Numerator: candidate links correctly assessed to be either trustable or incorrect
  - Denominator: candidate links assessed to be either trustable or incorrect
• CM = right links, i.e. candidate links moved from C to M (because $s_\gamma > \mathit{UPPER}$)
• CU = wrong links, i.e. candidate links moved from C to U (because $s_\gamma < \mathit{LOWER}$)
• FP = false positives (candidate links assessed as trustable but actually incorrect)
• FN = false negatives (candidate links assessed as incorrect but actually trustable)
• Accuracy requires ground truth → computed only on a manually analyzed set of links, for evaluation's sake!

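The accuracy formula translates directly into code. In this sketch the function name is an assumption; the example counts are the CM/FP and CU/FN values listed in the threshold tables for UPPER = 0.70 and LOWER = 0.20.

```python
def accuracy(cm: int, fp: int, cu: int, fn: int) -> float:
    """Share of assessed candidate links that were assessed correctly."""
    assessed = cm + cu
    if assessed == 0:
        raise ValueError("no candidate links have been assessed")
    return ((cm - fp) + (cu - fn)) / assessed


# Counts from the threshold tables (UPPER = 0.70, LOWER = 0.20)
overall = accuracy(cm=68, fp=4, cu=1216, fn=8)
```

With these counts, 1272 of 1284 assessed links are correct, a ratio of roughly 0.9907, in line with the 99.06% reported in the evaluation results.
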
Evaluating UrbanMatch as a GWAP

• Throughput and ALP definition (metrics specific to Games with a Purpose)
  - Throughput: the game's ability to assess candidate links
    $$\mathit{Throughput} = \frac{\mathit{SolvedProblems}}{\mathit{PlayedTime}}$$
  - ALP: the game's ability to engage players
    $$\mathit{ALP} = \frac{\mathit{PlayedTime}}{\mathit{ActivePlayers}}$$
• Here $\mathit{SolvedProblems} = CM + CU$, in which:
  - CM = right links, i.e. candidate links moved from C to M (because $s_\gamma > \mathit{UPPER}$)
  - CU = wrong links, i.e. candidate links moved from C to U (because $s_\gamma < \mathit{LOWER}$)

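Both GWAP metrics are simple ratios, sketched below. The function names, the choice of hours as the time unit and all numbers in the example are hypothetical and not taken from the evaluation.

```python
def throughput(cm: int, cu: int, played_hours: float) -> float:
    """Solved problems (CM + CU) per hour of play."""
    if played_hours <= 0:
        raise ValueError("played time must be positive")
    return (cm + cu) / played_hours


def alp(played_hours: float, active_players: int) -> float:
    """Played time per active player, in hours."""
    if active_players <= 0:
        raise ValueError("at least one active player is required")
    return played_hours / active_players


# Hypothetical session: 120 links assessed in 2 hours of play by 40 players
example_throughput = throughput(cm=30, cu=90, played_hours=2.0)  # links per hour
example_alp_minutes = alp(played_hours=2.0, active_players=40) * 60
```

In this made-up session the game assesses 60 links per hour and each player plays for about 3 minutes.
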
Setting the thresholds for link score $s_\gamma$

Accuracy and Throughput as a function of LOWER

LOWER                 0.05      0.10      0.15      0.20
CU                     321       348      1152      1216
FN                       4         5         7         8
Accuracy            98.75 %   98.56 %   99.39 %   99.34 %
Throughput (links/h) 108.08    117.17    387.87    409.42

Accuracy and Throughput as a function of UPPER

UPPER                 0.60      0.65      0.70      0.75      0.80      0.85      0.90
CM                     227       225        68        65        61        60        49
FP                      48        47         4         4         3         3         1
Accuracy            78.85 %   79.11 %   94.11 %   95.38 %   95.08 %   95.00 %   97.95 %
Throughput (links/h)  76.43     75.75     22.89     21.88     20.53     20.20     16.49

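The accuracy rows can be recomputed from the counts in the two tables. This sketch assumes that each row applies the accuracy formula to one side only, (CU - FN) / CU for LOWER and (CM - FP) / CM for UPPER, and that percentages are truncated to two decimals; both are inferences from the printed numbers, not something the slides state.

```python
# (moved links, wrongly moved links) per threshold, copied from the tables
LOWER_ROWS = {0.05: (321, 4), 0.10: (348, 5), 0.15: (1152, 7), 0.20: (1216, 8)}
UPPER_ROWS = {
    0.60: (227, 48), 0.65: (225, 47), 0.70: (68, 4), 0.75: (65, 4),
    0.80: (61, 3), 0.85: (60, 3), 0.90: (49, 1),
}


def row_accuracy_pct(moved: int, wrong: int) -> float:
    """One-sided accuracy in percent, truncated to two decimals."""
    return (10_000 * (moved - wrong)) // moved / 100


recomputed_lower = {t: row_accuracy_pct(*row) for t, row in LOWER_ROWS.items()}
recomputed_upper = {t: row_accuracy_pct(*row) for t, row in UPPER_ROWS.items()}
```

Under these assumptions every recomputed value matches the printed one except UPPER = 0.75, where CM = 65 and FP = 4 give 93.84% rather than 95.38%. The printed 95.38% would correspond to FP = 3; the slide does not indicate which figure is intended.
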
UrbanMatch evaluation results

Accuracy, Throughput and ALP results

• Evaluation data:
  - 54 unique players, 290 games (781 game levels)
  - 2006 input links, 1284 assessed (trustable/incorrect)
  - upper threshold 0.70, lower threshold 0.20
• Accuracy (on the manually-checked subset):
  - 99.06%: very good result
• Throughput and ALP (on all processed links):
  - Throughput: 485 links/h, very good result
  - ALP: 3m 17s, one-time game, needs improvement

Evaluating UrbanMatch as a whole

User-based evaluation: actual "engagement"

• Based on game evaluation literature and integrated with our research-specific questions
• Questionnaire at http://bit.ly/um-survey
• Findings: [Figure: questionnaire findings]

Conclusions

• Citizen Computation games are effective at leveraging the linking capabilities of "humans"
• The game must be designed to take into account users' capabilities and usage context:
  - (lack of) background knowledge
  - small mobile-phone screen
• Different tasks to be solved need different game "missions":
  - Linking task → pairing mission
  - Verifying task → rating/selecting mission
  - Collecting task → selecting/editing mission

Advertisement → try out the Urbanopoly game here in Boston, to collect and assess geo-spatial linked data; participant in the Semantic Web Challenge: http://bit.ly/urbanopoly

Thanks for your attention!
Any questions?

Irene Celino – CEFRIEL, ICT Institute Politecnico di Milano
email: Irene.Celino@cefriel.it – web: http://swa.cefriel.it

Linking Smart Cities Datasets with Human Computation - the case of UrbanMatch

  • 1. Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
      Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Thorsten Krüger
      CEFRIEL – Politecnico di Milano – SIEMENS
      UrbanMatch - ISWC 2012
  • 2. Citizen Computation
      • Human Computation: exploiting human capabilities to solve computational tasks difficult for machines
      • Citizen Science: exploiting volunteers to collect scientific data or to conduct experiments "in the world"
      • Citizen Computation: exploiting human capabilities to contribute to a mixed computational system by living "in the world"
      UrbanMatch - ISWC 2012, Boston, 13th November 2012
  • 3. UrbanMatch: Citizen Computation GWAP
      • Places & POIs from OpenStreetMap (e.g. Duomo, e.g. equestrian statue)
      • Photos from social media; Wikimedia Commons
      • Which photos actually represent the urban POIs?
      • Can we link POIs to their most representative photos?
  • 4. The Record Linkage Problem [Fellegi&Sunter,1969]
      • Comparison space: $\Gamma = A \times B$
      • The purpose is to split $\Gamma$ into $M$ (set of matches) and $U$ (set of non-matches)
      • Each link $\gamma$ has a score $s_\gamma = P(\gamma \in \Gamma \mid M) / P(\gamma \in \Gamma \mid U)$
      • Partitioning based on lower/upper thresholds:
          • $s_\gamma > \mathit{UPPER} \Rightarrow \gamma \in M$
          • $s_\gamma < \mathit{LOWER} \Rightarrow \gamma \in U$
          • $\mathit{LOWER} \le s_\gamma \le \mathit{UPPER} \Rightarrow \gamma \in C$ (candidate links to be assessed by experts)
      [Figure: grid of the comparison space $A \times B$ with cells marked as belonging to M, U or C]
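The threshold rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `partition_links`, the link identifiers and the example scores are hypothetical, and the thresholds 0.20 and 0.70 are the ones reported later in the evaluation slide.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (POI, photo) pair; the naming is an assumption


def partition_links(
    scores: Dict[Link, float], lower: float, upper: float
) -> Dict[str, List[Link]]:
    """Split scored links into M (matches), U (non-matches) and C (candidates)."""
    partition: Dict[str, List[Link]] = {"M": [], "U": [], "C": []}
    for link, score in scores.items():
        if score > upper:
            partition["M"].append(link)  # s > UPPER: match
        elif score < lower:
            partition["U"].append(link)  # s < LOWER: non-match
        else:
            partition["C"].append(link)  # LOWER <= s <= UPPER: still to be assessed
    return partition


# Hypothetical scores, for illustration only.
example_scores = {
    ("Duomo", "photo_d"): 0.90,
    ("Duomo", "photo_b"): 0.05,
    ("Duomo", "photo_a"): 0.50,
    ("Duomo", "photo_e"): 0.70,  # exactly on the upper threshold: stays in C
}
result = partition_links(example_scores, lower=0.20, upper=0.70)
```

Note that a score exactly equal to either threshold stays in C, because both inequalities on the slide are strict.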
  • 5. UrbanMatch Linkage Problem
      • A contains POIs
      • B contains photos
      • $\Gamma$ contains POI-photo links
      • "Playable places" group POIs → partitioning of $\Gamma$ (problem space reduction)
      • Input data in UrbanMatch:
          • 34 POIs in 14 playable places
          • 11,089 photos
          • 196 links in M
          • 382 links in U
          • 37,413 links in C
      [Figure: grid of UrbanMatch links, A (POIs) against B (photos), partitioned by playable places such as Piazza del Duomo and Piazza della Scala]
  • 6. Showing links to UrbanMatch players
      • The POI-photo links to be evaluated could be shown directly to players (e.g. a photo labelled "Santa Maria delle Grazie: correct?"), but the player may not know the name of the POI
      • On the other hand, the player can easily "see" whether two photos show the same place (e.g. a photo of Santa Maria delle Grazie next to a photo of Sant'Ambrogio: "correct?" "NO!")
      • Thus we use a "sure" photo as a proxy for the POI, and the UrbanMatch game is designed as a photo-pairing challenge
  • 7. UrbanMatch level definition
      [Figure: Set A of POIs (POI 1: Duomo; POI 2: statue) surrounded by Set B of photos a to h]
      • Photos d and g depict POI 1 (and do not depict POI 2)
      • Photos c and h depict POI 2 (and do not depict POI 1)
      • Photos b and f depict neither POI 1 nor POI 2 (distractors)
      • Photos a and e may depict POI 1 and/or POI 2
  • 8. Feedback to UrbanMatch players
      • If the selected pair is correct (two sure photos) or plausible, UrbanMatch gives positive feedback
      • If the selected pair is wrong (at least one selected photo is a "distractor" option), UrbanMatch gives negative feedback (and counts the player's mistake)
      [Figure: example pairs built from photos d, e, b and f]
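A rough sketch of the feedback rule, using the level from the previous slide, might look as follows. It rests on assumptions not stated in the slides: "plausible" is read here as one sure photo paired with one uncertain photo, and any pair the slides do not cover (for instance two sure photos of different POIs) is returned as "unspecified" rather than guessed. The names `Level` and `feedback` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Level:
    sure: Dict[str, Set[str]]  # POI -> photos known to depict it
    distractors: Set[str]      # photos known to depict none of the POIs
    uncertain: Set[str]        # photos whose links are still candidates


def feedback(level: Level, first: str, second: str) -> str:
    """Return 'positive', 'negative' or 'unspecified' for a selected photo pair."""
    pair = {first, second}
    # Wrong: at least one selected photo is a distractor.
    if pair & level.distractors:
        return "negative"
    # Correct: two sure photos of the same POI.
    if any(pair <= photos for photos in level.sure.values()):
        return "positive"
    # Plausible (assumed reading): one sure photo plus one uncertain photo.
    all_sure = set().union(*level.sure.values())
    if len(pair & all_sure) == 1 and len(pair & level.uncertain) == 1:
        return "positive"
    # Cases the slides do not describe are left undecided.
    return "unspecified"


# Level from the "level definition" slide.
level = Level(
    sure={"POI 1": {"d", "g"}, "POI 2": {"c", "h"}},
    distractors={"b", "f"},
    uncertain={"a", "e"},
)
```

Under this reading, the pair (d, e) is the interesting one: a positive answer there is evidence for a candidate link between photo e and POI 1.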
  • 9. UrbanMatch: gameplay
      Video at: http://bit.ly/um-video
  • 10. Link score in UrbanMatch
      • Effect of players' evidence on the score $s_\gamma$:
          • If there is positive evidence (an UrbanMatch player says that the link $\gamma$ is correct), $s_\gamma$ is increased by at most $K_{pos}$
          • If there is negative evidence (an UrbanMatch player says that the link $\gamma$ is incorrect), $s_\gamma$ is decreased by at most $K_{neg}$
      • Each piece of evidence is weighted with a factor that represents the UrbanMatch player's reliability: $r_p = e^{-\varepsilon_p / 2}$, where $\varepsilon_p$ is the number of errors made by the player, thus $r_p \in [0, 1]$ (reliability is 1 if the player makes no mistakes, about 0.6 with 1 error, almost zero with 6 or more mistakes)
      • For each link $\gamma$ in C, its score varies as follows:
          • Positive evidence: $s'_\gamma = s_\gamma + K_{pos} \cdot r_p$
          • Negative evidence: $s'_\gamma = s_\gamma - K_{neg} \cdot r_p$
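The update rule can be sketched directly from the formulas on this slide. The slides do not give values for $K_{pos}$ and $K_{neg}$, so the value 0.1 used below is purely hypothetical, as are the function names.

```python
import math


def reliability(errors: int) -> float:
    """Player reliability r_p = exp(-errors / 2)."""
    return math.exp(-errors / 2)


def update_score(
    score: float, positive: bool, errors: int, k_pos: float, k_neg: float
) -> float:
    """Apply one piece of player evidence to a candidate link's score."""
    r_p = reliability(errors)
    if positive:
        return score + k_pos * r_p  # positive evidence raises the score
    return score - k_neg * r_p      # negative evidence lowers the score


# Hypothetical constants, for illustration only.
K_POS = 0.1
K_NEG = 0.1

# A player with no mistakes contributes the full K_POS.
after_reliable = update_score(0.5, positive=True, errors=0, k_pos=K_POS, k_neg=K_NEG)
# A player with one mistake contributes about 0.6 * K_NEG.
after_one_error = update_score(0.5, positive=False, errors=1, k_pos=K_POS, k_neg=K_NEG)
```

Because the weight decays exponentially with the error count, evidence from a player with six or more mistakes barely moves the score.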
  • 11. Evaluating UrbanMatch precision
      • Accuracy definition (metric specific to Data Quality): the game's ability to correctly assess candidate links
          $\mathit{Accuracy} = \frac{(CM - FP) + (CU - FN)}{CM + CU}$
          • Numerator: candidate links correctly assessed to be either trustable or incorrect
          • Denominator: candidate links assessed to be either trustable or incorrect
      • CM = right links, i.e. candidate links moved from C to M (because $s_\gamma > \mathit{UPPER}$)
      • CU = wrong links, i.e. candidate links moved from C to U (because $s_\gamma < \mathit{LOWER}$)
      • FP = false positives (candidate links assessed as trustable but actually incorrect)
      • FN = false negatives (candidate links assessed as incorrect but actually trustable)
      • Accuracy requires ground truth → computed only on a manually analyzed set of links, for evaluation's sake
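As a worked check of the accuracy formula, the sketch below plugs in counts from the threshold slide (UPPER = 0.70: CM = 68, FP = 4; LOWER = 0.20: CU = 1216, FN = 8). The function name is hypothetical; combining the two rows this way is an assumption, though the sum CM + CU = 1284 agrees with the number of assessed links reported in the results.

```python
def accuracy(cm: int, fp: int, cu: int, fn: int) -> float:
    """Accuracy = ((CM - FP) + (CU - FN)) / (CM + CU)."""
    assessed = cm + cu
    if assessed == 0:
        raise ValueError("no candidate links were assessed")
    return ((cm - fp) + (cu - fn)) / assessed


# Counts taken from the threshold tables (UPPER = 0.70, LOWER = 0.20).
combined = accuracy(cm=68, fp=4, cu=1216, fn=8)

# A single row of the LOWER table can be checked by setting CM = FP = 0.
lower_005 = accuracy(cm=0, fp=0, cu=321, fn=4)
```

Here `combined` equals 1272/1284, i.e. roughly 99.06 to 99.07% depending on rounding, in line with the reported 99.06%; `lower_005` equals 317/321, about 98.75%.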
  • 12. Evaluating UrbanMatch as a GWAP
      • Throughput and ALP (average lifetime play) definition (metrics specific to Games with a Purpose)
          • Game ability to assess candidate links: $\mathit{Throughput} = \frac{\mathit{SolvedProblems}}{\mathit{PlayedTime}}$
          • Game ability to engage players: $\mathit{ALP} = \frac{\mathit{PlayedTime}}{\mathit{ActivePlayers}}$
      • where $\mathit{SolvedProblems} = CM + CU$, in which:
          • CM = right links, i.e. candidate links moved from C to M (because $s_\gamma > \mathit{UPPER}$)
          • CU = wrong links, i.e. candidate links moved from C to U (because $s_\gamma < \mathit{LOWER}$)
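Both GWAP metrics are simple ratios, as the following sketch shows. The function names and the input numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not the evaluation data.

```python
def throughput(cm: int, cu: int, played_hours: float) -> float:
    """Solved problems (CM + CU) per hour of total played time."""
    if played_hours <= 0:
        raise ValueError("played time must be positive")
    return (cm + cu) / played_hours


def average_lifetime_play(played_seconds: float, active_players: int) -> float:
    """Total played time divided by the number of active players, in seconds."""
    if active_players <= 0:
        raise ValueError("at least one active player is required")
    return played_seconds / active_players


# Illustrative values only: 1200 assessed links, 3 hours of play, 54 players.
links_per_hour = throughput(cm=100, cu=1100, played_hours=3.0)
alp_seconds = average_lifetime_play(played_seconds=3 * 3600, active_players=54)
```

With these made-up inputs the game would assess 400 links per hour, and each player would have played for 200 seconds on average.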
  • 13. Setting the thresholds for link score $s_\gamma$
      • Accuracy and Throughput as a function of LOWER

          LOWER        0.05      0.10      0.15      0.20
          CU           321       348       1152      1216
          FN           4         5         7         8
          Accuracy     98.75 %   98.56 %   99.39 %   99.34 %
          Throughput   108.08    117.17    387.87    409.42

      • Accuracy and Throughput as a function of UPPER

          UPPER        0.60      0.65      0.70      0.75      0.80      0.85      0.90
          CM           227       225       68        65        61        60        49
          FP           48        47        4         4         3         3         1
          Accuracy     78.85 %   79.11 %   94.11 %   95.38 %   95.08 %   95.00 %   97.95 %
          Throughput   76.43     75.75     22.89     21.88     20.53     20.20     16.49
  • 14. UrbanMatch evaluation results: Accuracy, Throughput and ALP
      • Evaluation data
          • 54 unique players, 290 games (781 game levels)
          • 2006 input links, 1284 assessed (trustable/incorrect)
          • upper threshold 0.70, lower threshold 0.20
      • Accuracy (on the manually-checked subset)
          • 99.06% → very good result
      • Throughput and ALP (on all processed links)
          • 485 links/h → very good result
          • 3m 17s → one-time game, needs improvement
  • 15. Evaluating UrbanMatch as a whole
      • User-based evaluation: actual "engagement"
      • Based on game evaluation literature and integrated with our research-specific questions
      • Questionnaire at http://bit.ly/um-survey
      • Findings: [Figure: questionnaire results]
  • 16. Conclusions
      • Citizen Computation games are effective at leveraging the linking capabilities of "humans"
      • The game must be designed with users' capabilities and usage context in mind:
          • (lack of) background knowledge
          • small mobile-phone screen
      • Different tasks need different game "missions":
          • Linking task → pairing mission
          • Verifying task → rating/selecting mission
          • Collecting task → selecting/editing mission
      • Advertisement: try out the Urbanopoly game here in Boston, to collect and assess geo-spatial linked data; participant in the Semantic Web Challenge: http://bit.ly/urbanopoly
  • 17. Thanks for your attention! Any questions?
      Irene Celino – CEFRIEL, ICT Institute Politecnico di Milano
      email: Irene.Celino@cefriel.it – web: http://swa.cefriel.it