Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose

Presentation of my paper at EKAW 2018 in Nancy. How can multiple factors be taken into account when evaluating a Game with a Purpose? How are player behaviour and participation influenced by different incentives? How does player engagement impact their accuracy in solving tasks? In this paper, we present a detailed investigation of multiple factors affecting the evaluation of a GWAP and we show how they impact the achieved results. We inform our study with the experimental assessment of a GWAP designed to solve a multinomial classification task.

  1. Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose
     Gloria Re Calegari and Irene Celino
     Nancy, November 15th, 2018 – 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW 2018)
  2. HUMAN-IN-THE-LOOP FOR KNOWLEDGE ACQUISITION
     • Machine learning approaches train automatic models on the basis of a training set, so they require some partial gold standard, often also called "ground truth"
     • Ground truth requires putting the human back in the loop: building a training set for a machine learning pipeline means asking people to execute a set of tasks
     • This knowledge acquisition challenge is usually solved in one of the following ways:
       • Asking experts to put together the training set (but involving experts can be expensive!)
       • Adopting Crowdsourcing and Human Computation approaches, i.e. asking a distributed crowd to collect the required knowledge
  3. CROWDSOURCING & HUMAN COMPUTATION
     • Crowdsourcing is the process of outsourcing tasks to a "crowd" of distributed people (notable examples: Amazon Mechanical Turk, Figure Eight)
     • Human Computation is a computer science technique in which a computational process is performed by outsourcing certain steps to humans, usually when humans are very good at solving those tasks while computers are not (notable example: reCAPTCHA)
     • Games with a Purpose (GWAP) are a Human Computation application that outsources tasks to humans in an entertaining way (notable example: the ESP game)
     • Crowdsourcing and Human Computation approaches have been widely adopted for several knowledge management tasks: collection, enrichment, validation, annotation, ranking, …
     • These approaches differ in their engagement and reward schemes for human participants (ranging from premium access and money prizes to knowledge, recognition, fun and enjoyment)
       • What are the conditions that make it worth adopting a GWAP approach?
       • When and how are GWAPs effective in achieving their goal?
  4. USE CASE: THE NIGHT KNIGHTS GWAP (http://nightknights.eu)
     • Input: a set of pictures and classification categories
     • Goal: associate a category to each picture by assigning a score σ to each picture-category pair
     • The score σ of each picture-category association is updated on the basis of players' choices (see the sketch after this slide)
     • When the score of a picture-category pair exceeds the threshold (σ ≥ t), the association is considered "true" (and the picture is removed from the game)
     • Purpose: identify pictures of cities seen from above among those taken on board the ISS (the pictures are then used in a scientific process in light pollution research)
     • Data collection & validation design: a pure GWAP with a not-so-hidden purpose (but played by anybody); points, badges and a leaderboard as intrinsic reward; a player scores if he/she agrees with another player; "bonus" intrinsic reward with NASA pictures!
     References:
     • Gloria Re Calegari, Gioele Nasi, Irene Celino: Human Computation vs. Machine Learning: an Experimental Comparison for Image Classification. Human Computation Journal, vol. 5, issue 1, 2018.
     • Gloria Re Calegari, Andrea Fiano, Irene Celino: A Framework to build Games with a Purpose for Linked Data Refinement. In proceedings of ISWC 2018, LNCS Volume 11137, pp. 154-169.
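To make the aggregation mechanism concrete, here is a minimal Python sketch of the score update and threshold check described above. The increment rule and the value of t are assumptions for illustration (the slide specifies neither); t = 4 mirrors the "4 users (minimum by design)" figure reported later for easy tasks.

```python
from collections import defaultdict

# Assumed threshold t: the slide does not give its value; 4 matches the
# "4 users (minimum by design)" figure reported for easy tasks.
THRESHOLD_T = 4

scores = defaultdict(int)  # (picture, category) -> score sigma
solved = {}                # picture -> accepted category

def record_choice(picture: str, category: str) -> None:
    """Update sigma for a picture-category pair after one player's choice."""
    if picture in solved:
        return  # the picture was already removed from the game
    scores[(picture, category)] += 1  # assumed increment rule: +1 per choice
    if scores[(picture, category)] >= THRESHOLD_T:
        solved[picture] = category    # association considered "true"

# Example: four agreeing players solve the task
for _ in range(4):
    record_choice("iss_img_001.jpg", "city")
assert solved["iss_img_001.jpg"] == "city"
```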
  5. NIGHT KNIGHTS: DATA AND EVALUATION
     • Reference observation period: 9 months (February-October 2017)
       • 1 month of competition with a tangible reward (joining the 2017 Summer Expedition to observe the solar eclipse in the USA) in June-July 2017
       • 4 months from the game launch to the competition start + 4 months after the competition
     • Data available at https://github.com/STARS4ALL/Night-Knights-dataset
       • ~650 players and ~28,000 classified pictures
       • Released under a Creative Commons 4.0 license
     • Investigation to analyse participation and find profile patterns
       • Standard GWAP metrics
       • Citizen Science metrics
       • Influence of different factors, including incentives, playing style, task difficulty, …
  6. [Q1] HOW DO PARTICIPATION AND RESULTS CHANGE WITH INCENTIVES?
     [Q2] DO THE EXTRINSIC REWARD EFFECTS LAST OVER TIME?
     [Q1]
     • A tangible reward has a clear effect on participation
     • There is a statistically significant difference between the competition and non-competition periods in all evaluation metrics (throughput, average lifetime play, expected contribution; see the metric sketch after this slide)
     [Q2]
     • The incentive effect doesn't seem to last: there is no statistically significant difference between the pre-competition and post-competition periods
     • The overlaps between the sets of players in the different periods are very limited (<10%)

                                  Before    During    After
     Time span (months)                4         1        4
     Classified images             1,830    24,600    1,300
     Contributions                13,000   187,600    3,600
     Users                           285       174      174
     Total play time (hours)          65       471       29
     Throughput (tasks/hour)          69       212      113
     ALP (mins/player)               5.5        65        4
     EC (tasks/user)                 6.4       141      7.5
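A minimal sketch of how the three standard GWAP metrics in the table could be computed from a play log, assuming a hypothetical DataFrame with one row per game round and columns player, tasks_done and duration_s. These column names are illustrative, not the actual schema of the published Night-Knights dataset, and EC is approximated here as the mean number of tasks per user.

```python
import pandas as pd

def gwap_metrics(log: pd.DataFrame) -> dict:
    """Standard GWAP metrics from a per-round play log (assumed schema)."""
    total_hours = log["duration_s"].sum() / 3600
    per_player_mins = log.groupby("player")["duration_s"].sum() / 60
    per_player_tasks = log.groupby("player")["tasks_done"].sum()
    return {
        "throughput (tasks/hour)": log["tasks_done"].sum() / total_hours,
        "ALP (mins/player)": per_player_mins.mean(),
        "EC (tasks/user)": per_player_tasks.mean(),
    }

# Usage: compute the metrics per period and compare, e.g.
# gwap_metrics(log[log["period"] == "during"])
```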
  7. [Q3] DOES PLAYING STYLE CHANGE WITH THE INCENTIVE?
     • Contribution speed = number of images played in each game round
       • Estimation: 3-5 seconds/photo, 1 min round → ~15 images/round
     • During the competition (extrinsic motivation):
       • Normal distribution centred around 15 pictures/round
       • Players tried to classify as many pictures as possible
     • Before and after the competition (intrinsic motivation):
       • Almost flat distribution with median < 10 images/round
       • Players adopted a more "relaxed" playing style (a distribution-comparison sketch follows this slide)
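The slides report the distributional shift without naming a test; one reasonable way to check it is a non-parametric Mann-Whitney U test on the images-per-round counts of the two periods, sketched below with toy data (not the real dataset).

```python
from scipy.stats import mannwhitneyu

def compare_contribution_speed(rounds_compet, rounds_other, alpha=0.05):
    """Compare images-per-round distributions of two periods."""
    stat, p_value = mannwhitneyu(rounds_compet, rounds_other,
                                 alternative="two-sided")
    return {"U": stat, "p": p_value, "significant": p_value < alpha}

# Toy data for illustration only
print(compare_contribution_speed([15, 14, 16, 15, 17, 13],
                                 [5, 9, 3, 8, 12, 6]))
```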
  8. [Q4] HOW DO GWAPS COMPARE TO TRADITIONAL CITIZEN SCIENCE?
     [Q5] WHAT DOES PLAYER BEHAVIOUR TELL ABOUT THE GAME NATURE?
     [Q4]
     • Engagement metrics from the Citizen Science literature: activity ratio (AR, % of active days), daily devoted time (DDT, in hours), relative active duration (RAD, wrt the reference period), variation in periodicity (VIP, std of intervals between active days) – see the sketch after this slide
     • Players show very different behaviour:
       • 2-3 times higher AR, consistently higher DDT and RAD
       • Significantly lower VIP
       • Clustering yields a single group of "hardworkers" covering 90% of players (high AR and low VIP); the other behaviours known from Citizen Science are not observed
     [Q5]
     • Casual game, judging by total active time (last round – first round):
       • 75% of players played for less than 5 minutes
       • 10% of players played for more than 1 day

             NK (global)   NK (compet.)   MW (*)   GZ (*)   WI (**)
     AR          0.96           0.95       0.40     0.33     0.32
     DDT         0.68           1.80       0.44     0.32       -
     RAD           -            0.54       0.20     0.23     0.43
     VIP        14.53           2.53      18.27    25.23     5.11

     Citizen Science campaigns from the reference literature:
     (*) Ponciano, L., Brasileiro, F.: Finding volunteers' engagement profiles in human computation for citizen science projects. Human Computation Journal, 2015.
     (**) Aristeidou, M., Scanlon, E., Sharples, M.: Profiles of engagement in online communities of citizen science participation. Computers in Human Behavior, 2017.
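A sketch of three of the engagement metrics, assuming each player's round timestamps are available as a pandas Series; the exact definitions in Ponciano & Brasileiro (2015) may differ in detail, and DDT is omitted because it additionally needs per-day play durations.

```python
import pandas as pd

def engagement_metrics(timestamps: pd.Series, ref_period_days: int) -> dict:
    """AR, RAD and VIP for one player from their round timestamps."""
    days = (pd.to_datetime(timestamps).dt.normalize()
            .drop_duplicates().sort_values())
    lifetime_days = (days.iloc[-1] - days.iloc[0]).days + 1
    intervals = days.diff().dropna().dt.days
    return {
        "AR": len(days) / lifetime_days,         # share of active days
        "RAD": lifetime_days / ref_period_days,  # lifetime vs reference period
        "VIP": float(intervals.std()) if len(intervals) > 1 else 0.0,
    }
```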
  9. [Q6] WHAT KIND OF GWAP PLAYER PROFILES CAN BE IDENTIFIED?
     • Player accuracy = how many tasks each player correctly solved over the total number of tasks they played (correct wrt the aggregated solution)
     • Player participation = total number of contributions given by the player
     • Threshold on the accuracy axis → accurate / inaccurate player distinction
     • Threshold on the participation axis → casual / frequent player distinction
     • Four different player profiles (see the classification sketch after this slide):
       • Beginners (low participation, low accuracy)
       • Snipers (low participation, high accuracy)
       • Champions (high participation, high accuracy)
       • Trolls (high participation, low accuracy)
     • Distribution of contributions across profiles:

                       Beginners   Snipers   Champions   Trolls
       Contributions      0.7%       0.4%      95.9%      3.0%
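The two-threshold quadrant scheme translates directly into code. The threshold values acc_t and part_t below are placeholders (the slide does not report the values actually used); in practice they would be chosen from the data.

```python
def player_profile(accuracy: float, participation: int,
                   acc_t: float = 0.8, part_t: int = 50) -> str:
    """Quadrant profile from accuracy and participation thresholds
    (acc_t and part_t are illustrative placeholder values)."""
    accurate = accuracy >= acc_t
    frequent = participation >= part_t
    if frequent:
        return "Champion" if accurate else "Troll"
    return "Sniper" if accurate else "Beginner"

assert player_profile(0.9, 200) == "Champion"
assert player_profile(0.3, 200) == "Troll"
assert player_profile(0.9, 5) == "Sniper"
assert player_profile(0.3, 5) == "Beginner"
```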
  10. [Q7] DOES PLAYER BEHAVIOUR CHANGE WITH DIFFERENT INCENTIVES?
      • During the competition (extrinsic motivation period):
        • Majority of champions (high participation, high accuracy) → maybe a learning effect?
        • Higher average accuracy (statistically significant difference) for both casual and frequent players (7% improvement in both cases) → higher attention brings higher quality
      • Before/after the competition (intrinsic motivation period):
        • (Relative) majority of beginners (low participation, low accuracy) → maybe due to curiosity or a "first try"
        • Higher variability of accuracy values (height of boxplots)
      • In all periods: limited number of trolls, and always a majority of accurate players (snipers + champions, 64%)
  11. [Q8] DOES PLAYER BEHAVIOUR CHANGE WITH TASK DIFFICULTY?
      [Q9] DOES PLAYER BEHAVIOUR CHANGE WITH TASK VARIETY?
      • Task difficulty = number of different users needed to solve the task, i.e. to find an agreement by aggregating user contributions (see the sketch after this slide)
        • Easy tasks: 4 users (minimum by design), 58% of all tasks
        • Difficult tasks: 5 to 17 users
      • Accuracy variability with task difficulty:
        • No difference between casual and frequent players on easy tasks
        • Statistically significant difference between casual and frequent players on difficult tasks → learning effect (the more they play, the higher the accuracy)
      • Accuracy variability with task variety (different classes):
        • Some classes are indeed "more difficult" than others
        • No difference between casual and frequent players across classes → indeed anybody can be a classifier (no expert knowledge required)
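A sketch of the difficulty measure, assuming a hypothetical contributions table with columns task and user (illustrative names, not the dataset's actual schema); since a picture is removed from the game once agreement is reached, the distinct contributors per task approximate the number of users needed to solve it.

```python
import pandas as pd

def task_difficulty(contribs: pd.DataFrame) -> pd.Series:
    """Distinct users who contributed to each task before it was solved."""
    return contribs.groupby("task")["user"].nunique()

def split_by_difficulty(contribs: pd.DataFrame):
    d = task_difficulty(contribs)
    easy = d[d == 4].index       # 4 users: the minimum by design
    difficult = d[d >= 5].index  # 5 to 17 users observed in the dataset
    return easy, difficult
```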
  12. CONCLUSIONS
      • GWAPs are an effective "human in the loop" method to engage a target community in a process of knowledge management (e.g. to collect a large enough training set for machine learning)
        • Still, they are among the less explored and evaluated Human Computation approaches
      • Investigation of the interplay of different factors in GWAP evaluation:
        • Game incentives, player participation profiles, task difficulty, …
      • A framework to analyse a GWAP and assess the effectiveness of your target community's involvement in knowledge acquisition and management
        • Quantitative results are specific to the analysed game, but the approach is completely replicable
      • A method to identify the strengths and weaknesses of a GWAP and to plan improvements
  13. Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose
      Gloria Re Calegari and Irene Celino

      Contact: Irene Celino, Head of Knowledge Technologies Group, Cefriel - Politecnico di Milano
      irene.celino@cefriel.com – iricelino.org – Cefriel.com

      MILANO: viale Sarca 226, 20126, Milano - Italy
      LONDON: 4th floor, 57 Rathbone Place, London W1T 1JU - UK
      NEW YORK: One Liberty Plaza, 165 Broadway, 23rd Floor, New York City, New York, 10006 USA

      This work was partially supported by the STARS4ALL project (H2020-688135) co-funded by the European Commission.
      Icons made by Eucalyp from www.flaticon.com
