DevilTyper: A Game for CAPTCHA Usability Evaluation


CAPTCHA is an effective and widely used solution for preventing computer programs (i.e., bots) from performing automated but often malicious actions, such as registering thousands of free email accounts or posting advertisements on blogs. To make CAPTCHAs robust against automatic character recognition, the text in the tests is often distorted, blurred, and obscured. At the same time, such robust tests may prevent genuine users from reading the text easily, and thus distribute the cost of crime prevention among all users. We therefore face a dilemma: a CAPTCHA should be robust enough that programs cannot break it, yet easy enough that users need not retake tests repeatedly because of wrong guesses. In this article, we attempt to resolve the dilemma by proposing a human computation game for quantifying the usability of CAPTCHAs. In our game, DevilTyper, players try to defeat as many devils as possible by solving CAPTCHAs, and their behavior in completing each CAPTCHA is recorded at the same time. We can therefore evaluate the usability of CAPTCHAs by analyzing the collected player inputs. Since DevilTyper itself provides entertainment, we can conduct a large-scale study of CAPTCHA usability without the resource overhead required by traditional survey-based studies. In addition, we propose a consistent and reliable metric for assessing usability. Our evaluation results show that DevilTyper provides a fun and efficient platform for CAPTCHA designers to assess the usability of their CAPTCHAs and thus improve CAPTCHA design.


Chien-Ju Ho1, Chen-Chi Wu2,
Kuan-Ta Chen1, Chin-Kuang Lai2
Presenter: Derec Wu1
1Institute of Information Science, Academia Sinica
2Department of Electrical Engineering, National Taiwan University

 Acronym for Completely Automated Public Turing test to tell Computers and Humans Apart
 Challenge-response test
 Requires users to type letters or digits from a distorted image to distinguish humans from computers

 CAPTCHA tests must be
 Secure
▪ Hard for computers
▪ Prevent computer programs from performing automated malicious tasks
 Usable
▪ Easy for human beings?

Human Usability vs. Computational Challenges

 Determine the difficulty of the CAPTCHA test for human beings
 Traditional approach
 Human survey
▪ Costs a lot of money
▪ Difficult to scale up

 A human computation game for CAPTCHA usability evaluation
 Players are engaged to solve the problem for us while having fun themselves
 Lower monetary cost and easier to scale up

 Overview
 CAPTCHA
 Why DevilTyper
 DevilTyper Design
 Experiment
 Experiment setup
 Results
 Conclusion

 Each devil carries a CAPTCHA test
 Players must solve the test correctly to win the game
 Player behaviors are recorded and used to evaluate the usability of the CAPTCHA

 Players must solve the CAPTCHA before the devil descending from the top reaches the bottom
 Players score points by solving CAPTCHAs
 Players lose HP if the devil reaches the bottom

 High score lists are maintained to encourage players to play more

 The following player behaviors are collected for each CAPTCHA test
 Finish time
 Rate of typing errors
 Rate of giving up the test
 Rate of repeated typing
 Rate of failing to solve the test within the time limit
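
The five behaviors listed above can be aggregated from per-attempt log records. The sketch below is only an illustration: the `Attempt` record, its field names, and the outcome labels are assumptions made for this example, not the logging format actually used by DevilTyper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    """One player's attempt at one CAPTCHA (hypothetical record layout)."""
    captcha_type: str
    seconds: float   # time until the attempt ended
    typos: int       # wrong submissions during the attempt
    retyped: bool    # player cleared the input and typed again
    outcome: str     # "solved", "gave_up", or "timeout"


def summarize(attempts):
    """Aggregate the five usability indicators over a list of attempts."""
    if not attempts:
        raise ValueError("need at least one attempt")
    n = len(attempts)
    solved = [a for a in attempts if a.outcome == "solved"]
    return {
        # Mean finish time is taken over solved attempts only (an assumption).
        "finish_time": mean(a.seconds for a in solved) if solved else None,
        "typing_error_rate": sum(a.typos > 0 for a in attempts) / n,
        "give_up_rate": sum(a.outcome == "gave_up" for a in attempts) / n,
        "repeated_typing_rate": sum(a.retyped for a in attempts) / n,
        "timeout_rate": sum(a.outcome == "timeout" for a in attempts) / n,
    }


# Made-up records, for illustration only.
log = [
    Attempt("A", 3.0, 0, False, "solved"),
    Attempt("A", 5.0, 1, True, "solved"),
    Attempt("A", 8.0, 2, False, "timeout"),
    Attempt("A", 2.0, 0, False, "gave_up"),
]
print(summarize(log))
```

Grouping the records by `captcha_type` before calling `summarize` would give one set of indicators per CAPTCHA scheme.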

 We announced the game on PTT, a popular social network, and held a four-week campaign
 Total cost: US$30
 Total number of games played: 6,500
 Total number of CAPTCHAs solved: 1,407,055
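
To put the campaign totals in perspective, the per-unit figures can be derived with simple arithmetic. The two ratios computed below are not stated on the slide; they follow only from the three totals quoted above.

```python
# Totals quoted on the slide.
TOTAL_COST_USD = 30
GAMES_PLAYED = 6_500
CAPTCHAS_SOLVED = 1_407_055

# Derived ratios (not stated in the slides).
cost_per_captcha = TOTAL_COST_USD / CAPTCHAS_SOLVED
captchas_per_game = CAPTCHAS_SOLVED / GAMES_PLAYED

print(f"Cost per solved CAPTCHA: ${cost_per_captcha:.7f}")
print(f"Average CAPTCHAs solved per game: {captchas_per_game:.1f}")
```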

 The results of different metrics are consistent
* A-F: different types of CAPTCHAs
* The results are normalized to the range 0 to 1 for comparison
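
The slide does not say how the results were normalized. A common choice is min-max scaling, sketched below under that assumption; the function name and the sample values are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {captcha_type: value} mapping linearly onto [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


# Made-up raw values for three CAPTCHA types, for illustration only.
raw_finish_time = {"A": 2.0, "B": 4.0, "C": 6.0}
print(min_max_normalize(raw_finish_time))
```

Applying the same rescaling to each metric puts quantities with different units, such as seconds and error rates, on a common scale.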

 The DevilTyper results are consistent with the traditional survey method (Mechanical Turk)
* A-F: different types of CAPTCHAs
* The results are normalized to the range 0 to 1 for comparison
DevilTyper provides an open platform for evaluating CAPTCHA usability
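
One way to check whether two evaluation methods agree is to compare how they rank the CAPTCHA types. The slides do not state which consistency measure was used; the sketch below assumes Spearman's rank correlation and assumes there are no tied values. The sample numbers are placeholders, not results from the study.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for two equal-length sequences without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Placeholder difficulty scores for four CAPTCHA types under two methods.
game_scores = [0.1, 0.4, 0.6, 0.9]
survey_scores = [0.2, 0.7, 0.5, 0.8]
print(spearman(game_scores, survey_scores))
```

A value close to 1 means both methods order the CAPTCHA types almost identically; a value close to -1 means the orderings are reversed.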

Design factors analysis using
DevilTyper

 Three strategies for text distortion in CoolCAPTCHA
Character Distance, X-Axis Wave, Y-Axis Wave

 Three strategies for noise addition in TgCAPTCHA
Long Arcs Noise, Short Arcs Noise, Short Lines Noise

 The difficulty of recognizing each character in different CAPTCHA types can be determined
“i” is hardly recognizable in TgCAPTCHA
“i” is easier to recognize in CoolCAPTCHA
(Figure labels: Q, V, C, T)
 We proposed a human computation game, DevilTyper, for evaluating CAPTCHA usability
 Monetary cost is much lower than that of traditional surveys
 Evaluation is easier to scale up
 We show how this open platform can help CAPTCHA designers design more user-friendly CAPTCHAs

Thank you
http://deviltyper.iis.sinica.edu.tw/

Editor's Notes

To ensure that the response is not generated by a computer
The common procedures to generate such images often include distortion, overlapping, clipping, and noise addition. These procedures are performed to make image recognition algorithms unable to resolve the text in the images. However, the distortion of the text should be kept at a reasonable level so that humans can still read the text clearly.
The most intuitive way to assess the usability of CAPTCHAs is to ask numerous human subjects to solve assigned CAPTCHAs repeatedly. However, such surveys are cost prohibitive if a large-scale study is required and the investigated CAPTCHAs are constantly updated. For example, investigating how different background noises affect user perception would require a large number of user inputs, which requires a significant monetary investment in user studies.
DevilTyper provides an open platform for evaluating CAPTCHA usability
(a) AuthImage, (b) Captcher, (c) Kiranvj, (d) SecurImage, (e) Plain Text, (f) CoolCAPTCHA, (g) TgCAPTCHA
Character distance stands for the distance between characters. In our experiment, we randomly set the character distance between 0.8 and 1.3, where a larger value corresponds to a tighter character arrangement.
X-axis wave controls the degree of sine-wave distortion of characters along the x-axis. In the experiment, this parameter is randomly set within the range from 0.5 to 1.2, where a larger magnitude corresponds to stronger distortion. The x-axis wave distortion does not have a systematic influence on users' error rate, which implies that this type of distortion does not harm the CAPTCHA's usability.
Y-axis wave controls the degree of sine-wave distortion of characters along the y-axis, which we set within the range from 0.5 to 1.2 in our experiments. The y-axis distortions have a much more significant impact on CAPTCHA usability than x-axis distortions. Therefore, CAPTCHA designers should be careful in choosing the appropriate degree of this type of distortion when adopting such CAPTCHAs in real use.
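
The notes say each distortion parameter is "randomly set" within a range but do not name the distribution. The sketch below assumes a uniform draw; the dictionary keys and the function name are hypothetical and are not CoolCAPTCHA's actual configuration names.

```python
import random

# Ranges quoted in the notes; the keys are hypothetical parameter names.
DISTORTION_RANGES = {
    "char_distance": (0.8, 1.3),
    "x_wave": (0.5, 1.2),
    "y_wave": (0.5, 1.2),
}


def sample_distortion(rng):
    """Draw one value per parameter, assuming a uniform distribution."""
    return {
        name: rng.uniform(lo, hi)
        for name, (lo, hi) in DISTORTION_RANGES.items()
    }


rng = random.Random(42)  # fixed seed so the sketch is repeatable
settings = [sample_distortion(rng) for _ in range(3)]
for setting in settings:
    print({name: round(value, 3) for name, value in setting.items()})
```

Logging the sampled setting alongside each player's answer is what makes it possible to relate error rate to the strength of a single distortion afterwards.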
We use TgCAPTCHA, which is similar to the previous Microsoft CAPTCHA scheme, to demonstrate how such analysis is done using the traces produced by DevilTyper.
Long Arcs
The long arcs parameter controls the number of long arcs overlaid on the image, where the position, length, and curvature of the arcs are randomly chosen. In the experiment, we set this parameter between 0 and 5. We can see that the long arcs do not influence the usability of the CAPTCHAs significantly, even when 5 long arcs were added.
Short Arcs
Similar to long arcs, the short arcs parameter controls the number of short arcs overlaid on the image. In our experiment, the number of short arcs is randomly drawn from the range 0 to 20. Interestingly, while long arcs do not impact the CAPTCHA's usability, short arcs do, as shown in Figure 13(b). We believe this is because the length of the short arcs is similar to that of the character strokes, so short arcs are more likely to interfere with the distorted text and increase the difficulty of text recognition.
Short Lines
The short lines parameter controls the number of short lines overlaid on the rendered CAPTCHA. As with long and short arcs, the position, length, and direction of each segment are randomly decided. Our results show that users' average error rates slightly but steadily increase with more short lines, as shown in Figure 13(c). However, the impact of short lines is slightly less than that of short arcs, which is reasonable because arcs are more like the strokes of distorted text and therefore interfere more with readers' recognition.
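
The effect of a noise parameter can be examined by grouping answers by the number of noise elements drawn on the image and computing the error rate in each group. This is a sketch under assumptions: the `(level, was_error)` trial format is hypothetical, and the sample trials are made up rather than taken from the study.

```python
from collections import defaultdict


def error_rate_by_level(trials):
    """Return {noise level: error rate} from (level, was_error) pairs."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for level, was_error in trials:
        totals[level] += 1
        errors[level] += bool(was_error)
    return {level: errors[level] / totals[level] for level in sorted(totals)}


# Made-up trials: (number of short arcs on the image, answer was wrong).
trials = [(0, False), (0, False), (10, False), (10, True), (20, True), (20, True)]
print(error_rate_by_level(trials))
```

A flat curve across levels would match the reported behavior of long arcs, while a rising curve would match the reported behavior of short arcs and short lines.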
Each CAPTCHA scheme has its own obscuration algorithm to distort the text, which may have different impacts on the recognition difficulty of different characters. We believe such results provide helpful information when designing and applying CAPTCHAs. One obvious application is that, if a user correctly solves all the characters except a ‘C’ character with the SecurImage scheme, we may allow the user to pass the test, as the ‘C’ character is really difficult to recognize with that scheme.
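
The lenient grading idea described above can be sketched as a comparison that forgives one mismatch when the expected character is known to be hard in the scheme at hand. The function name, the one-mismatch limit, and the `hard_chars` set are assumptions made for this illustration; the notes do not specify an exact rule.

```python
def lenient_match(expected, typed, hard_chars):
    """Accept an exact match, or a single mismatch on a known-hard character."""
    if len(expected) != len(typed):
        return False
    mismatched = [e for e, t in zip(expected, typed) if e != t]
    if not mismatched:
        return True
    return len(mismatched) == 1 and mismatched[0] in hard_chars


# Hypothetical set of hard characters for one scheme.
hard_for_scheme = {"C"}
print(lenient_match("ACXD", "AGXD", hard_for_scheme))  # the 'C' was misread
print(lenient_match("ACXD", "ACXE", hard_for_scheme))  # the 'D' was misread
```

Any such relaxation also makes the test easier for bots, so the set of forgiven characters would need to stay small.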