2. Remember This?
• CAPTCHA
  – Completely
  – Automated
  – Public
  – Turing test to tell
  – Computers and
  – Humans
  – Apart
• Security for the website: agreed
• But for the real users?
• BORING task
• Waste of time
Story of reCAPTCHA www.crmit.com
© Copyright 2013 CRMIT. All rights reserved.
3. CAPTCHA
• Yahoo! popularized it first
• Later, almost every website started using CAPTCHA to avoid automated attacks
• Very effective: only humans can reliably solve those word / image puzzles
• But it is a waste of time too
  – Assume you spend 10 seconds on a CAPTCHA
  – Multiply by 200 million CAPTCHAs every day
  – That is more than 500,000 hours wasted every day
• Can something be done about this? (1)
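The wasted-time estimate on this slide can be checked with a few lines of arithmetic. This is only a sketch: the 10-second and 200-million figures are the slide's assumptions, and the function name is made up for illustration.

```python
def hours_spent_per_day(seconds_per_captcha: float, captchas_per_day: int) -> float:
    """Total human time spent solving CAPTCHAs per day, in hours."""
    total_seconds = seconds_per_captcha * captchas_per_day
    return total_seconds / 3600  # 3600 seconds in one hour


# Figures assumed on the slide: 10 seconds each, 200 million per day.
hours = hours_spent_per_day(10, 200_000_000)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```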
4. Another Problem
• Digitizing Books
• Process:
  – Stage 1
    • Scan
    • Convert to image
    • Save
  – Stage 2
    • Use OCR to convert images to text
    • Searchable text
5. OCR
• Optical Character Recognition
• Wonderful technology
• But not always reliable
• Especially with old text (due to old typefaces, damage, stains, etc.)
• Can something be done about this? (2)
6. Possible Solutions
• Manual corrections
  – Nearly impossible
  – VERY expensive
• Using multiple OCR programs
  – They will still make mistakes
  – But not the same mistakes
  – Hopefully!
• Can something be done about this? (3)
7. Crowdsourcing
• Assume each book contains 25,000 words
  – Can we split them among 25 people, each correcting 1,000 words?
  – Or 50 people, each 500 words?
  – Or 100 people, each 250 words?
  – Or 2,500 people, each 10 words?
  – Or 25,000 people, each 1 word?
• Sounds stupid?
  – Think again!
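The splits listed above can be reproduced with a small sketch. The 25,000-word book is the slide's assumption; the `split_words` helper and the placeholder words are hypothetical and only illustrate how the workload per person shrinks as more people take part.

```python
def split_words(words: list, people: int) -> list:
    """Split `words` into `people` contiguous chunks whose sizes differ by at most one."""
    if people <= 0:
        raise ValueError("people must be positive")
    base, extra = divmod(len(words), people)
    chunks = []
    start = 0
    for i in range(people):
        # The first `extra` people take one additional word each.
        size = base + (1 if i < extra else 0)
        chunks.append(words[start:start + size])
        start += size
    return chunks


# A stand-in "book" of 25,000 placeholder words (the size assumed on the slide).
book = [f"word{i}" for i in range(25_000)]

for people in (25, 50, 100, 2_500, 25_000):
    chunks = split_words(book, people)
    print(f"{people:>6} people -> {len(chunks[0]):>5} words each")
```

With 25,000 people, each chunk holds exactly one word, which is the case the rest of the deck builds on.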
8. Dr. Luis von Ahn
• Associate Professor at Carnegie Mellon University
• Coined the word CAPTCHA
• Pioneer in the field of crowdsourcing
• Founder of the company reCAPTCHA (later acquired by Google)
9. reCAPTCHA
10. reCAPTCHA Process
• Step 1: Using multiple OCR programs
  – Accept matching words
  – Use a dictionary
  – Flag “problematic” words
• Step 2: reCAPTCHA
  – Millions of users on various websites fill in reCAPTCHA forms
    • Proving they are not robots
    • Proofreading text, one word at a time
  – Similar entries are compared before arriving at the final word
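The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual reCAPTCHA implementation: the tiny `DICTIONARY`, the assumption that both OCR outputs are already aligned word for word, and the `min_votes` / `min_share` thresholds are all invented for the example.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical tiny dictionary; a real system would use a full word list.
DICTIONARY = {"the", "quick", "brown", "fox"}


def flag_words(ocr_a: List[str], ocr_b: List[str]) -> List[Tuple[int, Optional[str]]]:
    """Step 1: accept a word only if both OCR outputs agree and it is in the dictionary.

    Returns (position, word) for accepted words and (position, None) for
    flagged ("problematic") words. Assumes the outputs are aligned word for word.
    """
    results = []
    for i, (a, b) in enumerate(zip(ocr_a, ocr_b)):
        if a == b and a.lower() in DICTIONARY:
            results.append((i, a))
        else:
            results.append((i, None))
    return results


def resolve_word(entries: List[str], min_votes: int = 3, min_share: float = 0.6) -> Optional[str]:
    """Step 2: compare human entries and settle on a word once enough of them agree."""
    normalized = [e.strip().lower() for e in entries if e.strip()]
    if len(normalized) < min_votes:
        return None  # not enough answers yet
    word, count = Counter(normalized).most_common(1)[0]
    return word if count / len(normalized) >= min_share else None


ocr_a = ["The", "quick", "brovvn", "fox"]  # first OCR program misreads "brown"
ocr_b = ["The", "quick", "brown", "fox"]
flagged = [pos for pos, word in flag_words(ocr_a, ocr_b) if word is None]
print(flagged)

# Answers typed by different users for the flagged word.
print(resolve_word(["brown", "Brown", "brown ", "brawn"]))
```

Here only position 2 is flagged, and three of the four human entries agree on "brown", so that becomes the final word.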
11. How It Works
[Image: a reCAPTCHA challenge showing two words, labeled "Flagged Word" and "Control Word (Real CAPTCHA)"]
Remember “25,000 people, proofreading 1 word at a time”?
Not “stupid” anymore!
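A minimal sketch of the two-word idea, assuming (as the slide suggests) that only the control word decides whether the user passes, while the answer for the flagged word is kept as one proofreading vote. The `Challenge` class, its field names, and the case-insensitive matching are hypothetical choices made for illustration; the slides do not describe the real matching rules.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Challenge:
    control_word: str        # word whose correct answer is already known
    flagged_word_id: str     # identifier of the word OCR could not read
    votes: List[str] = field(default_factory=list)  # human answers for the flagged word


def normalize(text: str) -> str:
    return text.strip().lower()


def submit(challenge: Challenge, control_answer: str, flagged_answer: str) -> bool:
    """Return True if the user passes the human check.

    The flagged-word answer is recorded as a vote only when the control
    word was typed correctly, so failed attempts do not pollute the votes.
    """
    if normalize(control_answer) != normalize(challenge.control_word):
        return False
    challenge.votes.append(normalize(flagged_answer))
    return True


challenge = Challenge(control_word="morning", flagged_word_id="book42-word1337")
print(submit(challenge, "Morning", "upon"))    # control word correct: vote recorded
print(submit(challenge, "mourning", "open"))   # control word wrong: vote discarded
print(challenge.votes)
```

The user cannot tell which of the two words is the control word, so an honest attempt at both is the easiest way to pass.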
12. A Few Statistics
• 100M+ reCAPTCHAs every day
• 96,000+ websites
  – Most major websites use it
    • Facebook, Twitter, CNN, etc.
• Security concerns exist!
13. What We Can Do
• Use reCAPTCHA instead of CAPTCHA in your websites, wherever required
  – Registration forms, blogs, forums, etc.
  – Easy-to-use widgets
• Be proud when filling in a reCAPTCHA form
  – You are helping Google preserve books ☺
14. Applying Crowdsourcing
• Can it solve some of your existing problems?
15. References, Image Credits
• https://www.youtube.com/watch?v=VoybhowC4LE
• http://www.nytimes.com/2011/03/29/science/29recaptcha.html?_r=1&
• http://techie-buzz.com/tech-news/recaptcha-crowdsourcing-ocr-google-books.html
• http://www.google.com/recaptcha
• http://drupal.org/project/captcha
• http://www.captcha.net/
• http://www.brothersoft.com/cuneiform-ocr-4384.html
• http://www.compzets.com/view-upload.php?id=166&action=view
• http://en.wikipedia.org/
16. Thank you