Crowdsourcing PWI Sept-2011
• Introduction
• Crowd Motivation
• Client Motivations and Types of tasks
• Scale up with Machine Learning
• Quality Management
• Workflows for Complex tasks
• Reputation Systems
• Economic shift
PWI - September 29, 2011
corina@waterloohills.com
http://bitsofknowledge.waterloohills.com

Crowdsourcing
Crowd or Community (online audience)

[Figure: diagram with four numbered elements, 1 to 4]

Ex: “Adult Websites” Classification
• Large number of sites to label
• Get people to look at sites and classify them as:

  – G (general audience)
  – PG (parental guidance)
  – R (restricted)
  – X (porn)

[Panos Ipeirotis. WWW2011 tutorial]

Ex: “Adult Websites” Classification
• Large number of hand‐labeled sites
• Get people to look at sites and classify them as:

  – G (general audience)
  – PG (parental guidance)
  – R (restricted)
  – X (porn)

Cost/Speed Statistics:
• Undergrad intern: 200 websites/hr, cost: $15/hr
• MTurk: 2500 websites/hr, cost: $12/hr

[Panos Ipeirotis. WWW2011 tutorial]
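As a quick check on the figures above, the cost of labeling one site is the hourly cost divided by the hourly throughput. The sketch below only does that arithmetic; the function name is illustrative, and the inputs are the numbers quoted on the slide.

```python
def cost_per_site(cost_per_hour: float, sites_per_hour: float) -> float:
    """Return the labeling cost of a single website, in dollars."""
    return cost_per_hour / sites_per_hour


intern = cost_per_site(15.0, 200)    # undergrad intern: $15/hr, 200 sites/hr
mturk = cost_per_site(12.0, 2500)    # MTurk: $12/hr, 2500 sites/hr

print(f"Intern: ${intern:.4f} per site")
print(f"MTurk:  ${mturk:.4f} per site")
print(f"Cost ratio (intern / MTurk): {intern / mturk:.1f}")
```

With the slide's numbers this is 7.5 cents per site for the intern against roughly half a cent on MTurk, on top of the 12.5 times higher throughput.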
Crowd Motivation

• €,$ = Money!
• Self-serving purposes (learning new skills,
  gaining recognition, avoiding boredom, enjoyment,
  building a network with other professionals)
• Socializing, feeling of belonging to a
  community, friendship
• Altruism (public good, help others)


Examples: Altruism




Crowd Demography
  (background defines motivation)
• The 2008 survey at iStockphoto [Brabham, 2008] indicates
  that the crowd is quite homogeneous and elite.

• Amazon’s Mechanical Turk workers come
  mainly from 2 countries:
  a) USA
  b) India


Crowd Demography




Client motivation
• Need Suppliers:

  – Mass work, distributed work, or just tedious work
  – Creative work
  – Looking for specific talent
  – Testing
  – Support
  – Offloading peak demand
  – Tackling problems that need specific communities
    or human variety
  – Any work that can be done more cheaply this way

Client motivation

• Need customers!

• Need Funding

• Need to be Backed up

• Crowdsourcing is your business!



Examples of Funding




Client Task Goals
Three main goals for any task:

1. Minimize Cost (cheap)
2. Minimize Completion Time (fast)
3. Maximize Quality (good)

→ Remember Crowd Motivation!
  (e.g., gamify your task,
  explain its final purpose)

Examples: Games




[Panos Ipeirotis. WWW2011 tutorial]

Pros
• Quicker: Parallelism reduces time
• Cheap
• Creativity, Innovation
• Quality (*depends)
• Access to scarce resources: The ‘long tail’
• Multiple feedback
• Lets you build a community (followers)
• Business Agility
• Scales up! (*up to a level)

Cons
• Lack of professionalism: Unverified quality
• Too many answers
• No standards
• Not always cheap: added costs to bring a project to conclusion
• Too few participants if the task or pay is not attractive
• Unmotivated workers produce lower-quality work


Scale Up with Machine Learning
    Build an ‘Adult Website’ Classifier

• Crowdsourcing is cheap but not free
  - Workers cannot do more than xx hours/day
  - Cannot scale to the web without help

Build automatic classification models using
 examples from crowdsourced data


Integration with Machine Learning

• Humans label training data
• Use training data to build model
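The two steps above can be sketched with a tiny text classifier. This is a minimal illustration, assuming pages are reduced to bags of words and using a hand-written multinomial naive Bayes with add-one smoothing; the training snippets and function names are made up for the example, and a real system would need far more labeled data and richer features.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Fit a multinomial naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the most probable label for text, using add-one smoothing."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical crowd-labeled training data: (page text, label).
crowd_labeled = [
    ("family recipes and cooking tips", "G"),
    ("school homework help for kids", "G"),
    ("explicit adult videos", "X"),
    ("adult explicit content gallery", "X"),
]
model = train(crowd_labeled)
print(predict(model, "cooking tips for kids"))
print(predict(model, "explicit gallery"))
```

Once trained, the model labels new pages at machine speed, and the crowd is only needed for fresh training examples.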




Quality Management
       Ex: “Adult Website” Classification
• Bad news: Spammers!
• Worker ATAMRO447HWJQ labeled
  X (porn) sites as G (general audience)




[Panos Ipeirotis. WWW2011 tutorial]

Quality Management
   Majority Voting and Label Quality
• Spammers try to go undetected
• Well-meaning workers may be biased
  → the two are difficult to tell apart

1. Ask multiple labelers
2. Keep majority label as
   “true” label

Use the probability of
being correct as the
Quality Indicator
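A minimal sketch of the two steps above, assuming each vote is a (worker, item, label) triple. Worker quality is estimated here as the rate of agreement with the majority label, which is only a rough proxy for the probability of being correct; the worker and site names are hypothetical.

```python
from collections import Counter, defaultdict


def majority_labels(votes):
    """Map each item to the label chosen by most workers.

    votes: iterable of (worker_id, item_id, label) triples.
    Ties go to the label that was seen first.
    """
    by_item = defaultdict(Counter)
    for _worker, item, label in votes:
        by_item[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in by_item.items()}


def worker_quality(votes, consensus):
    """Estimate each worker's quality as agreement rate with the majority."""
    agree, total = Counter(), Counter()
    for worker, item, label in votes:
        total[worker] += 1
        agree[worker] += int(label == consensus[item])
    return {worker: agree[worker] / total[worker] for worker in total}


# Hypothetical votes: worker w3 answers "G" for everything.
votes = [
    ("w1", "site-a", "G"), ("w2", "site-a", "G"), ("w3", "site-a", "G"),
    ("w1", "site-b", "X"), ("w2", "site-b", "X"), ("w3", "site-b", "G"),
    ("w1", "site-c", "R"), ("w2", "site-c", "R"), ("w3", "site-c", "G"),
]
consensus = majority_labels(votes)
quality = worker_quality(votes, consensus)
print(consensus)
print(quality)
```

In this toy data w3 agrees with the majority only on the one site that really is "G", so its score drops to one third. Note that an honest but biased worker can score low for the same reason, which is the difficulty the slide points out.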

Complex tasks
 Handle answers through workflow
• Q: “My task does not have discrete answers….”
• A: Break into two Human Intelligence Tasks (HITs):
   – “Create” HIT
   – “Vote” HIT

• Voting controls the quality of the “Create” HIT
• Redundancy controls the quality of the “Vote” HIT
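The Create/Vote split can be sketched as follows. This is an illustration only: plain Python callables stand in for workers completing HITs, and the function name, the `min_votes` parameter, and the sample answers are assumptions, not part of any Mechanical Turk API.

```python
from collections import Counter


def run_create_vote(create_hits, vote_hits, min_votes=3):
    """Pick the best free-form answer with a Create step and a Vote step.

    create_hits: callables, each returning one candidate answer
                 (stand-ins for workers on a "Create" HIT).
    vote_hits:   callables, each taking the candidate list and returning
                 the index of the preferred candidate
                 (stand-ins for workers on a "Vote" HIT).
    """
    if len(vote_hits) < min_votes:
        raise ValueError("need redundant votes to control vote quality")
    candidates = [create() for create in create_hits]
    tally = Counter(vote(candidates) for vote in vote_hits)
    winner_index, winner_votes = tally.most_common(1)[0]
    # Return the winning answer and the share of votes it received.
    return candidates[winner_index], winner_votes / len(vote_hits)


# Hypothetical workers, hard-coded for illustration.
creators = [
    lambda: "A calculator.",
    lambda: "A calculator, a pen and some coins on a desk.",
]
voters = [
    lambda c: max(range(len(c)), key=lambda i: len(c[i])),  # prefers detail
    lambda c: max(range(len(c)), key=lambda i: len(c[i])),  # prefers detail
    lambda c: 0,                                            # careless voter
]
best, support = run_create_vote(creators, voters)
print(best, support)
```

The careless voter is outvoted, which is the sense in which redundancy protects the Vote step while voting protects the Create step.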



Collaboration: Photo description
  But the free-form
  answer can be more
  complex, not just right or
  wrong…




TurKit toolkit [Little et al., UIST 2010]: http://groups.csail.mit.edu/uid/turkit/

Collaboration: Description Versions
1. A partial view of a pocket calculator
   together with some coins and a pen.
2. ...
3. A close‐up photograph of the following
   items: A CASIO multi‐function
   calculator. A ball point pen, uncapped.
   Various coins, apparently European,
   both copper and gold. Seems to be a
   theme illustration for a brochure or
   document cover treating finance,
   probably personal finance.
4. …
8. A close‐up photograph of the following items: A CASIO
   multi‐function, solar powered scientific calculator. A blue ball
   point pen with a blue rubber grip and the tip extended. Six
   British coins; two of £1 value, three of 20p value and one of 1p
   value. Seems to be a theme illustration for a brochure or
   document cover treating finance ‐ probably personal finance.
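Successive versions like these come from repeated improve-then-vote rounds. The loop below is a minimal sketch of that idea in plain Python; it does not use the TurKit API, and the function name, the majority rule, and the sample workers are assumptions made for illustration.

```python
def iterate_description(initial, improvers, prefers_new):
    """Refine a text through successive improve-then-vote rounds.

    initial:     starting description.
    improvers:   callables; each takes the current text and returns a
                 proposed revision (an "Improve" task).
    prefers_new: callable (old, new) -> list of booleans, one per voter,
                 True when that voter prefers the revision (a "Vote" task).
    """
    current = initial
    history = [current]
    for improve in improvers:
        proposal = improve(current)
        ballots = prefers_new(current, proposal)
        # Keep the revision only if a strict majority of voters prefer it.
        if sum(ballots) * 2 > len(ballots):
            current = proposal
            history.append(current)
    return current, history


# Hypothetical workers: two add detail, one wipes the text.
improvers = [
    lambda text: text + " A CASIO calculator.",
    lambda text: "",
    lambda text: text + " A ball point pen.",
]


def three_voters(old, new):
    """Three simulated voters who all prefer the longer description."""
    return [len(new) > len(old)] * 3


final, history = iterate_description("A close-up photograph.", improvers, three_voters)
print(final)
```

The destructive edit is voted down, so each accepted version builds on the previous one.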
Collaboration

• Exploration / exploitation tradeoff
  (workers independent of each other, or not)
  – Sharing good solutions can accelerate learning
  – But it can lead to premature convergence on a
    suboptimal solution

[Mason and Watts, submitted to Science, 2011]


Collaboration: Positive
• Building iteratively allows better outcomes
  for the image description task.
• In the Foldit puzzles, workers built on each
  other’s results. In 10 days they recently found
  the molecular structure of a protein-cutting
  enzyme from an AIDS-like virus.




Collaboration: Negative
Groupthink Effect
• Individual search strategies affect group success:
  players who copy each other explore less
  → lower probability of finding the peak in a round




Workflow Patterns
Creation:
• Generate / Create
• Find
• Improve / Edit / Fix

Quality Control:
• Vote for accept-reject
• Vote up, vote down, to generate rank
• Vote for best / select top-k

Flow Control:
• Split task
• Aggregate
• Iterate

AdSafe Crowdsourcing Experience




AdSafe Crowdsourcing Experience
• Detect pages that discuss swine flu
  – A pharmaceutical firm had a drug “treating” (off-label) swine flu
  – The FDA prohibited pharmaceutical firms from displaying drug ads
    on pages about swine flu
  → Two days to comply!

• A big fast-food chain does not want its ad to appear:
  – On pages that discuss the brand (99% negative sentiment)
  – On pages discussing obesity




AdSafe Crowdsourcing Experience
     Workflow to classify URLs
• Find URLs for a given topic (hate speech, gambling, alcohol
  abuse, guns, bombs, celebrity gossip, etc.)
  http://url-collector.appspot.com/allTopics.jsp

• Classify URLs into appropriate categories
  http://url-annotator.appspot.com/AdminFiles/Categories.jsp

• Measure the quality of the labelers and remove spammers
  http://qmturk.appspot.com/

• Get humans to “beat” the classifier by providing cases where
  the classifier fails
  http://adsafe-beatthemachine.appspot.com/

Crowdsourcing Aggregators
Act as Portals
• Create a crowd or community
• Create a site to connect a client to the crowd
• Deal with the workflow of complex tasks, such as
  decomposition into simpler tasks and recomposition
  of the answers
• Work as broker, bank, and mediator

→ Allow anonymity
→ Consumers can benefit from a crowd without
  the need to create it

Market Design:
Crude vs Intelligent Crowdsourcing
• Intelligent crowdsourcing uses an
  organized workflow to tackle the cons of
  crude crowdsourcing:

  – Experts divide the complex task
  – Subtasks are given to relevant crowds, not to
    everyone
  – Experts recompose the individual answers
    into a general answer

Lack of Reputation and Market for Lemons
“When the quality of a good is uncertain and hidden before the
  transaction, the price goes to the value of the lowest-valued good”
  [Akerlof, 1970; Nobel prize winner]

• Market evolution steps:
  1. Employer pays $10 to a good worker, $0.10 to a bad worker
  2. 50% good workers, 50% bad; indistinguishable from
     each other
  3. Employer offers a price in the middle: $5
  4. Some good workers leave the market (pay too low)
  5. Employer revises the price downwards as the share of bad
     workers increases
  6. More good workers leave the market… death spiral
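The steps above can be turned into a small simulation. This is a sketch under stated assumptions: the employer offers the expected value of a randomly drawn worker, and a fixed fraction of the remaining good workers (`leave_rate`, a made-up parameter) leaves whenever the offer is below what they are worth.

```python
def lemons_spiral(good_value=10.0, bad_value=0.1, good_share=0.5,
                  leave_rate=0.5, rounds=6):
    """Simulate the employer's offered price as good workers exit.

    Each round the employer cannot tell workers apart and offers the
    expected value of a random worker. While that offer is below a good
    worker's value, a fraction `leave_rate` of good workers leaves.
    """
    good, bad = good_share, 1.0 - good_share
    prices = []
    for _ in range(rounds):
        share_good = good / (good + bad)
        price = share_good * good_value + (1 - share_good) * bad_value
        prices.append(round(price, 2))
        if price < good_value:
            good *= 1 - leave_rate
    return prices


prices = lemons_spiral()
print(prices)
```

With these assumed parameters the offer starts at $5.05 (the slide rounds this to $5) and keeps falling toward the $0.10 that a bad worker is worth.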


http://en.wikipedia.org/wiki/The_Market_for_Lemons

Reputation systems
• Challenges:
  - Insufficient participation
  - Overwhelmingly positive feedback
    + Hoping to get a positive ranking in return
    + Negative feedback avoided for fear of retaliation
  - Dishonest reports
    + « Riddle for a PENNY! No shipping-Positive Feedback »
    + « Bad-mouth » reports
• Incentive mechanisms to get honest feedback
  - Pay the rater if their report matches the next rater's report
  - Delay the next transaction over time
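One possible reading of the first incentive mechanism is sketched below: a rater earns a bonus only when their report agrees with the report of the next rater. This is a simplified agreement rule written for illustration; the function name, the bonus amount, and the sample reports are assumptions, and rater IDs are assumed to be unique.

```python
def peer_match_payments(reports, bonus=1.0):
    """Pay each rater `bonus` if their report equals the next rater's.

    reports: list of (rater_id, rating) in submission order.
    The last rater has no successor yet, so their payment is pending (None).
    """
    payments = {}
    for (rater, rating), (_next_rater, next_rating) in zip(reports, reports[1:]):
        payments[rater] = bonus if rating == next_rating else 0.0
    if reports:
        payments[reports[-1][0]] = None
    return payments


# Hypothetical feedback on one seller, in the order it arrived.
reports = [
    ("r1", "positive"),
    ("r2", "positive"),
    ("r3", "negative"),
    ("r4", "negative"),
]
payments = peer_match_payments(reports)
print(payments)
```

Here r1 and r3 are paid because the following report matches theirs, r2 is not, and r4 waits for the next report. Since a rater cannot know what the next rater will say, reporting what they actually observed is their best guess at a match.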
Reputation systems
• “Cheap pseudonyms”: it is easy to disappear and
  reregister under a new identity at almost no cost
  [Friedman and Resnick 2001]
  → This introduces opportunities to misbehave without
    paying reputational consequences
  → Increase the difficulty of online identity changes
  → Impose upfront costs on new entrants: allow new
    identities (forget the past) but make them costly

• 2-sided Reputation Mechanisms
  – Crowd: To ensure worker quality
  – Employer: To ensure their trustworthiness

Economic Shift
• From Social Networking to Social Production
  through Collaborative Innovation

  → Mass Collaboration changes how Products &
    Services are Designed, Manufactured, and Marketed

• Classical geopolitical and economic organisations
  do not correspond to the new economy

  → Realignment of competitive advantages
  → Move towards Collaborative Enterprises based
    on Open Infrastructure

Societal Shift
          Moral values Reinforcement
•   Open data access makes actions Transparent
•   Transparency makes people Accountable
•   Accountability forces/fosters Integrity
•   Integrity breeds Community Support

→ Link between Ethical values and ROI


References
• Wikipedia, 2011.
• Dion Hinchcliffe. Crowdsourcing: 5 Reasons It's Not Just For Start Ups Anymore, 2009.
• Tomoko A. Hosaka, MSNBC. "Facebook asks users to translate for free", 2008.
• Daren C. Brabham. "Moving the Crowd at iStockphoto: The Composition of the Crowd and Motivations for Participation in a Crowdsourcing Application", First Monday, 13(6), 2008.
• Karim R. Lakhani, Lars Bo Jeppesen, Peter A. Lohse & Jill A. Panetta. The value of openness in scientific problem solving (Harvard Business School Working Paper No. 07-050), 2007.
• Klaus-Peter Speidel. How to Do Intelligent Crowdsourcing, 2011.
• Panos Ipeirotis. Managing Crowdsourced Human Computation, WWW2011 tutorial, 2011.
• Omar Alonso & Matthew Lease. Crowdsourcing 101: Putting the WSDM of Crowds to Work for You, WSDM Hong Kong, 2011.
• Sanjoy Dasgupta. http://videolectures.net/icml09_dasgupta_langford_actl/, 2009.
• Don Tapscott, Anthony Williams. Macrowikinomics, 2010.

Call For Ideas:

If you have a large set of examples,
or just an idea for an application
of a program that classifies or predicts,
I would love to hear from you!

Questions?
corina@waterloohills.com

Crowdsourcing PWI Sept-2011

  • 1. • Introduction • Crowd Motivation • Client Motivations and Types of tasks • Scale up with Machine Learning • Quality Management • Workflows for Complex tasks • Reputation Systems • Economic shift PWI - September 29, 2011 corina@waterloohills.com http://bitsofknowledge.waterloohills.com http://bitsofknowledge.waterloohills.com
  • 2. Crowdsourcing Crowd or Community (online audience) 1 2 3 4 http://bitsofknowledge.waterloohills.com
  • 3. Ex: “Adult Websites” Classification • Large number of sites to label • Get people to look at sites and classify them as: –G (general audience) – PG (parental guidance) –R (restricted) –X (porn) [Panos Ipeirotis. WWW2011 tutorial] http://bitsofknowledge.waterloohills.com
  • 4. Ex: “Adult Websites” Classification • Large number of hand‐labeled sites • Get people to look at sites and classify them as: –G (general audience) – PG (parental guidance) –R (restricted) –X (porn) Cost/Speed Statistics: • Undergrad intern: 200 websites/hr, cost: $15/hr • MTurk: 2500 websites/hr, cost: $12/hr [Panos Ipeirotis. WWW2011 tutorial] http://bitsofknowledge.waterloohills.com
  • 5. Crowd Motivation • €,$ = Money! • Self-serving purpose (learning new skills, get recognition, avoid boredom, enjoyment, create a network with other profesionals) • Socializing, feeling of belonging to a community, friendship • Altruism (public good, help others) http://bitsofknowledge.waterloohills.com
  • 6. Examples: Altruism http://bitsofknowledge.waterloohills.com
  • 7. Crowd Demography (background defines motivation) • The 2008 survey at iStockphoto indicates that the crowd is quite homogenous and elite. • Amazon’s Mechanical Turk workers come mainly from 2 countries: a) USA b) India http://bitsofknowledge.waterloohills.com
  • 8. Crowd Demography http://bitsofknowledge.waterloohills.com
  • 9. Client motivation • Need Suppliers: Mass work, Distributed work, or just tedious work  Creative work  Look for specific talent  Testing  Support  To offload peak demands  Tackle problems that need specific communities or human variety  Any work that can be done cheaper this way. http://bitsofknowledge.waterloohills.com
  • 10. Client motivation • Need customers! • Need Funding • Need to be Backed up • Crowdsourcing is your business! http://bitsofknowledge.waterloohills.com
  • 11. Examples of Funding http://bitsofknowledge.waterloohills.com
  • 12. Client Tasks Goals 3 main goals for a task to be done: 1. Minimize Cost (cheap) 2. Minimize Completion Time (fast) 3. Maximize Quality (good)  Remember Crowd Motivation! (ex.: Game-ify your task, explain the final purpose) http://bitsofknowledge.waterloohills.com
  • 13. Examples: Games http://bitsofknowledge.waterloohills.com
  • 15. Pros • Quicker: Parallellism reduces time • Cheap • Creativity, Innovation • Quality (*depends) • Access to scarce resources: The ‘long tail’ • Multiple feedback • Allows to create a community (followers) • Business Agility • Scales up! (*up to a level) http://bitsofknowledge.waterloohills.com
  • 16. Cons • Lack of professionalism: Unverified quality • Too many answers • No standards • Not always cheap: Added costs to bring a project to conclusion • Too few participants if task or pay is not attractive • If worker is not motivated, lower quality of work http://bitsofknowledge.waterloohills.com
  • 17. Scale Up with Machine Learning Build an ‘Adult Website’ Classifier • Crowdsourcing is cheap but not free - Workers cannot do more than xxhours/day, Cannot scale to web without help Build automatic classification models using examples from crowdsourced data http://bitsofknowledge.waterloohills.com
  • 18. Integration with Machine Learning • Humans label training data • Use training data to build model http://bitsofknowledge.waterloohills.com
  • 19. Quality Management Ex: “Adult Website” Classification • Bad news: Spammers! • Worker ATAMRO447HWJQ labeled X (porn) sites as G (general audience) [Panos Ipeirotis. WWW2011 tutorial] http://bitsofknowledge.waterloohills.com
  • 20. Quality Management Majority Voting and Label Quality • Spammers try to go undetected • Good willing workers may have bias  difficult to set apart. 1. Ask multiple labelers 2. Keep majority label as “true” label Use the probability of being correct as the Quality Indicator http://bitsofknowledge.waterloohills.com
  • 21. Complex tasks Handle answers through workflow • Q: “My task does not have discrete answers….” • A: Break into two Human Intelligence Tasks (HITs): – “Create” HIT – “Vote” HIT Vote controls quality of Creation HIT • Redundancy controls quality of Voting HIT http://bitsofknowledge.waterloohills.com
  • 22. Collaboration: Photo description But the free-form answer can be more complex, not just right or wrong… TurkIt toolkit [Little et al., UIST 2010]: http://groups.csail.mit.edu/uid/turkit/ http://bitsofknowledge.waterloohills.com
  • 23. Collaboration: Description Versions 1. A partial view of a pocket calculator together with some coins and a pen. 2. ... 3. A close‐up photograph of the following items: A CASIO multi‐function calculator. A ball point pen, uncapped. Various coins, apparently European, both copper and gold. Seems to be a theme illustration for a brochure or document cover treating finance, probably personal finance. 4. … 8. A close‐up photograph of the following items: A CASIO multi‐function, solar powered scientific calculator. A blue ball point pen with a blue rubber grip and the tip extended. Six British coins; two of £1value, three of 20p value and one of 1p value. Seems to be a theme illustration for a brochure or document cover treating finance ‐ probably personal finance. http://bitsofknowledge.waterloohills.com
  • 24. Collaboration • Exploration / exploitation tradeoff (Independence/or not) – Can accelerate learning, by sharing good solutions – But can lead to premature convergence on suboptimal solution [Mason and Watts, submitted to Science, 2011]
  • 25. Collaboration: Positive • Building iteratively allows better outcomes for the image description task. • In the Foldit puzzles, players built on each other’s results. They recently found, in 10 days, the molecular structure of a protein-cutting enzyme from an AIDS-like virus.
  • 26. Collaboration: Negative (Groupthink Effect) • Individual search strategies affect group success: players who copy each other explore less → lower probability of finding the peak in a round
  • 27. Workflow Patterns • Creation: Generate / Create; Find; Improve / Edit / Fix • Quality Control: Vote for accept‐reject; Vote up, vote down, to generate rank; Vote for best / select top‐k • Flow Control: Split task; Aggregate; Iterate
  • 28. AdSafe Crowdsourcing Experience
  • 30. AdSafe Crowdsourcing Experience • Detect pages that discuss swine flu: – A pharmaceutical firm had a drug “treating” (off-label) swine flu – The FDA prohibited the firm from displaying the drug ad on pages about swine flu → two days to comply! • A big fast-food chain does not want its ad to appear: – On pages that discuss the brand (99% negative sentiment) – On pages discussing obesity
  • 31. AdSafe Crowdsourcing Experience: Workflow to classify URLs • Find URLs for a given topic (hate speech, gambling, alcohol abuse, guns, bombs, celebrity gossip, etc.) http://url‐collector.appspot.com/allTopics.jsp • Classify URLs into appropriate categories http://url‐annotator.appspot.com/AdminFiles/Categories.jsp • Measure the quality of the labelers and remove spammers http://qmturk.appspot.com/ • Get humans to “beat” the classifier by providing cases where the classifier fails http://adsafe‐beatthemachine.appspot.com/
  • 32. Crowdsourcing Aggregators Act as Portals • Create a crowd or community • Create a site to connect a client to the crowd • Deal with the workflow of complex tasks, such as decomposition into simpler tasks and recomposition of answers • Work as broker, bank, and mediator → allow anonymity → consumers can benefit from a crowd without needing to create it
  • 33. Market Design: Crude vs Intelligent Crowdsourcing • Intelligent Crowdsourcing uses an organized workflow to tackle the cons of crude crowdsourcing: → Complex task is divided by experts → Subtasks are given to relevant crowds, not to everyone → Individual answers are recomposed by experts into a general answer
  • 34. Lack of Reputation and Market for Lemons • “When quality of sold good is uncertain and hidden before transaction, price goes to value of lowest valued good” [Akerlof, 1970; Nobel prize winner] • Market evolution steps: 1. Employer pays $10 to a good worker, $0.1 to a bad worker 2. 50% good workers, 50% bad; indistinguishable from each other 3. Employer offers a price in the middle: $5 4. Some good workers leave the market (pay too low) 5. Employer revises prices downwards as the % of bad workers increases 6. More good workers leave the market… death spiral http://en.wikipedia.org/wiki/The_Market_for_Lemons
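The market evolution steps above can be sketched as a small simulation. The exit rule (one in five good workers leaves each round while the offer is below their value) is an assumption made here for illustration; the slide does not specify how many workers leave. The employer's offer is the expected value of a random worker, which is $5.05 at a 50/50 mix (the slide rounds this to $5).

```python
def lemons_spiral(good=50, bad=50, good_value=10.0, bad_value=0.1, rounds=5):
    """Return a list of (good workers remaining, offered price) per round."""
    history = []
    for _ in range(rounds):
        share_good = good / (good + bad)
        # Employer cannot tell workers apart, so it offers the average value.
        price = share_good * good_value + (1 - share_good) * bad_value
        history.append((good, round(price, 2)))
        if price < good_value:
            # Assumed exit rule: one in five underpaid good workers leaves.
            good -= good // 5
    return history

history = lemons_spiral()
prices = [price for _, price in history]
```

Each round the share of good workers shrinks, so the offer drops again, which pushes more good workers out: the death spiral of step 6. A reputation system breaks the loop by making good and bad workers distinguishable.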
  • 35. Reputation systems • Challenges: - Insufficient participation - Overwhelmingly positive feedback + Hoping to get a positive ranking in return + Negative feedback avoided for fear of retaliation - Dishonest reports + “Riddle for a PENNY! No shipping-Positive Feedback” + “Bad-mouth” reports • Incentive mechanisms to get honest feedback: - Pay rater if report matches the next one - Delay next transaction over time
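A minimal sketch of the "pay rater if report matches next" incentive, assuming reports for one seller arrive in order and each rater is rewarded when the following rater gives the same rating. This simple agreement rule is an assumption for illustration; the function name and reward value are hypothetical.

```python
def agreement_payments(reports, reward=1.0):
    """reports: ratings for one seller, in order of arrival.

    Rater i is paid `reward` if report i equals report i + 1, else 0.0.
    The last rater has no successor yet, so the payment is pending (None).
    """
    payments = [reward if current == following else 0.0
                for current, following in zip(reports, reports[1:])]
    if reports:
        payments.append(None)
    return payments

payments = agreement_payments(["good", "good", "bad", "good"])
```

Because a rater cannot know what the next rater will say, the best guess is usually to report what was actually experienced, which is the intuition behind paying for agreement.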
  • 36. Reputation systems • “Cheap pseudonyms”: easy to disappear and re-register under a new identity at almost no cost [Friedman and Resnick 2001] → Introduce opportunities to misbehave without paying reputational consequences • Remedies: – Increase the difficulty of online identity changes – Impose upfront costs on new entrants: allow new identities (forget the past) but make them costly • 2-sided Reputation Mechanisms – Crowd: to ensure worker quality – Employer: to ensure their trustworthiness
  • 37. Economic Shift • From Social Networking to Social Production through Collaborative Innovation → Mass collaboration changes how products and services are designed, manufactured, and marketed • Classical geopolitical and economic organisations do not correspond to the new economy → Realignment of competitive advantages → Move towards Collaborative Enterprises based on Open Infrastructure
  • 38. Societal Shift: Moral Values Reinforcement • Open data access makes actions Transparent • Transparency makes people Accountable • Accountability forces/fosters Integrity • Integrity breeds Community Support → Link between Ethical values and ROI
  • 39. References • Wikipedia, 2011. • Dion Hinchcliffe. Crowdsourcing: 5 Reasons Its Not Just For Start Ups Anymore, 2009. • Tomoko A. Hosaka, MSNBC. “Facebook asks users to translate for free”, 2008. • Daren C. Brabham. “Moving the Crowd at iStockphoto: The Composition of the Crowd and Motivations for Participation in a Crowdsourcing Application”, First Monday, 13(6), 2008. • Karim R. Lakhani, Lars Bo Jeppesen, Peter A. Lohse & Jill A. Panetta. The value of openness in scientific problem solving (Harvard Business School Working Paper No. 07-050), 2007. • Klaus-Peter Speidel. How to Do Intelligent Crowdsourcing, 2011. • Panos Ipeirotis. Managing Crowdsourced Human Computation, WWW2011 tutorial, 2011. • Omar Alonso & Matthew Lease. Crowdsourcing 101: Putting the WSDM of Crowds to Work for You, WSDM Hong Kong, 2011. • Sanjoy Dasgupta. http://videolectures.net/icml09_dasgupta_langford_actl/, 2009. • Don Tapscott, Anthony Williams. Macrowikinomics, 2010.
  • 40. Call For Ideas: If you have a large set of examples, or just an idea for an application of a program that classifies or predicts, I would love to hear from you! Questions? corina@waterloohills.com