Talking to the crowd in 7,000 languages
Robert Munro
Idibon
Crowdsourcing
Information is increasing
• Scale (well-known)
• Diversity (less understood)
– On a given day, what is the average number of
languages that someone could potentially hear?
– How has this changed?
Outline
[Figure: Daily potential language exposure (x-axis: Year; y-axis: # of languages)]
Putting a phone in the hands of everyone on the planet is the easy part
Understanding everyone is going to be more complicated
99% of languages don’t have machine-translation
or similar services:
• Disproportionately lower healthcare & education
• Disproportionately greater exposure to disasters
Crowdsourcing can bridge part of the gap.
Diversity
GRAPH OF DEPLOYMENTS
Crowdsourcing
Crowdsourced processing of information in Haitian Kreyol.
1000s of Haitians in Haiti and among the diaspora.
Haiti – Mission 4636
Apo
Dalila
Haiti
(18.4957, -72.3185)
“I need Thomassin Apo please”
“Kenscoff Route: Lat: 18.4957, Long:-72.3185”
“This Area after Petion-Ville and Pelerin 5 is
not on Google Map. We have no streets
name”
Lopital Sacre-Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la.
“Sacre-Coeur Hospital which located in this village of Okap is ready to receive those who are injured. Therefore, we are asking those who are sick to report to that hospital.”
Evaluating local knowledge
Lopital Sacre-Coeur ki nan vil Okap, pre pou li resevwa moun malad e lap mande pou moun ki malad yo ale la.
Haitians (volunteers and paid)
Non-Haitians
3,000 messages
< 5 minutes each
> 4 hours each
45,000 messages
Haiti – Mission 4636
Lessons learned
• Default to private data practices
(Majority decision was not to use a public map)
• Find volunteers through strong social ties
(10x larger/faster than the publicized efforts)
• Avoid activists (‘bloggers’, ‘crisis-mappers’ …)
• Localize to the crisis-affected community
(25% of work was by paid workers in Haiti)
Haiti – Mission 4636
Paid workers in Mirebalais, Haiti (FATEM)
Benchmarks we can use:*
$ 0.25 per translation
$ 0.20 per geolocation
$ 0.05 per categorization / filtering
4:00 minutes per report processed
Can volunteerism undercut this cost?
* Munro. 2012. Crowdsourcing and crisis-affected community: lessons learned and looking
forward from Mission 4636. Journal of Information Retrieval
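The benchmarks above can be turned into a rough estimate for a batch of reports. This is only an illustrative sketch: it assumes each report passes through every listed step exactly once and that the 4:00 minutes covers the whole report. The names `RATES_CENTS` and `estimate` are hypothetical.

```python
# Benchmarks from the slide, in US cents per task (integers avoid float rounding).
RATES_CENTS = {"translation": 25, "geolocation": 20, "categorization": 5}
MINUTES_PER_REPORT = 4  # "4:00 minutes per report processed"


def estimate(num_reports, steps=("translation", "geolocation", "categorization")):
    """Return (cost in dollars, worker-hours) for paid processing.

    Assumption: every report goes through each step in `steps` exactly once.
    """
    cents = num_reports * sum(RATES_CENTS[s] for s in steps)
    hours = num_reports * MINUTES_PER_REPORT / 60
    return cents / 100, hours


cost, hours = estimate(1000)
print(f"1,000 reports: ${cost:.2f}, {hours:.1f} worker-hours")
```

Any volunteer effort whose coordination overhead exceeds these figures would not undercut paid processing.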
Data-structuring for 2010 floods in Pakistan
Pakreport
*Chohan, Hester and Munro. 2012. Pakreport: Crowdsourcing for Multipurpose and Multicategory
Climate-related Disaster Reporting. Climate Change, Innovation & ICTs Project. CDI
Multiple inexperienced people are more accurate than one experienced person.*
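One way to see why this can hold is a simple majority-vote model. The sketch below assumes workers err independently and share the same accuracy, which real crowds only approximate. The accuracy values are illustrative and are not figures from the Pakreport study.

```python
from math import comb


def majority_accuracy(n_workers, p_correct):
    """Probability that a strict majority of n independent workers is correct.

    Assumes an odd n (no ties), a binary judgement, and identical,
    independent per-worker accuracy p_correct.
    """
    need = n_workers // 2 + 1
    return sum(
        comb(n_workers, k) * p_correct**k * (1 - p_correct) ** (n_workers - k)
        for k in range(need, n_workers + 1)
    )


# Illustrative values only: five workers at 75% vs. one expert at 85%.
print(round(majority_accuracy(5, 0.75), 3))
print(majority_accuracy(5, 0.75) > 0.85)
```

Under these assumptions the crowd of five outperforms the single expert, but the advantage shrinks if workers tend to make the same mistakes.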
Pakreport
Lessons learned
• Default to private data practices (!)
(Taliban threatened to attack mapped aid workers)
• Cross-validate tasks across multiple workers
(We used CrowdFlower, as with Mission 4636)
• Localize to the crisis-affected community
(Data obtained by hand / created jobs)
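Cross-validating a task across workers can be sketched as a majority vote with an agreement threshold. This is a generic illustration, not CrowdFlower's API and not necessarily the aggregation used for Pakreport. The function `aggregate` and the 0.6 threshold are hypothetical.

```python
from collections import Counter


def aggregate(labels, min_agreement=0.6):
    """Return (label, agreement) for one task, or (None, agreement) if
    the workers agree too little and the task should be re-issued."""
    if not labels:
        return None, 0.0
    label, votes = Counter(labels).most_common(1)[0]
    agreement = votes / len(labels)
    return (label if agreement >= min_agreement else None), agreement


print(aggregate(["flood", "flood", "health"]))
print(aggregate(["flood", "health", "shelter"]))
```

Tasks that come back with `None` can be sent to additional workers until the threshold is met.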
Scaling beyond purely manual processing.
Disease outbreaks are the world’s single greatest killer.
No organization is tracking them all.
Epidemics
Diseases eradicated in the last 75 years: smallpox
Increase in air travel in the last 75 years:
90% of ecological diversity
90% of linguistic diversity
Reported locally before identification
• H1N1 (Swine Flu): months (10% of world infected)
• HIV: decades (35 million infected)
• H5N1 (Bird Flu): weeks (>50% fatal)
Simply finding these early reports can help prevent epidemics.
epidemicIQ
Machine-learning (millions)
Reports (millions)
Microtaskers (thousands)
Analysts – domain experts (capped number)
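в предстоящий осенне-зимний период в Украине ожидаются две эпидемии гриппа
(Russian: “Two influenza epidemics are expected in Ukraine in the coming autumn-winter period”)
مزيد من انفلونزا الطيور في مصر
(Arabic: “More bird flu in Egypt”)
香港现1例H5N1禽流感病例曾游上海南京等地
(Chinese: “Hong Kong reports one H5N1 avian flu case; the patient had traveled to Shanghai, Nanjing and other places”)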
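The funnel on this slide can be sketched as confidence-based triage: an automated scorer handles the bulk of reports, confident detections go to analysts up to their cap, and uncertain ones go to microtaskers. The scorer, thresholds, and function names below are hypothetical stand-ins and do not describe epidemicIQ's actual models.

```python
def route(reports, score, low=0.2, high=0.8, analyst_cap=2):
    """Split reports into discard / microtask / analyst queues.

    `score` is any callable returning a relevance estimate in [0, 1];
    in practice this would be a trained classifier.
    """
    queues = {"discard": [], "microtask": [], "analyst": []}
    for report in reports:
        s = score(report)
        if s < low:
            queues["discard"].append(report)
        elif s >= high and len(queues["analyst"]) < analyst_cap:
            queues["analyst"].append(report)
        else:
            # Uncertain, or analysts are at capacity: ask the crowd.
            queues["microtask"].append(report)
    return queues


# Toy scorer (assumption): keyword matching stands in for machine learning.
def toy_score(text):
    text = text.lower()
    if "h5n1" in text or "outbreak" in text:
        return 0.9
    return 0.5 if "flu" in text else 0.0


reports = ["H5N1 case in Hong Kong", "flu season expected", "football results"]
print(route(reports, toy_score))
```

The cap on the analyst queue reflects the slide's point that domain experts are the scarce resource, so overflow falls back to the crowd rather than being dropped.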
E. coli in Germany
The AI head-start
epidemicIQ
Lessons learned
• Current data privacy practices are insufficient
(reports from areas where victims are vilified)
• Crowdsourcing can provide needed skill-sets
(100s of German speakers at short notice)
• Natural language processing can scale beyond human processing capacity
Libya Crisis Map
A negative example
• 2,283 reports, all from already-open English-language sources
• 1 month of full-time management and contributions from >100 volunteers
Libya Crisis Map
Equivalent cost from paid workers
• $575.75
(or about $800 with multiple steps)
Equivalent time cost from Libyan nationals:
• 152.2 hours = less than 1 month for 1 person
(would also address some security concerns)
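The time figure follows from applying the Mission 4636 benchmark of 4 minutes per report to the 2283 reports on the map. The sketch below reproduces it. The cost line assumes that English-language reports need only geolocation and categorization, and the full-time month of 160 hours is likewise an assumption. The slide's $575.75 may rest on a slightly different mix of steps.

```python
REPORTS = 2283            # reports on the Libya Crisis Map (from the slide)
MINUTES_PER_REPORT = 4    # Mission 4636 benchmark for paid workers

hours = REPORTS * MINUTES_PER_REPORT / 60
print(f"{hours:.1f} hours")  # 152.2 hours, matching the slide

# Assumption: a full-time month is 4 weeks of 40 hours.
FULL_TIME_MONTH_HOURS = 4 * 40
print(hours < FULL_TIME_MONTH_HOURS)

# Assumption: English-language reports need geolocation ($0.20) and
# categorization ($0.05) but no translation; amounts kept in cents.
cost_dollars = REPORTS * (20 + 5) / 100
print(f"${cost_dollars:.2f}")
```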
Libya Crisis Map
Lessons learned
• Crowdsourced volunteers were not required
(cost more to run than was saved by not paying)
(a single in-house Libyan could have achieved more)
• Default to private data practices
(assume all identities of volunteers were exposed)
(Libyans opposed the public map)
Crowdsourcing and risk
People’s real-time locations are their most sensitive personal information.
Crowdsourcing distributes information to a large number of individuals for processing.
For information about at-risk individuals:
• Is it right to crowdsource the processing?
• Is it right to use a public-facing map?
Recommendations
• Engage people with local knowledge
• Employ people with local knowledge
• Statistically cross-validate on-the-fly
• Default to private data practices
• Scale via natural language processing
Conclusions
Thank you
Robert Munro
Idibon
@WWRob
Crowdsourcing
