HUMAN COMPUTATION IN THE LINKED DATA MANAGEMENT LIFE CYCLE
ELENA SIMPERL
UNIVERSITY OF SOUTHAMPTON
7/18/2013
1st PRELIDA workshop
HUMAN COMPUTATION
Outsourcing to humans the tasks that machines find difficult to solve (accuracy, efficiency, costs)
SEMANTIC TECHNOLOGIES ARE ALL ABOUT AUTOMATION
…but many tasks rely on human input
• Modeling a domain
• Integrating data sources originating from different contexts
• Producing semantic markup for various types of digital artifacts
• ...
DIMENSIONS OF HUMAN COMPUTATION SYSTEMS
What
• Tasks that require basic human skills
How
• Distribution
• Coordination
• Aggregation
Quality
• Closed vs open answers
• Ground truth
• Quantitative vs qualitative
• Who is the evaluator?
Optimize!
• Incentives
• Reduce problem size
• Task assignment
GAMES WITH A PURPOSE (GWAP)
Human computation disguised as casual games
Tasks are divided into parallelizable atomic units (challenges) solved (consensually) by players
Game models
• Single vs. multi-player
• Selection agreement vs. input agreement vs. inversion-problem games
MICROTASK CROWDSOURCING
Similar types of tasks, but a different incentive model (monetary reward, PPP)
Successfully applied to transcription, classification, content generation, data collection, image tagging, website feedback, usability tests…
THE SAME, BUT DIFFERENT
• Tasks leveraging common human skills, appealing to large audiences
• Selection of domain and task more constrained in games to create typical UX
• Tasks decomposed into smaller units of work to be solved independently
• Complex workflows
• Creating a casual game experience vs. patterns in microtasks
• Quality assurance
• Synchronous interaction in games
• Levels of difficulty and near-real-time feedback in games
• Many methods applied in both cases (redundancy, votes, statistical techniques)
• Different set of incentives and motivators
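
The redundancy-and-voting idea named in the list above can be sketched in a few lines of Python. This is only an illustration, not the method of any system in the talk: the function name `majority_vote`, the `min_agreement` threshold, and the sample answers are all assumptions made for the example.

```python
from collections import Counter

def majority_vote(answers, min_agreement=0.5):
    """Return the most frequent answer if its share exceeds min_agreement, else None."""
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count / len(answers) > min_agreement else None

# Hypothetical data: each task is answered by several workers or players (redundancy).
responses = {
    "task-1": ["yes", "yes", "no"],
    "task-2": ["no", "yes"],
}

# Tasks without a clear majority map to None and could be re-issued to more contributors.
aggregated = {task: majority_vote(votes) for task, votes in responses.items()}
print(aggregated)
```

Returning `None` for ties keeps undecided tasks visible instead of silently picking a side; a statistical technique that weights contributors by reliability could replace the plain count.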
HYBRID SYSTEMS
Physical world (people and devices)
Design and composition
Participation and data supply
Model of social interaction
Virtual world (network of social interactions)
Dave Robertson
EXAMPLE: HYBRID DATA INTEGRATION

paper            | conf
Data integration | VLDB-01
Data mining      | SIGMOD-02

title        | author | email
OLAP         | Mike   | mike@a
Social media | Jane   | jane@b

Generate plausible matches
– paper = title, paper = author, paper = email, paper = venue
– conf = title, conf = author, conf = email, conf = venue
Ask users to verify

paper            | conf
Data integration | VLDB-01
Data mining      | SIGMOD-02

title        | author | email  | venue
OLAP         | Mike   | mike@a | ICDE-02
Social media | Jane   | jane@b | PODS-05

Does attribute paper match attribute author?
Yes / No / Not sure
[McCann, Shen, Doan, ICDE 2008]
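
The "generate plausible matches, then ask users to verify" steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the system described by McCann, Shen and Doan: the helper names and the `answers` dictionary are hypothetical, and every attribute pair is treated as a candidate.

```python
from itertools import product

# Attribute names taken from the two example tables on the slide.
source_attrs = ["paper", "conf"]
target_attrs = ["title", "author", "email", "venue"]

def candidate_matches(source, target):
    """Enumerate every source/target attribute pair as a plausible match."""
    return list(product(source, target))

def verification_question(src, tgt):
    """Phrase one candidate match as a question for a user."""
    return f"Does attribute {src} match attribute {tgt}?"

def confirmed_matches(candidates, answers):
    """Keep only the candidates that users answered with 'yes'."""
    return [pair for pair in candidates if answers.get(pair) == "yes"]

candidates = candidate_matches(source_attrs, target_attrs)

# Hypothetical user answers, invented for this example.
answers = {
    ("paper", "title"): "yes",
    ("paper", "author"): "no",
    ("conf", "venue"): "yes",
}

print(verification_question(*candidates[1]))
print(confirmed_matches(candidates, answers))
```

In practice an automatic matcher would prune the candidate list before any question reaches a user, which keeps the number of verification tasks manageable.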
EXAMPLES FROM THE LINKED DATA WORLD
WHAT IS DIFFERENT ABOUT SEMANTIC SYSTEMS?
Semantic Web tools vs. applications
• Intelligent (specialized) Web sites (portals) with improved (local) search based on vocabularies and ontologies
• X2X integration (often combined with Web services)
• Knowledge representation, communication and exchange
TASKS NAMED IN METHODOLOGIES ARE TOO HIGH-LEVEL
Crowdsource very specific tasks that are (highly) divisible
• Labeling (in different languages)
• Finding relationships
• Populating the ontology
• Aligning and interlinking
• Ontology-based annotation
• Validating the results of automatic methods
• …
Think about the context of the application (social structure) and about how to hide tasks behind existing practices and tools
Tutorial@ESWC2013
TASTE IT! TRY IT!
• Restaurant review Android app developed in the Insemtives project
• Uses DBpedia concepts to generate structured reviews
• Uses mechanism design/gamification to configure incentives
• User study
  • 2274 reviews by 180 reviewers referring to 900 restaurants, using 5667 DBpedia concepts
https://play.google.com/store/apps/details?id=insemtives.android&hl=en
[Chart: number of reviews, number of semantic annotations (type of cuisine), and number of semantic annotations (dishes) per venue type (CAFE, FASTFOOD, PUB, RESTAURANT); vertical axis from 0 to 2500]
LODREFINE
http://research.zemanta.com/crowds-to-the-rescue/
DBPEDIA CURATION
http://aksw.org/Projects/TripleCheckMate.html
CROWDMAP
Experiments using MTurk, CrowdFlower and established benchmarks
Enhancing the results of automatic techniques
Fast, accurate, cost-effective [Sarasua, Simperl, Noy, ISWC2012]
Experiment          | PRECISION | RECALL
CartP 301-304       | 0.53      | 1.0
100R50P Edas-Iasted | 0.8       | 0.42
100R50P Ekaw-Iasted | 1.0       | 0.7
100R50P Cmt-Ekaw    | 1.0       | 0.75
100R50P ConfOf-Ekaw | 0.93      | 0.65
Imp 301-304         | 0.73      | 1.0
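
To compare the experiments on a single number, precision and recall are often combined into the balanced F-measure, F1 = 2PR / (P + R). The slide itself does not report an F-measure, so the sketch below is an added illustration. It assumes the six precision and recall values line up, in order, with the six experiment labels.

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs as listed above; the pairing with labels is an assumption.
results = {
    "CartP 301-304": (0.53, 1.0),
    "100R50P Edas-Iasted": (0.8, 0.42),
    "100R50P Ekaw-Iasted": (1.0, 0.7),
    "100R50P Cmt-Ekaw": (1.0, 0.75),
    "100R50P ConfOf-Ekaw": (0.93, 0.65),
    "Imp 301-304": (0.73, 1.0),
}

for name, (p, r) in results.items():
    print(f"{name}: F1 = {f_measure(p, r):.2f}")
```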
ONTOLOGY POPULATION
LINKED DATA CURATION
PROBLEMS AND CHALLENGES
• What is feasible and how can tasks be optimally translated into microtasks?
  • Examples: data quality assessment for technical and contextual features; subjective vs objective tasks (also in modeling); open-ended questions
• What to show to users
  • Natural language descriptions of Linked Data/SPARQL
  • How much context
  • What form of rendering
  • How about links?
• How to combine with automatic tools
  • Which results to validate
  • Low precision (no fun for gamers...)
  • Low recall (vs all possible questions)
• How to embed it into an existing application
  • Tasks are fine-grained and perceived as an additional burden on top of the actual functionality
• What to do with the resulting data?
  • Integration into existing practices
  • Vocabularies!
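
One of the questions above, how to describe Linked Data in natural language for contributors, can be illustrated with a small sketch that turns a single triple into a yes/no microtask. The question template, the helper names, and the sample triple are assumptions made for this example; real labels would normally come from `rdfs:label` values rather than from URIs.

```python
def local_name(uri):
    """Take the fragment or last path segment of a URI and make it readable."""
    tail = uri.rstrip("/").rsplit("#", 1)[-1].rsplit("/", 1)[-1]
    return tail.replace("_", " ")

def triple_to_question(subject, predicate, obj):
    """Render one triple as a yes/no verification question (hypothetical template)."""
    return (
        f'Is "{local_name(obj)}" a correct value of '
        f'"{local_name(predicate)}" for "{local_name(subject)}"?'
    )

# Illustrative triple in the style of DBpedia, chosen only for this example.
triple = (
    "http://dbpedia.org/resource/Southampton",
    "http://dbpedia.org/ontology/country",
    "http://dbpedia.org/resource/United_Kingdom",
)

print(triple_to_question(*triple))
```

How much surrounding context to show, and whether to expose the links themselves, remain open design choices that this template does not address.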
