www.ucomp.eu | www.chistera.eu @uCompEU
uComp Objectives
• Develop a generic and reusable Human
Computation (HC) framework
• Address challenges of noisy data
• Embed human computation into
knowledge extraction workflows
• Factual Knowledge
• Affective Knowledge
• Evaluate EHC performance
(EHC = Embedded Human Computation)
Work Package Overview
System Architecture
Content Repository (WP1)
• Extensible Web Retrieval Toolkit (eWRT)
• Open Source Library
www.weblyzard.com/ewrt
• Media Watch on Climate Change
• English Version
• www.ecoresearch.net/climate
• News Media Articles: 1,275,000
• Social Media Postings: 20,000,000
• German Version
• www.ecoresearch.net/climate/de
• News Media Articles: 650,000
• Social Media Postings: 565,000
• French Version
• www.ecoresearch.net/climate/de
• News Media Articles: 720,000
• Social Media Postings: 410,000
HC Framework (WP2)
• Application Framework. Facilitate developing GWAPs
to engage users and generate valuable information.
• Mechanism. Players score if inputs match: (i) system-generated
values; (ii) real-time input from other players; (iii) stored
records from previous users.
• If a certain number of players agree, the task will be
assumed complete and taken out of the game
• Progress
• Cross-platform HTML5 application framework. Complete.
• Application Programming Interface (API). Complete.
• Integration of GWAPs with CrowdFlower. Complete.
• Support of Prediction Tasks. Complete.
• Framework for Social Logins. Complete.
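The scoring and task-retirement mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the uComp implementation: the class name `GwapTask`, the `agreement_threshold` parameter and the one-point-per-match scoring rule are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GwapTask:
    """Hypothetical task record; names and scoring rule are illustrative only."""
    system_value: Optional[str] = None   # (i) system-generated value, if any
    agreement_threshold: int = 3         # assumed number of agreeing players
    stored_answers: List[str] = field(default_factory=list)  # (iii) previous users
    completed: bool = False

    def submit(self, answer: str, live_answers: Optional[List[str]] = None) -> int:
        """Score one answer and retire the task once enough players agree."""
        live_answers = live_answers or []  # (ii) real-time input from other players
        score = 0
        if self.system_value is not None and answer == self.system_value:
            score += 1
        if answer in live_answers:
            score += 1
        if answer in self.stored_answers:
            score += 1
        self.stored_answers.append(answer)
        # Retire the task when any single answer reaches the threshold.
        top_count = Counter(self.stored_answers).most_common(1)[0][1]
        if top_count >= self.agreement_threshold:
            self.completed = True
        return score
```

A caller would create one `GwapTask` per question, call `submit` for each player input, and stop serving the task once `completed` is set.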
GWAP Use Case
Launch – 25 Mar 2015
www.twitter.com/uCompEU
HC + Text Mining (WP3)
• Open-source, released as part of GATE
gate.ac.uk/wiki/crowdsourcing.html
• Two types of tasks: (i) classification, e.g. entity/word
disambiguation and sentiment; (ii) sequence selection, e.g.
named entity annotation
• Tasks commissioned from the GATE Developer UI
• Automatic mapping from sentences to HC tasks
• Annotation provenance & contributor reliability tracked
• Collected data mapped back onto corpora and
documents automatically
• Several knowledge aggregation and corpus distribution
methods implemented (T3.3)
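One way to combine tracked contributor reliability with aggregation of classification judgements is a reliability-weighted vote. The sketch below is an assumption for illustration only: the slides do not say which aggregation methods the GATE plugin implements, and the function name `weighted_vote`, the reliability scores and the default weight are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


def weighted_vote(judgements: Dict[str, str],
                  reliability: Dict[str, float],
                  default_reliability: float = 0.5) -> Tuple[str, float]:
    """Return the winning label and its share of the total vote weight.

    judgements maps worker id -> chosen label; reliability maps
    worker id -> weight (hypothetical scores in [0, 1]).
    """
    if not judgements:
        raise ValueError("no judgements to aggregate")
    totals = defaultdict(float)
    for worker, label in judgements.items():
        totals[label] += reliability.get(worker, default_reliability)
    total_weight = sum(totals.values())
    if total_weight == 0:
        raise ValueError("all contributing workers have zero weight")
    # Sorting by label first makes ties resolve deterministically.
    label, weight = max(sorted(totals.items()), key=lambda item: item[1])
    return label, weight / total_weight
```

With weights like these, a single highly reliable worker can outvote two less reliable ones, which is the behaviour a plain majority vote cannot express.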
Crowdsourced NE Corpora
• One entity class per crowdsourcing task; better
than simultaneous annotation of entity types
www.ucomp.eu | www.chistera.eu @uCompEU
Result Aggregation
• Automatic adjudication/aggregation strategies
implemented
• Challenges encountered
• Worker agreement not always representative of quality
• Many entities are recognised by only a minority of
workers
• Regional knowledge is required: #mufc, the bulls
• Span mismatch: workers mark different boundaries for the same mention, e.g. within "King of England"
• Quality evaluation
• PER: P 68.7, R 56.2, F1 61.8
• LOC: P 15.3, R 91.7, F1 26.2
• ORG: P 53.2, R 67.1, F1 59.3
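The F1 values reported above follow from the standard definition of F1 as the harmonic mean of precision (P) and recall (R). The short check below recomputes them from the reported P and R; the function name `f1_score` is just a label chosen for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per entity type, as given on the slide.
reported = {
    "PER": (68.7, 56.2, 61.8),
    "LOC": (15.3, 91.7, 26.2),
    "ORG": (53.2, 67.1, 59.3),
}

for entity_type, (p, r, f1) in reported.items():
    print(f"{entity_type}: computed F1 = {f1_score(p, r):.1f}, reported F1 = {f1}")
```

The LOC row shows why F1 is reported alongside recall: a recall of 91.7 combined with a precision of 15.3 still yields a low F1.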
Factual Knowledge (WP4)
• Ontologies create shared meaning and are
a cornerstone of the Semantic Web
• Manual construction of ontologies is
cumbersome and expensive
• Ontology learning is a (semi-)automatic
process to assist the ontology engineer
• uComp builds on an existing ontology
learning framework
Protégé Plugin
• Goal: Apply the uComp HC framework to
ontology learning and other ontology
construction tasks
• How: A plugin implemented for Protégé, a
popular ontology engineering platform,
using the uComp HC API to validate
ontological entities
Knowledge Creation Lifecycle
Knowledge Quality Evaluation
• Feasibility Study
• Cost: Reduction of 40% to 83% depending on
design used
• Quality: Comparable with that of tasks performed
by ontology engineers
• Large-Scale Evaluation in Medical Domain
• Result Quality: Accuracy of 89% / 99%
• Completion Time: Similar to domain experts
• Cost Reduction of 75% to 81%
Affective Knowledge (WP5)
• Use HC to produce affective resources that
are difficult to obtain automatically and too
costly to produce manually, for multiple
languages (EN, FR, DE).
• Assess HC-produced resources by
evaluating the performance impact of using
them instead of traditional resources for
opinion mining and sentiment analysis
(quantitative black-box methodology).
• Assess the possibility of replacing static gold
standard resources with dynamic HC
Affective Model
Multilingual Twitter Data
Crowdsourcing lexicon validation experiment
• French Affective Lexicon (9,939 Entries)
• Task Design
• Results
• Feasibility depends on workers’ motivation
• Good quality/cost ratio
• Ethical and legal issues
• Evaluation: percentage of crowdsourced validated terms per affective class
Evaluation
• Data Annotation
• Expert Annotation: 30,000 tweets (50% French, 50% German);
French: complete, German: in progress
• Annotation Guide
• 7 Entities: Opinion Holder, Opinion Target, Opinion
/ Sentiment / Emotion Expression, Negation,
Modifier, Global OSE Recipient
• 6 Relations: SAYS, ABOUT, NEG, MOD and
RECEIVER
• Evaluation Campaign – DEFT2015
• 22 participants registered
• Polarity, emotion, and opinion holder/target detection
• DEFT Workshop at TALN 2015
Dissemination & Impact (WP6)
• Web Site: www.ucomp.eu; Twitter Presence: @uCompEU
• Deliverables: 17
• Y1: D1.1, D1.2, D2.1, D3.1, D5.1, D6.1, D6.2, D7.1, D7.2, D7.3
• Y2: D1.3, D3.2, D3.4, D4.2, D5.2, D5.3, D7.4
• Scientific Publications: 24
• Open-Source Toolkits: 4
• eWRT, TwitIE, GATE HC Plugin, Protégé Plugin
• Collaboration: DecarboNet (Climate Challenge), PHEME
(Evaluation), Member of the European Center for Social Media
• Training and Teaching
• Two week-long courses on Mining and Crowdsourcing Social Media
Corpora. GATE Summer School (8-12 June 2015; 9-13 June 2014)
• Tutorial: Knowledge Extraction from Social Media with GATE.
12th Extended Semantic Web Conference (ESWC-2015)
• Tutorial: NLP for Social Media. 14th Conference of the European Chapter
of the Association for Computational Linguistics (EACL-2014)
Project Management (WP7)
• Project duration extended by six months
until 14 May 2016 (key staff leaving at MOD and
USFD; recruitment delays at WU)
• Changes to Work Plan
• D2.2 - Postpone to M30 (matching completion of T2.3
and T2.4);
• D2.3 - Postpone to M40 (matching T2.5);
• D3.3 - Postpone to M42 (matching completion of T3.4);
• D5.2 v2 and D5.3 v2 - Postpone to M36 (to allow prior
completion of D2.2 at M30);
• D5.4 - Postpone to M42;
• D6.3 - Postpone to M42 (as this needs to report on all
the work done until the end of the project).

Embedded Human Computation for Knowledge Extraction and Evaluation
