Why Reinvent the Wheel: Let’s Build Question Answering Systems Together
Kuldeep Singh, Arun Sethupat, Andreas Both, Saeedeh Shekarpour, Ioanna Lytra,
Ricardo Usbeck, Akhilesh Vyas, Akmal Khikmatullaev, Dharmen Punjani, Christoph Lange,
Maria-Esther Vidal, Jens Lehmann, Sören Auer
The Web Conference 2018 26.04.2018
State of the Art
● More than 60 Question Answering (QA) systems for structured data have been developed in the last 6 years.
● QA systems implement similar tasks to answer a user’s question.
Motivating Example
Problem
Approach
● Step 1: Prediction of the top k QA components per task
● Step 2: Formation of QA pipelines using a greedy algorithm
QA Optimisation Pipeline Algorithm
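The two steps can be illustrated with a minimal sketch. It is not the authors' implementation: the predicted scores are invented for illustration, the function names are hypothetical, and "greedy" is read here as picking the locally best shortlisted component for each task in turn. Only the component and task names come from the slides.

```python
# Hypothetical predicted scores for one question:
# task -> {component: predicted performance}. Values are illustrative only.
predicted = {
    "NED": {"Tag Me": 0.7, "DBpediaSpotlight": 0.6},
    "RL": {"ReMatch": 0.5, "RNLIWOD": 0.4},
    "QB": {"NLIWOD QB": 0.5},
}


def top_k_components(scores, k):
    """Step 1: keep the k components with the highest predicted score per task."""
    return {
        task: sorted(comps, key=comps.get, reverse=True)[:k]
        for task, comps in scores.items()
    }


def greedy_pipeline(scores, task_order, k=1):
    """Step 2 (assumed greedy reading): walk through the tasks in order and
    take the best-ranked component from each task's top-k shortlist."""
    shortlist = top_k_components(scores, k)
    return [shortlist[task][0] for task in task_order]


shortlist = top_k_components(predicted, 2)
pipeline = greedy_pipeline(predicted, ["NED", "RL", "QB"])
print(shortlist)
print(pipeline)
```

A task with fewer than k components simply contributes all of them to the shortlist, which is why QB ends up with a single entry here.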
Frankenstein Framework
Frankenstein - Characteristics
Corpus Creation

Dataset   Questions Considered  Complexity  Simple Questions
LC-QuAD   3252                  High        22%
QALD-5    204                   Low         53%
Preparing the Training Dataset: Question Features
Evaluation Metrics (per component)

Metric             Description
Precision (Micro)  (# correct answers retrieved for q) / (# answers retrieved for q)
Recall (Micro)     (# correct answers retrieved for q) / (# gold standard answers for q)
Precision          Average of Micro Precision over all questions
Recall             Average of Micro Recall over all questions
F-Score (Micro)    Harmonic mean of Micro Precision and Micro Recall
F-Score            Harmonic mean of Precision and Recall
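The metric definitions above can be written down as a short sketch. The function names and the toy answers are hypothetical, and returning 0.0 for an empty answer set or a zero denominator is an assumption the slides do not state.

```python
def micro_precision(retrieved, gold):
    """Correct answers retrieved for q divided by all answers retrieved for q."""
    retrieved, gold = set(retrieved), set(gold)
    # Assumption: an empty result set counts as precision 0.0.
    return len(retrieved & gold) / len(retrieved) if retrieved else 0.0


def micro_recall(retrieved, gold):
    """Correct answers retrieved for q divided by the gold standard answers for q."""
    retrieved, gold = set(retrieved), set(gold)
    # Assumption: a question without gold answers counts as recall 0.0.
    return len(retrieved & gold) / len(gold) if gold else 0.0


def harmonic_mean(p, r):
    """Harmonic mean of two values, defined as 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def evaluate(results):
    """results: one (retrieved answers, gold answers) pair per question."""
    mp = [micro_precision(r, g) for r, g in results]
    mr = [micro_recall(r, g) for r, g in results]
    precision = sum(mp) / len(mp)  # average of Micro Precision
    recall = sum(mr) / len(mr)     # average of Micro Recall
    return {
        "precision": precision,
        "recall": recall,
        "f_score": harmonic_mean(precision, recall),
    }


# Toy example with two questions (made-up answers).
scores = evaluate([(["a", "b"], ["a"]), (["c"], ["c", "d"])])
print(scores)
```

In the toy example the micro precisions are 0.5 and 1.0 and the micro recalls are 1.0 and 0.5, so Precision, Recall and F-Score all come out at 0.75.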
Evaluating Component Performance

QA Task  Dataset   Precision  Recall  Macro F-score
QB       LC-QuAD   0.48       0.49    0.48
QB       QALD-5    0.49       0.50    0.49
CL       LC-QuAD   0.47       0.59    0.52
CL       QALD-5    0.58       0.64    0.61
NED      LC-QuAD   0.69       0.66    0.67
NED      QALD-5    0.67       0.75    0.71
RL       LC-QuAD   0.25       0.22    0.23
RL       QALD-5    0.54       0.74    0.62

Best components: QB: NLIWOD QB (both datasets); CL: OKBQA DM CLS (both datasets); NED: DBpediaSpotlight, Tag Me; RL: ReMatch, RNLIWOD.
Variation in Performance of Components

Task  Question Type                                                 Highest Precision among all components over LC-QuAD
NED   India → dbr:India (#entities=2)                               0.91
NED   English → dbr:English_Speaking_Country (#entities=1)          0.24
RL    Number of relations = 1, relation is explicit                 0.46
RL    No natural language relation (e.g. “Give me all cosmonauts”)  0.0
Key Message
No one is perfect!
Question Type defines the perfection!
Evaluating Pipeline Performance (task level)
10-fold cross-validation experiments on the LC-QuAD dataset

QA Task  Total Questions  Answerable Questions  Frankenstein* Top 1  Top 2  Top 3  Best component of LC-QuAD (per task)
QB       324.3            175.4                 162.7                175.4  -      159.6
CL       324.3            76                    68.2                 76     -      68.1
NED      324.3            294.2                 236.3                270.9  284.3  245.2
RL       324.3            153.1                 84.2                 118.9  134.4  90.2

Message: Frankenstein’s dynamic composition outperforms the best components.
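One way to read the Top 1 / Top 2 / Top 3 columns is sketched below. It rests on an assumption the slides do not spell out: a question counts as answered at top k when at least one of the k highest-ranked components for its task answers it, and as answerable when any component does. The function names, rankings and outcomes are hypothetical.

```python
def answered_at_k(ranked_predictions, answering_components, k):
    """Count questions for which at least one of the top-k predicted
    components is among the components that actually answer them."""
    count = 0
    for qid, ranking in ranked_predictions.items():
        if set(ranking[:k]) & answering_components.get(qid, set()):
            count += 1
    return count


def answerable(answering_components):
    """Count questions that at least one component can answer."""
    return sum(1 for comps in answering_components.values() if comps)


# Made-up rankings and outcomes for three questions of the NED task.
ranked = {
    "q1": ["Tag Me", "DBpediaSpotlight"],
    "q2": ["DBpediaSpotlight", "Tag Me"],
    "q3": ["Tag Me", "DBpediaSpotlight"],
}
answers = {"q1": {"Tag Me"}, "q2": {"Tag Me"}, "q3": set()}

top1 = answered_at_k(ranked, answers, 1)
top2 = answered_at_k(ranked, answers, 2)
total_answerable = answerable(answers)
print(top1, top2, total_answerable)
```

Under this reading the count can only grow with k and never exceeds the number of answerable questions, which matches the pattern in the table.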
Evaluating Pipeline Performance (task level)
Using LC-QuAD for training, QALD for test data

QA Task  Total Questions  Answerable Questions  Frankenstein* Top 1  Top 2  Top 3  Best component of LC-QuAD (per task)
QB       204              119                   91                   119    -      102
CL       204              55                    52                   55     -      55
NED      204              168                   132                  153    163    109
RL       204              138                   83                   107    121    46

Message: Frankenstein’s dynamic composition outperforms the best components again!
Evaluating Pipeline Performance (pipeline level)
Using LC-QuAD for training, QALD for test data

Frankenstein Pipeline        Answered Questions  Precision  Recall  Macro F-score
With QALD’s best components  37                  0.17       0.19    0.18
Dynamic                      41                  0.20       0.21    0.20

Message:
1. The dynamic pipeline outperforms the static one.
2. The Query Builder is the bottleneck for accuracy.
3. The Relation Linkers used dominate the runtime.
No one is perfect!
Frankenstein can compose dynamic pipelines based on the type of question.
Missing intelligence in the Query Builder severely affects the QA pipeline.
Worst runtime → Relation Linker
Using the same feature set for all tasks impacts prediction performance.
Extend the feature set and dataset
Include more domain-independent components in Frankenstein
Improve the learning mechanism
Research how to add intelligence to the Query Builder
Research how to address implicit/hidden relations
Research how to identify implicit entities
Join us : https://github.com/WDAqua/Frankenstein
Find us on : http://wdaqua.eu/ and http://sda.cs.uni-bonn.de/
Email : kuldeep.singh@iais.fraunhofer.de
