SlideShare a Scribd company logo
1 of 44
DEEP LEARNING MODELS
FOR
QUESTION ANSWERING
Sujit Pal & Abhishek Sharma
Elsevier Search Guild Question Answering Workshop
October 5-6, 2016
About Us
Sujit Pal
Technology Research Director
Elsevier Labs
Abhishek Sharma
Organizer, DLE Meetup
and
Software Engineer, Salesforce
2
• Watching Udacity “Deep Learning” videos taught by
Vincent Vanhoucke.
• Watching “Deep Learning for Natural Language
Processing” (CS 224d) videos taught by Richard
Socher.
• Thinking that it might be interesting to do “something”
around Question Answering and Deep Learning.
How we started
3
What we knew
Deep
Learning
Question
Answering
Machine
Learning
Tensorflow
Keras
Python
Search
(Lucene/Solr/ES)
Natural
Language
Processing
gensim
4
Identify Scope
5
Question Answering Pipeline
6
Question Answering Pipeline
7
Research
8
This had just ended…
and the #4 ranked entry used Deep Learning for their solution.
9
10
11
12
13
14
15
Building Blocks
16
Most DL Networks (including Question Answering
models) composed out of these basic building blocks.
• Fully Connected Network
• Word Embedding
• Convolutional Neural Network
• Recurrent Neural Network
Building blocks
17
Fully Connected Network
Credit: neuralnetworksanddeeplearning.com
• Workhorse architecture of Deep Learning.
• Number of layers and number of units per layer increased for more
complex models.
• Used for all kinds of problem spaces.
18
Word Embedding
Credit:Sebastian Ruder // sebastianruder.com
• Projects sparse 1-hot
vector representation onto
denser lower dimensional
space.
• Unsupervised technique.
• Embedding space exhibits
Distributional Semantics.
• Has almost completely
replaced traditional
distributional features in
NLP (Deep Learning and
non Deep Learning).
• Word2Vec (CBOW and
Skip-gram), GloVe.
19
Convolutional Neural Network
• Alternate Convolution and Pooling Operations to extract relevant
features
• Mainly used in Image recognition, exploits geometry of image.
• 1D variant (Convolution and Pooling) used for text.
• Exploits word neighborhoods and extracts “meaning” of sentences or
paragraphs.
Credit: deeplearning.net
20
Recurrent Neural Network
Credit: Andrej Karpathy // karpathy.github.io
• Works with sequence input (such as text and audio).
• Exploits temporal nature of the data.
• Many variations possible (shown below).
• Basic RNN suffers from vanishing gradient problem – addressed by
Long Short Term Memory (LSTM) RNNs.
• Gated Recurrent Unit (GRU) another variant with simpler structure
and better performance than LSTM.
21
Start with bAbI
22
• Synthetic Dataset (1000, 10k, … records)
• Composed of actors, places, things, actions, etc.
• Released by Facebook Research
bAbI Dataset
• Single supporting fact
• Two supporting facts
• Three supporting facts
• Two argument relations
• Three argument relations
• Yes/No questions
• Counting
• Lists/sets
• Simple Negation
• Indefinite Knowledge
• Basic Coreference
• Conjunction
• Compound Coreference
• Time Reasoning
• Basic Deduction
• Basic Induction
• Positional Reasoning
• Size Reasoning
• Path Finding
• Agent’s Motivations
23
bAbI Format (task 1)
24
bAbI LSTM
• Implementation based
on the paper: Towards
AI-Complete Question
Answering: A Set of
Prerequisite Toy
Tasks.
• Test accuracy reported
in paper: 50%.
• Test accuracy achieved
by implementation:
56%.
• Code is here.
25
bAbI MemNN
• Implementation based on
the paper: End-to-end
Memory Networks.
• Test accuracy reported in
paper: 99%.
• Test accuracy achieved by
implementation: 42%.
• Code is here.
26
Back to Kaggle
27
Data Format
• 2000 multiple choice 8th Grade Science questions with 4
candidate answers and correct answer label.
• 2000 questions without correct answer label.
• Each question = 1 positive + 3 negative examples.
• No story here.
28
QA-LSTM
• Implementation based
on the paper: LSTM-
based Deep Learning
Models for Non-factoid
Answer Selection.
• Test accuracy reported
in paper: 64.3%
(InsuranceQA dataset).
• Test accuracy achieved
by implementation:
56.93% (unidirectional)
and 57% (bidirectional).
• Code: unidirectional,
bidirectional
29
Our Embedding Approach
• 3 approaches to Embedding
• Generate from Data
• Use External model for lookup
• Initialize with External model, then fine tune.
• Not enough question data to generate good embedding.
• Used pre-trained Google News word2vec model (trained
with 3 billion words)
• Model has 3 million word vectors of dimension (300,).
• Uses gensim to read model.
30
QA-LSTM-CNN
• Additional CNN Layer
for more effective
summarization.
• Test accuracy reported
in paper: 62.2%
(InsuranceQA dataset).
• Test accuracy achieved
by implementation:
56.3% (unidirectional),
did not try bidirectional.
• Code is here.
31
Incorporating Attention
• Vanishing Gradient problem addressed by LSTMs, but still
shows up in long range Q+A contexts.
• Solved using Attention Models
• Based on visual models of human attention.
• Allow the network to focus on certain words in question with “high
resolution” and the rest at “low resolution”.
• Similar to advice given for comprehension tests about reading the
questions, then scanning passage for question keywords.
• Implemented here as a dot product of question and answer, or
question and story vectors.
32
QA-LSTM + Attention
• Attention vector from
question and answer
combined with question.
• Test accuracy reported
in paper: 68.4%
(InsuranceQA dataset).
• Test accuracy achieved
by implementation:
62.93% (unidirectional),
60.43% (bidirectional)
• Code: unidirectional,
bidirectional.
33
Incorporating External Knowledge
• Contestants were allowed/advised to use external
sources such as ConceptNet, CK-12 books, Quizlets,
Flashcards from StudyStack, etc.
• Significant crawling/scraping and parsing effort involved.
• 4th place winner (tambietm) provides parsed download of
StudyStack Flashcards on his Google drive.
• Flashcard “story” = question || answer
34
Using Story Embedding
• Build Word2Vec model using words from Flashcards.
• Approximately 500k flashcards, 8,000 unique words.
• Provides smaller, more focused embedding space.
• Good performance boost over default Word2Vec
embedding.
35
Model Default
Embedding
Story
Embedding
QA-LSTM with Attention 62.93 76.27
QA-LSTM Bidirectional with Attention 60.43 76.27
Relating Story to Question
• Replicate bAbI setup: (story, question, answer).
• Only a subset of flashcards relate to given question.
• Using traditional IR methods to generate flashcard stories
for each question.
36
QA-LSTM + Story
• Story and Question
combined to create Fact
vector.
• Question and Answer
combined to create
Attention vector
• Fact and Attention
vectors concatenated.
• Test accuracy achieved
by implementation:
70.47% (unidirectional),
61.77% (bidirectional).
• Code: unidirectional,
bidirectional.
37
Results
Model Specifications Test Accuracy
(%)
QA-LSTM (Baseline) 56.93
QA-LSTM Bidirectional 57.0
QA-LSTM + CNN 55.7
QA-LSTM with Attention 62.93
QA-LSTM Bidirectional with Attention 60.43
QA-LSTM with Attention + Custom Embedding 76.27 *
QA-LSTM Bidirectional w/Attention + Custom Embedding 76.27 *
QA-LSTM + Attention + Story Facts 70.47
QA-LSTM Bidirectional + Attention + Story Facts 61.77
38
Model Deployment
• Our models predict answer correct vs. incorrect.
• Task is to choose the correct answer from candidate
answers.
• Re-instantiate trained model with Softmax layer removed.
• Run batch of (story, question, answer) for each candidate
answer.
• Select best scoring answer as correct answer.
39
Deploying Model - Example
40
Future Work
• Would like to implement Dynamic Memory Network
(DMN) and Hierarchical Attention Based Convolutional
Neural Network (HABCNN) models against this data.
• Would like to submit to Kaggle to see how I did once
Keras Issue 3927 is resolved.
• Would like to try out these models against Stanford
Question Answering Dataset (SQuAD) based on
Wikipedia articles.
• Would like to investigate Question Generation from text in
order to generate training sets for Elsevier corpora.
41
Administrivia
• Code for this talk:https://github.com/sujitpal/dl-models-for-
qa
• Contact me for questions and suggestions:
sujit.pal@elsevier.com
42
Closing Thoughts
• Deep Learning is rapidly becoming a general purpose
solution for nearly all learning problems.
• Information Retrieval approaches are still more successful
on Question Answering than Deep Learning, but there are
many efforts by Deep Learning researchers to change
that.
43
Thank you.
44

More Related Content

What's hot

What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...Simplilearn
 
[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems
[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems
[AAAI 2019 tutorial] End-to-end goal-oriented question answering systemsQi He
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Yuriy Guts
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Saeedeh Shekarpour
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer modelsDing Li
 
Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Daniel Adenew
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelHemantha Kulathilake
 
AlexNet, VGG, GoogleNet, Resnet
AlexNet, VGG, GoogleNet, ResnetAlexNet, VGG, GoogleNet, Resnet
AlexNet, VGG, GoogleNet, ResnetJungwon Kim
 
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Edureka!
 
Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Prakash Pimpale
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language ModelsLeon Dohmen
 
Introduction to Keras
Introduction to KerasIntroduction to Keras
Introduction to KerasJohn Ramey
 
Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learningDongHyun Kwak
 
Neural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) AlgorithmNeural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) AlgorithmMostafa G. M. Mostafa
 
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...Universitat Politècnica de Catalunya
 
Nlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesNlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesankit_ppt
 
Learning to Rank with Neural Networks
Learning to Rank with Neural NetworksLearning to Rank with Neural Networks
Learning to Rank with Neural NetworksBhaskar Mitra
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnnKuppusamy P
 

What's hot (20)

What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
 
[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems
[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems
[AAAI 2019 tutorial] End-to-end goal-oriented question answering systems
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...
 
Deep learning
Deep learningDeep learning
Deep learning
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
 
AlexNet, VGG, GoogleNet, Resnet
AlexNet, VGG, GoogleNet, ResnetAlexNet, VGG, GoogleNet, Resnet
AlexNet, VGG, GoogleNet, Resnet
 
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
 
Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
Introduction to Keras
Introduction to KerasIntroduction to Keras
Introduction to Keras
 
Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learning
 
Image captioning
Image captioningImage captioning
Image captioning
 
Neural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) AlgorithmNeural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) Algorithm
 
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
 
Nlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesNlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniques
 
Learning to Rank with Neural Networks
Learning to Rank with Neural NetworksLearning to Rank with Neural Networks
Learning to Rank with Neural Networks
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 

Viewers also liked

Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...
Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...
Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...Andrew Gardner
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksChristian Perone
 
Deep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsDeep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsBuhwan Jeong
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
 
Deep Learning through Examples
Deep Learning through ExamplesDeep Learning through Examples
Deep Learning through ExamplesSri Ambati
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningSri Ambati
 

Viewers also liked (6)

Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...
Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...
Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 201...
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Deep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsDeep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applications
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
Deep Learning through Examples
Deep Learning through ExamplesDeep Learning through Examples
Deep Learning through Examples
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine Learning
 

Similar to Deep Learning Models for Question Answering

Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Lucidworks
 
Improving neural question generation using answer separation
Improving neural question generation using answer separationImproving neural question generation using answer separation
Improving neural question generation using answer separationNAVER Engineering
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNAVER Engineering
 
Deep learning Tutorial - Part II
Deep learning Tutorial - Part IIDeep learning Tutorial - Part II
Deep learning Tutorial - Part IIQuantUniversity
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017Manish Pandey
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel BriandPtidej Team
 
Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Dalei Li
 
Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...
Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...
Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...Lucidworks
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsLionel Briand
 
Practical Constraint Solving for Generating System Test Data
Practical Constraint Solving for Generating System Test DataPractical Constraint Solving for Generating System Test Data
Practical Constraint Solving for Generating System Test DataLionel Briand
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural NetworksDatabricks
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningMLAI2
 
Machine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionMachine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionTe-Yen Liu
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Lionel Briand
 
Machine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An IntroMachine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An IntroSi Krishan
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEFINALYEARSTUDENTPROJECT
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEEFINALYEARSTUDENTPROJECTS
 

Similar to Deep Learning Models for Question Answering (20)

Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
 
Improving neural question generation using answer separation
Improving neural question generation using answer separationImproving neural question generation using answer separation
Improving neural question generation using answer separation
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
Deep learning Tutorial - Part II
Deep learning Tutorial - Part IIDeep learning Tutorial - Part II
Deep learning Tutorial - Part II
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel Briand
 
Icml2017 overview
Icml2017 overviewIcml2017 overview
Icml2017 overview
 
Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...
 
Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...
Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...
Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis ...
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
 
Practical Constraint Solving for Generating System Test Data
Practical Constraint Solving for Generating System Test DataPractical Constraint Solving for Generating System Test Data
Practical Constraint Solving for Generating System Test Data
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
 
Machine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionMachine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis Introduction
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...
 
Machine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An IntroMachine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An Intro
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
 

More from Sujit Pal

Supporting Concept Search using a Clinical Healthcare Knowledge Graph
Supporting Concept Search using a Clinical Healthcare Knowledge GraphSupporting Concept Search using a Clinical Healthcare Knowledge Graph
Supporting Concept Search using a Clinical Healthcare Knowledge GraphSujit Pal
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Building Learning to Rank (LTR) search reranking models using Large Language ...
Building Learning to Rank (LTR) search reranking models using Large Language ...Building Learning to Rank (LTR) search reranking models using Large Language ...
Building Learning to Rank (LTR) search reranking models using Large Language ...Sujit Pal
 
Cheap Trick for Question Answering
Cheap Trick for Question AnsweringCheap Trick for Question Answering
Cheap Trick for Question AnsweringSujit Pal
 
Searching Across Images and Test
Searching Across Images and TestSearching Across Images and Test
Searching Across Images and TestSujit Pal
 
Learning a Joint Embedding Representation for Image Search using Self-supervi...
Learning a Joint Embedding Representation for Image Search using Self-supervi...Learning a Joint Embedding Representation for Image Search using Self-supervi...
Learning a Joint Embedding Representation for Image Search using Self-supervi...Sujit Pal
 
The power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestringThe power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestringSujit Pal
 
Backprop Visualization
Backprop VisualizationBackprop Visualization
Backprop VisualizationSujit Pal
 
Accelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudAccelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudSujit Pal
 
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Sujit Pal
 
Leslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal ClubLeslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal ClubSujit Pal
 
Using Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based RetrievalUsing Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based RetrievalSujit Pal
 
Transformer Mods for Document Length Inputs
Transformer Mods for Document Length InputsTransformer Mods for Document Length Inputs
Transformer Mods for Document Length InputsSujit Pal
 
Question Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other StoriesQuestion Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other StoriesSujit Pal
 
Building Named Entity Recognition Models Efficiently using NERDS
Building Named Entity Recognition Models Efficiently using NERDSBuilding Named Entity Recognition Models Efficiently using NERDS
Building Named Entity Recognition Models Efficiently using NERDSSujit Pal
 
Graph Techniques for Natural Language Processing
Graph Techniques for Natural Language ProcessingGraph Techniques for Natural Language Processing
Graph Techniques for Natural Language ProcessingSujit Pal
 
Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildSujit Pal
 
Search summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSearch summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSujit Pal
 
Search summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slidesSearch summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slidesSujit Pal
 
SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSujit Pal
 

More from Sujit Pal (20)

Supporting Concept Search using a Clinical Healthcare Knowledge Graph
Supporting Concept Search using a Clinical Healthcare Knowledge GraphSupporting Concept Search using a Clinical Healthcare Knowledge Graph
Supporting Concept Search using a Clinical Healthcare Knowledge Graph
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Building Learning to Rank (LTR) search reranking models using Large Language ...
Building Learning to Rank (LTR) search reranking models using Large Language ...Building Learning to Rank (LTR) search reranking models using Large Language ...
Building Learning to Rank (LTR) search reranking models using Large Language ...
 
Cheap Trick for Question Answering
Cheap Trick for Question AnsweringCheap Trick for Question Answering
Cheap Trick for Question Answering
 
Searching Across Images and Test
Searching Across Images and TestSearching Across Images and Test
Searching Across Images and Test
 
Learning a Joint Embedding Representation for Image Search using Self-supervi...
Learning a Joint Embedding Representation for Image Search using Self-supervi...Learning a Joint Embedding Representation for Image Search using Self-supervi...
Learning a Joint Embedding Representation for Image Search using Self-supervi...
 
The power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestringThe power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestring
 
Backprop Visualization
Backprop VisualizationBackprop Visualization
Backprop Visualization
 
Accelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudAccelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn Cloud
 
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
 
Leslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal ClubLeslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal Club
 
Using Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based RetrievalUsing Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based Retrieval
 
Transformer Mods for Document Length Inputs
Transformer Mods for Document Length InputsTransformer Mods for Document Length Inputs
Transformer Mods for Document Length Inputs
 
Question Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other StoriesQuestion Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other Stories
 
Building Named Entity Recognition Models Efficiently using NERDS
Building Named Entity Recognition Models Efficiently using NERDSBuilding Named Entity Recognition Models Efficiently using NERDS
Building Named Entity Recognition Models Efficiently using NERDS
 
Graph Techniques for Natural Language Processing
Graph Techniques for Natural Language ProcessingGraph Techniques for Natural Language Processing
Graph Techniques for Natural Language Processing
 
Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search Guild
 
Search summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSearch summit-2018-ltr-presentation
Search summit-2018-ltr-presentation
 
Search summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slidesSearch summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slides
 
SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming text
 

Recently uploaded

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlkumarajju5765
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 

Recently uploaded (20)

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 

Deep Learning Models for Question Answering

  • 1. DEEP LEARNING MODELS FOR QUESTION ANSWERING Sujit Pal & Abhishek Sharma Elsevier Search Guild Question Answering Workshop October 5-6, 2016
  • 2. About Us Sujit Pal Technology Research Director Elsevier Labs Abhishek Sharma Organizer, DLE Meetup and Software Engineer, Salesforce 2
  • 3. • Watching Udacity “Deep Learning” videos taught by Vincent Vanhoucke. • Watching “Deep Learning for Natural Language Processing” (CS 224d) videos taught by Richard Socher. • Thinking that it might be interesting to do “something” around Question Answering and Deep Learning. How we started 3
  • 9. This had just ended… and the #4 ranked entry used Deep Learning for their solution. 9
  • 10. 10
  • 11. 11
  • 12. 12
  • 13. 13
  • 14. 14
  • 15. 15
  • 17. Most DL Networks (including Question Answering models) composed out of these basic building blocks. • Fully Connected Network • Word Embedding • Convolutional Neural Network • Recurrent Neural Network Building blocks 17
  • 18. Fully Connected Network Credit: neuralnetworksanddeeplearning.com • Workhorse architecture of Deep Learning. • Number of layers and number of units per layer increased for more complex models. • Used for all kinds of problem spaces. 18
  • 19. Word Embedding Credit:Sebastian Ruder // sebastianruder.com • Projects sparse 1-hot vector representation onto denser lower dimensional space. • Unsupervised technique. • Embedding space exhibits Distributional Semantics. • Has almost completely replaced traditional distributional features in NLP (Deep Learning and non Deep Learning). • Word2Vec (CBOW and Skip-gram), GloVe. 19
  • 20. Convolutional Neural Network • Alternate Convolution and Pooling Operations to extract relevant features • Mainly used in Image recognition, exploits geometry of image. • 1D variant (Convolution and Pooling) used for text. • Exploits word neighborhoods and extracts “meaning” of sentences or paragraphs. Credit: deeplearning.net 20
  • 21. Recurrent Neural Network Credit: Andrej Karpathy // karpathy.github.io • Works with sequence input (such as text and audio). • Exploits temporal nature of the data. • Many variations possible (shown below). • Basic RNN suffers from vanishing gradient problem – addressed by Long Short Term Memory (LSTM) RNNs. • Gated Recurrent Unit (GRU) another variant with simpler structure and better performance than LSTM. 21
  • 23. • Synthetic Dataset (1000, 10k, … records) • Composed of actors, places, things, actions, etc. • Released by Facebook Research bAbI Dataset • Single supporting fact • Two supporting facts • Three supporting facts • Two argument relations • Three argument relations • Yes/No questions • Counting • Lists/sets • Simple Negation • Indefinite Knowledge • Basic Coreference • Conjunction • Compound Coreference • Time Reasoning • Basic Deduction • Basic Induction • Positional Reasoning • Size Reasoning • Path Finding • Agent’s Motivations 23
  • 25. bAbI LSTM • Implementation based on the paper: Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks. • Test accuracy reported in paper: 50%. • Test accuracy achieved by implementation: 56%. • Code is here. 25
  • 26. bAbI MemNN • Implementation based on the paper: End-to-end Memory Networks. • Test accuracy reported in paper: 99%. • Test accuracy achieved by implementation: 42%. • Code is here. 26
  • 28. Data Format • 2000 multiple choice 8th Grade Science questions with 4 candidate answers and correct answer label. • 2000 questions without correct answer label. • Each question = 1 positive + 3 negative examples. • No story here. 28
  • 29. QA-LSTM • Implementation based on the paper: LSTM- based Deep Learning Models for Non-factoid Answer Selection. • Test accuracy reported in paper: 64.3% (InsuranceQA dataset). • Test accuracy achieved by implementation: 56.93% (unidirectional) and 57% (bidirectional). • Code: unidirectional, bidirectional 29
  • 30. Our Embedding Approach • 3 approaches to Embedding • Generate from Data • Use External model for lookup • Initialize with External model, then fine tune. • Not enough question data to generate good embedding. • Used pre-trained Google News word2vec model (trained with 3 billion words) • Model has 3 million word vectors of dimension (300,). • Uses gensim to read model. 30
  • 31. QA-LSTM-CNN • Additional CNN Layer for more effective summarization. • Test accuracy reported in paper: 62.2% (InsuranceQA dataset). • Test accuracy achieved by implementation: 56.3% (unidirectional), did not try bidirectional. • Code is here. 31
  • 32. Incorporating Attention • Vanishing Gradient problem addressed by LSTMs, but still shows up in long range Q+A contexts. • Solved using Attention Models • Based on visual models of human attention. • Allow the network to focus on certain words in question with “high resolution” and the rest at “low resolution”. • Similar to advice given for comprehension tests about reading the questions, then scanning passage for question keywords. • Implemented here as a dot product of question and answer, or question and story vectors. 32
  • 33. QA-LSTM + Attention • Attention vector from question and answer combined with question. • Test accuracy reported in paper: 68.4% (InsuranceQA dataset). • Test accuracy achieved by implementation: 62.93% (unidirectional), 60.43% (bidirectional) • Code: unidirectional, bidirectional. 33
  • 34. Incorporating External Knowledge • Contestants were allowed/advised to use external sources such as ConceptNet, CK-12 books, Quizlets, Flashcards from StudyStack, etc. • Significant crawling/scraping and parsing effort involved. • 4th place winner (tambietm) provides parsed download of StudyStack Flashcards on his Google drive. • Flashcard “story” = question || answer 34
  • 35. Using Story Embedding • Build Word2Vec model using words from Flashcards. • Approximately 500k flashcards, 8,000 unique words. • Provides smaller, more focused embedding space. • Good performance boost over default Word2Vec embedding. 35 Model Default Embedding Story Embedding QA-LSTM with Attention 62.93 76.27 QA-LSTM Bidirectional with Attention 60.43 76.27
  • 36. Relating Story to Question • Replicate bAbI setup: (story, question, answer). • Only a subset of flashcards relate to given question. • Using traditional IR methods to generate flashcard stories for each question. 36
  • 37. QA-LSTM + Story • Story and Question combined to create Fact vector. • Question and Answer combined to create Attention vector • Fact and Attention vectors concatenated. • Test accuracy achieved by implementation: 70.47% (unidirectional), 61.77% (bidirectional). • Code: unidirectional, bidirectional. 37
  • 38. Results Model Specifications Test Accuracy (%) QA-LSTM (Baseline) 56.93 QA-LSTM Bidirectional 57.0 QA-LSTM + CNN 55.7 QA-LSTM with Attention 62.93 QA-LSTM Bidirectional with Attention 60.43 QA-LSTM with Attention + Custom Embedding 76.27 * QA-LSTM Bidirectional w/Attention + Custom Embedding 76.27 * QA-LSTM + Attention + Story Facts 70.47 QA-LSTM Bidirectional + Attention + Story Facts 61.77 38
  • 39. Model Deployment • Our models predict answer correct vs. incorrect. • Task is to choose the correct answer from candidate answers. • Re-instantiate trained model with Softmax layer removed. • Run batch of (story, question, answer) for each candidate answer. • Select best scoring answer as correct answer. 39
  • 40. Deploying Model - Example 40
  • 41. Future Work • Would like to implement Dynamic Memory Network (DMN) and Hierarchical Attention Based Convolutional Neural Network (HABCNN) models against this data. • Would like to submit to Kaggle to see how I did once Keras Issue 3927 is resolved. • Would like to try out these models against Stanford Question Answering Dataset (SQuAD) based on Wikipedia articles. • Would like to investigate Question Generation from text in order to generate training sets for Elsevier corpora. 41
  • 42. Administrivia • Code for this talk:https://github.com/sujitpal/dl-models-for- qa • Contact me for questions and suggestions: sujit.pal@elsevier.com 42
  • 43. Closing Thoughts • Deep Learning is rapidly becoming a general purpose solution for nearly all learning problems. • Information Retrieval approaches are still more successful on Question Answering than Deep Learning, but there are many efforts by Deep Learning researchers to change that. 43

Editor's Notes

  1. Proposing the bAbI dataset.
  2. Recurrent attention model over large external memory.
  3. Closest to our use case, this is what our solution to the Kaggle comp is based on.
  4. Socher’s dynamic memory models.
  5. DMN, bAbI, Socher again
  6. MCTest Benchmark, HABCNN
  7. [1] bAbI dataset, [2] representation size (number of words), [3] MCTest Benchmark, Hierarchical Attention Based CNN (HABCNN), [4] DMN, Richard Socher [5]
  8. [1] bAbI dataset, [2] representation size (number of words), [3] MCTest Benchmark, Hierarchical Attention Based CNN (HABCNN), [4] DMN, Richard Socher [5]
  9. [1] bAbI dataset, [2] representation size (number of words), [3] MCTest Benchmark, Hierarchical Attention Based CNN (HABCNN), [4] DMN, Richard Socher [5]