Step-by-step approach
to question answering
Sewon Min
Seoul National University
2017.08.21
at .
Sewon Min
- Interested in Natural language understanding
with a focus on question answering
- Background
- Undergraduate at SNU (~2018)
- Research experience at UW (2016~2017)
- Publications
- Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi. “Neural Speed Reading”.
2017. (Under review)
- Sewon Min, Minjoon Seo, Hannaneh Hajishirzi. “Question Answering through
Transfer Learning from Large Fine-grained Supervision Data”. ACL. 2017.
- Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi. “Query-reduction
Networks”. ICLR. 2017.
Natural language
GAP
Current State Human-level
Contents
Current state in Question Answering
Expansion from current state
Question Answering through transfer learning
(my work)
Question Answering
Question → System → Answer
Context
- Structured data (DB)
- Text in natural language
- Dialog history
- Web data
Context-based Question Answering
Machine Comprehension
SQuAD
Southern California, often abbreviated SoCal, is a geographic
and cultural region that generally comprises California's
southernmost 10 counties. (…)
What is Southern California often abbreviated as?
Stanford Question Answering Dataset (2016)
SQuAD
Stanford Question Answering Dataset (2016)
Models
Match-LSTM (SMU), BiDAF (UW+AI2), DCN (Salesforce),
R-Net (Microsoft), AoA Reader (HIT + iFLYTEK) and many others
Performance
- System: EM 55 F1 68 → EM 78 F1 85
- Human: EM 82 F1 91
More information: https://rajpurkar.github.io/SQuAD-explorer/
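The EM and F1 figures above are span-level metrics. The following is a minimal sketch of how they are commonly computed for a single prediction, assuming simple normalization (lowercasing, removing punctuation and English articles) and whitespace tokenization. The function names are illustrative, and the official evaluation script may differ in details, for example when a question has several gold answers.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

F1 gives partial credit to a predicted span that overlaps the gold span, which is why the F1 numbers are higher than the EM numbers.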
SQuAD
Why popular?
1. Domain: Context-based, Wikipedia, Real questions
2. Task: Span-based answer
- Closer to real QA than Cloze-style
- Easier to evaluate than Free-form
3. Proper difficulty
WikiQA (Context Sentence Classification)
Who made airbus
Airbus SAS is an aircraft manufacturing subsidiary of EADS, a European aerospace
company. Airbus began as an union of aircraft companies …
CNN/Daily Mail (Cloze Style)
____________ says he understands why @entity0 won’t play at his tournament
... @entity0 called me personally to let me know that he wouldn’t be playing here at
@entity23 ,” @entity3 said on his @entity21 events website...
MS Marco (Free-form)
What energy is used in photosynthesis?
Photosynthesis is a process used by plants and other organisms to convert light
energy, normally from the Sun, into chemical energy (…)
[light energy] [energy of light] [solar energy] [Light energy is used in photosynthesis]
SQuAD
Proper difficulty (Also limitation)
1. Small-scale context
2. Requiring lexical & syntactic information (paraphrase)
3. Span-based answer
SQuAD
Southern California, often abbreviated SoCal, is a geographic
and cultural region that generally comprises California's
southernmost 10 counties. (…)
What is Southern California often abbreviated as?
What does SoCal stand for?
Demo (BiDAF Model): https://allenai.github.io/bi-att-flow/demo/
Contents
Current state in Question Answering
Expansion from current state
Question Answering through transfer learning
(my work)
How to expand the task?
1. Small-scale context → Large-scale context
2. Requiring lexical information → Requiring complex reasoning
3. Span-based answer → Free form answer
Large-scale context
Longer context: WikiReading, NewsQA
Multiple contexts: MS Marco, TriviaQA
Open-domain: SearchQA, DrQA
Why challenging?
Cost (Time & Memory)
more information != better performance
No effective and efficient model yet!
Models with hierarchical structure
Large-scale context → More data
We have a large amount of data (such as web data)
Approaches
1. Combination of information retrieval & question answering
2. Unsupervised learning
3. Transfer learning
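The first approach, combining information retrieval with question answering, can be sketched as a two-stage pipeline: a cheap retriever narrows a large collection down to a few passages, and a reader answers from those. This is only an illustration under stated assumptions: the IDF-weighted word-overlap scoring, the whitespace tokenizer, and the names `retrieve`, `answer` and `reader` are all hypothetical and do not come from any of the systems named above.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization is an assumption made for brevity.
    return text.lower().split()


def retrieve(question, passages, k=1):
    """Rank passages by IDF-weighted word overlap with the question."""
    docs = [set(tokenize(p)) for p in passages]
    doc_freq = Counter(word for doc in docs for word in doc)
    n = len(passages)
    question_words = set(tokenize(question))

    def score(doc):
        # Smoothed IDF: rarer words contribute more to the score.
        return sum(
            math.log((n + 1) / (doc_freq[w] + 1)) + 1
            for w in question_words
            if w in doc
        )

    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    return [passages[i] for i in ranked[:k]]


def answer(question, passages, reader, k=1):
    """Retrieve k passages, then let a reader answer from them.

    `reader` is a hypothetical callable (question, context) -> answer,
    for example a span-based model trained on SQuAD.
    """
    context = " ".join(retrieve(question, passages, k))
    return reader(question, context)
```

The reader only ever sees the retrieved passages, so its cost no longer grows with the size of the full collection; the trade-off is that a retrieval miss cannot be recovered downstream.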
(Diagram labels: annotated and unannotated data)
Complex Reasoning
James the Turtle was always getting in trouble. (…) One day, James thought
he would go into town and see what kind of trouble he could get into. He
went to the grocery store and pulled all the pudding off the shelves and ate
two jars. Then he walked to the fast food restaurant and ordered 15 bags of
fries. He didn't pay, and instead headed home. (…)
Where did James go after he went to the grocery store?
A) His deck
B) His freezer
C) A fast food restaurant
D) His room
MCTest
Complex Reasoning
MCTest (7-year-old level)
Science Questions Dataset (Elementary school)
RACE (Middle & High school)
Very difficult, not so popular
Deep learning models have limitations
Free-form Answer
MS Marco
1. Annotating the gold answer is difficult
What energy is used in photosynthesis?
Photosynthesis is a process used by plants and other organisms to convert light
energy, normally from the Sun, into chemical energy (…)
[light energy] [energy of light] [solar energy] [Light energy is used in photosynthesis]
Free-form Answer
2. Evaluation is difficult
- We want answers that are not restricted to spans of the context.
- We prefer a full sentence to a single word.
- However, such answers are hard to evaluate.
- The metric is incomplete (bag-of-words based).
Question: What is the capital city of South Korea?
Gold answer: The capital city of South Korea is Seoul.
Candidate "Seoul.": 1/8
Candidate "The capital city of South Korea is Tokyo.": 7/8
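The 1/8 and 7/8 scores can be reproduced with a simple bag-of-words overlap. The sketch below assumes the score is the fraction of gold-answer tokens that also appear in the candidate, with lowercasing and whitespace tokenization. The slide does not name the exact metric, so this is one plausible reading, and the function names are illustrative.

```python
from collections import Counter


def tokens(text):
    # Lowercase and drop periods; whitespace tokenization is an assumption.
    return text.lower().replace(".", "").split()


def overlap_score(candidate, gold):
    """Fraction of gold tokens that also appear in the candidate."""
    gold_counts = Counter(tokens(gold))
    cand_counts = Counter(tokens(candidate))
    matched = sum((gold_counts & cand_counts).values())
    return matched / sum(gold_counts.values())


gold = "The capital city of South Korea is Seoul."
short_but_correct = overlap_score("Seoul.", gold)  # 1 of 8 gold tokens matched
fluent_but_wrong = overlap_score("The capital city of South Korea is Tokyo.", gold)  # 7 of 8
```

The correct one-word answer scores far lower than the fluent but wrong sentence, because the metric counts shared words and ignores which word carries the answer.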
Free-form Answer
3. Designing a generation model is difficult
(Figure from the WikiReading dataset paper, Hewlett et al.)
Free-form Answer
3. Designing a generation model is difficult
WikiReading: Property instead of Question
- instance of, gender, country, date of birth, given name, …
Best model’s performance (F1)
- Given name: 88.7
- Date of opening: 30.1
WikiReading example
Property: Country
Folkart Towers are twin skyscrapers in the Bayrakli district of the Turkish city of Izmir.
Reaching a structural height of 200 m (656 ft) above ground level, (…)
Contents
Current state in Question Answering
Expansion from current state
Question Answering through transfer learning
(my work)
Transfer learning in QA
“Question Answering through Transfer Learning from Large Fine-grained
Supervision Data”
Background
- Transfer learning is not popular in NLP
- Some previous work: transfer learning does not work when
the target is different from the source
Our contribution
- Coarser, sentence-level QA can benefit from transfer learning
from a model trained on large, span-level QA
Transfer learning in QA
SQuAD (span-level QA, Wikipedia domain)
WikiQA (sentence-level QA, Wikipedia domain)
SemEval-2016 (sentence-level QA, community QA)
SICK (RTE)
Transfer learning in QA
WikiQA
Q Who made airbus
C1 Airbus SAS is an aircraft manufacturing subsidiary of EADS, a European aerospace company.
C2 Airbus began as an union of aircraft companies.
C3 Aerospace companies allowed the establishment of a joint-stock company, owned by EADS.
A C1(Yes), C2(No), C3(No)
SemEval2016-task3A
Q I saw an ad, data entry jobs online. It required we give a fee and they promise fixed amount
every month. Is this a scam?
C1 well probably is so i be more careful if i were u. Why you looking for online jobs
C2 SCAM!!!!!!!!!!!!!!!!!!!!!!
C3 Bcoz i got a baby and iam nt intrested to sent him in a day care. thats y iam (...)
A C1(Good), C2(Good), C3(Bad)
Transfer learning in QA
BiDAF outputs the start and end positions of the span.
Context, Query → Embedding layer → Attention layer → Modelling layer → Output layer 1 (Start), Output layer 2 (End)
BiDAF-T outputs a classification result.
Context, Query → Embedding layer → Attention layer → Modelling layer → Pooling + classification → Class
Transfer: BiDAF → BiDAF-T
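The difference between the two models sits in the final layer: both consume the same per-token vectors from the modelling layer, but one head points at positions while the other pools and classifies. The sketch below is a simplified pure-Python illustration, not the authors' implementation. The weights, the toy `hidden` values, and the use of max-pooling followed by a linear classifier are assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def span_head(hidden, w_start, w_end):
    """BiDAF-style output: pick the most probable (start, end) span."""
    p_start = softmax([dot(w_start, h) for h in hidden])
    p_end = softmax([dot(w_end, h) for h in hidden])
    n = len(hidden)
    # Only spans with start <= end are valid.
    return max(
        ((i, j) for i in range(n) for j in range(i, n)),
        key=lambda ij: p_start[ij[0]] * p_end[ij[1]],
    )


def classification_head(hidden, class_weights):
    """BiDAF-T-style output: max-pool over positions, then classify."""
    pooled = [max(h[k] for h in hidden) for k in range(len(hidden[0]))]
    probs = softmax([dot(w, pooled) for w in class_weights])
    return probs.index(max(probs))


# Toy output of the shared layers: 4 tokens, 2 features each (made-up values).
hidden = [[0.1, 0.0], [2.0, 0.1], [0.2, 1.5], [0.0, 0.2]]
span = span_head(hidden, w_start=[1.0, 0.0], w_end=[0.0, 1.0])
label = classification_head(hidden, class_weights=[[-1.0, -1.0], [1.0, 1.0]])
```

Because only the head changes, the embedding, attention and modelling layers trained on the span-level task can be reused as the starting point for the sentence-level task.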
Transfer learning in QA
WikiQA: our results (blue) and previous SOTA (green). We achieve a new SOTA by a large margin.
Model      Score
rank2      74.17
rank1      74.33
SQ* (f)    83.2
SQ (f)     79.9
SQ-T (f)   76.44
SQ         75.19
SQ-T       75.22
None       62.96
SemEval2016-task3A: our results (blue) and previous SOTA (green).
Model      Score
rank2      77.66
rank1      79.19
SQ* (f)    80.2
SQ (f)     78.37
SQ-T (f)   76.3
SQ         57.8
SQ-T       47.23
None       76.4
Transfer learning in QA
SICK: our results (blue and red). We also pretrain the model on SNLI (red).
Previous SOTA (green).
Model              Score
Rank2              84.57
Rank1              86.2
SQuAD* (+ SNLI)    88.22
SQuAD (+ SNLI)     86.63
SQuAD-T (+ SNLI)   85
None (+ SNLI)      83.2
SQuAD*             84.38
SQuAD              82.86
SQuAD-T            81.49
None               77.96
Transfer learning in QA
Transfer learning should work better when
the source is similar to the target. (??)
span-level (SQuAD) → sentence-level (WikiQA etc.)
sentence-level (SQuAD-T) → sentence-level (WikiQA etc.)
Transfer learning in QA
Top: pretrained on SQuAD-T
Bottom: pretrained on SQuAD
Transfer learning in QA
- We achieve SOTA on well-studied QA datasets by simple
transfer learning
- Span-level supervision helps the model learn lexical information better
“Learned in Translation: Contextualized Word Vectors”
- Salesforce, 2017.08
- transfer learning from Translation to Sentiment analysis / classification / RTE / QA
- SOTA in SST-5 & SNLI
Thank you!
Sewon Min
shmsw25@snu.ac.kr
https://shmsw25.github.io
Supplement
Query-reduction network
bAbI QA dataset: requires reasoning, but is synthetic!
Query-reduction network
Top: Avg Error on bAbI QA
Bottom: Avg Error on bAbI dialog
from: https://seominjoon.github.io/assets/slides/1705.naver.slides.pdf
Transfer learning in QA
Dataset        Task                Size
SQuAD          Span-level QA       100K
SQuAD-T        Sentence-level QA   100K
WikiQA         Sentence-level QA   2K
SemEval-2016   Sentence-level QA   2K
SICK           RTE                 10K
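SQuAD-T in the table above is the sentence-level counterpart of SQuAD. One way to derive such data from span annotations, sketched below, is to split the context into sentences and label each sentence by whether it contains the answer span. This is an assumption about the construction rather than a description of the authors' pipeline. The regex sentence splitter is deliberately naive, and `to_sentence_level` is a hypothetical name.

```python
import re


def to_sentence_level(context, answer_start, answer_text):
    """Turn one span-annotated example into (sentence, contains_answer) pairs."""
    answer_end = answer_start + len(answer_text)
    examples = []
    # Naive splitter: a sentence runs up to '.', '?' or '!' plus trailing spaces.
    for match in re.finditer(r"[^.?!]+[.?!]?\s*", context):
        contains = match.start() <= answer_start and answer_end <= match.end()
        examples.append((match.group().strip(), contains))
    return examples


# Toy context adapted from the SQuAD example earlier in the deck.
context = (
    "Southern California, often abbreviated SoCal, is a geographic and "
    "cultural region. It comprises 10 counties."
)
pairs = to_sentence_level(context, context.index("SoCal"), "SoCal")
```

Each span-level example yields one positive sentence and several negatives, so the sentence-level dataset keeps the scale of the original while carrying coarser supervision.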
RTE (Recognizing Textual Entailment)
- determine whether the premise entails, contradicts, or is neutral to
the hypothesis
- a.k.a. NLI (Natural Language Inference)
Neural Speed Reading
Make neural models faster so they can deal with large contexts
Coming Soon!

Step-by-step approach to question answering

  • 1. Step-by-step approach to question answering Sewon Min Seoul National University 2017.08.21 at .
  • 2. Sewon Min - Interested in Natural language understanding with a focus on question answering - Background - Undergraduate in SNU (~2018) - Research Experience in UW (2016~2017) - Publication - Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi, “Neural Speed Reading”. 2017. (Under review) - Sewon Min, Minjoon Seo, Hannaneh Hajishirzi. “Question Answering through Transfer Learning from Large Fine-grained Supervision Data”. ACL. 2017. - Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi. “Query-reduction Networks”. ICLR. 2017
  • 6. Contents Current state in Question Answering Expansion from current state Question Answering through transfer learning (my work)
  • 7. Contents Current state in Question Answering Expansion from current state Question Answering through transfer learning (my work)
  • 8. Question Answering SystemQuestion Answer Context - Structural data (DB) - Text in Natural Language - Dialog History - Web data
  • 9. Question Answering SystemQuestion Answer Context - Structural data (DB) - Text in Natural Language - Dialog History - Web data Context-based Question Answering Machine Comprehension
  • 10. SQuAD Southern California, often abbreviated SoCal, is a geographic and cultural region that generally comprises California's southernmost 10 counties. (…) What is Southern California often abbreviated as? Stanford Question Answering Dataset (2016)
  • 12. SQuAD: Stanford Question Answering Dataset (2016). Models: Match-LSTM (SMU), BiDAF (UW + AI2), DCN (Salesforce), R-Net (Microsoft), AoA Reader (HIT + iFLYTEK), and many others. Performance: - System: EM 55 / F1 68 → EM 78 / F1 85 - Human: EM 82 / F1 91. More information: https://rajpurkar.github.io/SQuAD-explorer/
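The EM and F1 figures on this slide are string-overlap scores between a predicted answer and a gold answer. The following is a minimal sketch of how such scores are commonly computed, assuming lowercasing, punctuation stripping, and English article removal as the normalization; the function names are illustrative, and the official evaluation additionally takes the best score over several gold answers per question.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("SoCal", "socal."))
    print(f1_score("the SoCal region", "SoCal"))
```

F1 gives partial credit when the predicted span is slightly too long or too short, which is why the F1 numbers are consistently higher than EM.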
  • 13. SQuAD Why popular? 1. Domain: Context-based, Wikipedia, Real questions 2. Task: Span-based answer - Closer to real QA than Cloze-style - Easier to evaluate than Free-form 3. Proper difficulty
  • 14. WikiQA (Context Sentence Classification): Who made airbus / Airbus SAS is an aircraft manufacturing subsidiary of EADS, a European aerospace company. Airbus began as an union of aircraft companies …
      CNN/Daily Mail (Cloze Style): ____________ says he understands why @entity0 won’t play at his tournament ... / @entity0 called me personally to let me know that he wouldn’t be playing here at @entity23 ,” @entity3 said on his @entity21 events website...
      MS Marco (Free-form): What energy is used in photosynthesis? / Photosynthesis is a process used by plants and other organisms to convert light energy, normally from the Sun, into chemical energy (…) / Answers: [light energy] [energy of light] [solar energy] [Light energy is used in photosynthesis]
  • 15. SQuAD Proper difficulty (Also limitation) 1. Small-scale context 2. Requiring lexical & syntactic information (paraphrase) 3. Span-based answer
  • 16. SQuAD Southern California, often abbreviated SoCal, is a geographic and cultural region that generally comprises California's southernmost 10 counties. (…) What is Southern California often abbreviated as? What does SoCal stand for? Demo (BiDAF Model): https://allenai.github.io/bi-att-flow/demo/
  • 19. Contents Current state in Question Answering Expansion from current state Question Answering through transfer learning (my work)
  • 20. How to expand the task? 1. Small-scale context → Large-scale context 2. Requiring lexical information → Requiring complex reasoning 3. Span-based answer → Free form answer
  • 21. Large-scale context. Longer context: WikiReading, NewsQA. Multiple contexts: MS Marco, TriviaQA. Open-domain: SearchQA, DrQA. Why challenging? - Cost (time & memory) - More information != better performance. No effective and efficient model yet! Models with hierarchical structure
  • 22. Large-scale context → More data. We have a large amount of data (such as Web data). Approaches: 1. Combination of information retrieval & question answering 2. Unsupervised learning 3. Transfer learning (Diagram labels: Unannotated, Annotated.)
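The first approach, combining information retrieval with question answering, can be sketched as a two-stage pipeline: a cheap retriever narrows many documents down to a few, and a reader runs only on those. The sketch below is an illustration under assumptions, not a method from the talk: the retriever scores documents by shared word counts, and `reader` is a hypothetical callable standing in for any reading-comprehension model.

```python
import re
from collections import Counter


def terms(text):
    """Lowercased word tokens."""
    return re.findall(r"\w+", text.lower())


def retrieve(question, documents, k=1):
    """Return the k documents sharing the most word occurrences with the question."""
    question_counts = Counter(terms(question))

    def score(doc):
        return sum((question_counts & Counter(terms(doc))).values())

    return sorted(documents, key=score, reverse=True)[:k]


def answer(question, documents, reader, k=1):
    """Run the (hypothetical) reader only on the retrieved documents."""
    return [reader(question, doc) for doc in retrieve(question, documents, k)]


if __name__ == "__main__":
    docs = [
        "Airbus SAS is an aircraft manufacturing subsidiary of EADS.",
        "Photosynthesis converts light energy into chemical energy.",
    ]
    # A stand-in reader that just returns the first three words of the document.
    dummy_reader = lambda q, d: " ".join(d.split()[:3])
    print(answer("What energy is used in photosynthesis?", docs, dummy_reader))
```

The reader's cost then depends on k rather than on the size of the whole collection, which addresses the time and memory cost raised on the previous slide.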
  • 23. Complex Reasoning (MCTest). James the Turtle was always getting in trouble. (…) One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home. (…) Where did James go after he went to the grocery store? A) His deck B) His freezer C) A fast food restaurant D) His room
  • 24. Complex Reasoning MCTest (7 years old) Science Questions Dataset (Elementary school) RACE (Middle & High school) Very difficult, not so popular Deep learning models have limitations
  • 25. Free-form Answer (MS Marco). 1. Annotating the gold answer is difficult. What energy is used in photosynthesis? Photosynthesis is a process used by plants and other organisms to convert light energy, normally from the Sun, into chemical energy (…) Answers: [light energy] [energy of light] [solar energy] [Light energy is used in photosynthesis]
  • 26. Free-form Answer. 2. Evaluation is difficult. - We want the answer not to be in the context. - We prefer the full sentence to the single word. - However, it is hard to evaluate. - Incomplete metric (bag-of-words based). Question: What is the capital city of South Korea? Gold answer: The capital city of South Korea is Seoul.
  • 27. Free-form Answer. 2. Evaluation is difficult. - We want the answer not to be in the context. - We prefer the full sentence to the single word. - However, it is hard to evaluate. - Incomplete metric (bag-of-words based). Question: What is the capital city of South Korea? Gold answer: The capital city of South Korea is Seoul. Candidate "Seoul." scores 1/8; candidate "The capital city of South Korea is Tokyo." scores 7/8.
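The 1/8 and 7/8 scores can be reproduced with a simple bag-of-words overlap. The slide does not name the exact metric, so the sketch below assumes one plausible reading: the fraction of gold-answer words that also appear in the candidate. The function names are illustrative.

```python
import re
from collections import Counter
from fractions import Fraction


def tokens(text):
    """Lowercased word tokens, punctuation dropped."""
    return re.findall(r"\w+", text.lower())


def bow_overlap(candidate, gold):
    """Fraction of gold tokens that are matched by the candidate (order ignored)."""
    gold_counts = Counter(tokens(gold))
    if not gold_counts:
        raise ValueError("gold answer has no tokens")
    shared = Counter(tokens(candidate)) & gold_counts
    return Fraction(sum(shared.values()), sum(gold_counts.values()))


if __name__ == "__main__":
    gold = "The capital city of South Korea is Seoul."
    print(bow_overlap("Seoul.", gold))                                     # 1/8
    print(bow_overlap("The capital city of South Korea is Tokyo.", gold))  # 7/8
```

The correct short answer scores far below the fluent but wrong sentence, which is the sense in which the metric is incomplete.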
  • 28. Free-form Answer. 3. Designing a generation model is difficult. (Figure from the WikiReading dataset paper, Hewlett et al.)
  • 29. Free-form Answer. 3. Designing a generation model is difficult. WikiReading: Property instead of Question - instance of, gender, country, date of birth, given name, … Best model’s performance (F1) - Given name: 88.7 - Date of opening: 30.1. Example (property: Country): Folkart Towers are twin skyscrapers in the Bayrakli district of the Turkish city of Izmir. Reaching a structural height of 200 m (656 ft) above ground level, (…)
  • 30. Contents Current state in Question Answering Expansion from current state Question Answering through transfer learning (my work)
  • 31. Transfer learning in QA. “Question Answering through Transfer Learning from Large Fine-grained Supervision Data”. Background - Transfer learning is not popular in NLP - Some previous works: transfer learning does not work when the target is different from the source. Our contribution - Coarser, sentence-level QA can benefit from transferring a model trained on large, span-level QA data
  • 32. Transfer learning in QA SICK (RTE) SemEval-2016 (sentence-level QA, community QA) WikiQA (sentence-level QA, Wikipedia domain) SQuAD (span-level QA, Wikipedia domain)
  • 33. Transfer learning in QA
      WikiQA: Q: Who made airbus / C1: Airbus SAS is an aircraft manufacturing subsidiary of EADS, a European aerospace company. / C2: Airbus began as an union of aircraft companies. / C3: Aerospace companies allowed the establishment of a joint-stock company, owned by EADS. / A: C1(Yes), C2(No), C3(No)
      SemEval2016-task3A: Q: I saw an ad, data entry jobs online. It required we give a fee and they promise fixed amount every month. Is this a scam? / C1: well probably is so i be more careful if i were u. Why you looking for online jobs / C2: SCAM!!!!!!!!!!!!!!!!!!!!!! / C3: Bcoz i got a baby and iam nt intrested to sent him in a day care. thats y iam (...) / A: C1(Good), C2(Good), C3(Bad)
  • 34. Transfer learning in QA
      BiDAF-T: Context, Query → Embedding layer → Attention layer → Modelling layer → Pooling + classification → Class
      BiDAF: Context, Query → Embedding layer → Attention layer → Modelling layer → Output layer 1 (Start), Output layer 2 (End)
      BiDAF outputs the start and end positions of the span. BiDAF-T outputs a classification result. The arrow between the two models is labelled "transfer".
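The difference between the two output heads, and the idea of reusing the layers they share, can be sketched in plain Python. This is an illustration under assumptions rather than the BiDAF implementation: the scores and vectors are hand-written stand-ins for what the modelling layer would produce, max pooling and a single logistic unit are assumed for "Pooling + classification", and the layer names in `transfer_shared_layers` are hypothetical dictionary keys.

```python
import math


def best_span(start_scores, end_scores):
    """Span head: pick (start, end) maximizing start + end score with start <= end."""
    best, best_score, best_start = (0, 0), float("-inf"), 0
    for end in range(len(end_scores)):
        # Track the best start position seen so far (at or before `end`).
        if start_scores[end] > start_scores[best_start]:
            best_start = end
        score = start_scores[best_start] + end_scores[end]
        if score > best_score:
            best_score, best = score, (best_start, end)
    return best


def max_pool(token_vectors):
    """Collapse per-token vectors into one vector by taking the max per dimension."""
    return [max(column) for column in zip(*token_vectors)]


def classify(token_vectors, weights, bias):
    """Classification head: pooled vector → probability of the positive class."""
    pooled = max_pool(token_vectors)
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def transfer_shared_layers(source_params, target_params,
                           shared=("embedding", "attention", "modelling")):
    """Copy the shared layers from the source model; keep the target's own head."""
    merged = dict(target_params)
    for name in shared:
        merged[name] = source_params[name]
    return merged


if __name__ == "__main__":
    print(best_span([0.1, 2.0, 0.3], [0.5, 0.2, 1.5]))
    print(classify([[0.1, -1.0], [0.7, 0.2], [0.3, 0.9]], [1.0, 1.0], -1.0))
```

Only the head changes between the two tasks, so everything below it can be initialized from the model trained on the larger span-level dataset and then fine-tuned.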
  • 35. Transfer learning in QA
      WikiQA: Our results (blue) and previous SOTA (green). We achieve new SOTA with a large gap. Chart values: rank2 74.17, rank1 74.33, SQ* (f) 83.2, SQ (f) 79.9, SQ-T (f) 76.44, SQ 75.19, SQ-T 75.22, None 62.96
      SemEval2016-task3A: Our results (blue) and previous SOTA (green). Chart values: rank2 77.66, rank1 79.19, SQ* (f) 80.2, SQ (f) 78.37, SQ-T (f) 76.3, SQ 57.8, SQ-T 47.23, None 76.4
  • 36. Transfer learning in QA
      SICK: Our results (blue and red). We also pretrain the model on SNLI (red). Previous SOTA (green). Chart values, in the order shown: Rank2 84.57, Rank1 86.2; SQuAD* 88.22, SQuAD 86.63, SQuAD-T 85, None 83.2; SQuAD* 84.38, SQuAD 82.86, SQuAD-T 81.49, None 77.96
  • 37. Transfer learning in QA. Transfer learning should work better when the source is similar to the target. (??) Compare: span-level (SQuAD) → sentence-level (WikiQA etc.) vs. sentence-level (SQuAD-T) → sentence-level (WikiQA etc.)
  • 39. Transfer learning in QA Top: pretrained on SQuAD-T Bottom: pretrained on SQuAD
  • 40. Transfer learning in QA - We achieve SOTA on well-studied QA datasets with simple transfer learning - Span-level supervision helps the model learn lexical information better. Related work: “Learned in Translation: Contextualized Word Vectors” (Salesforce, 2017.08) - Transfer learning from translation to sentiment analysis / classification / RTE / QA - SOTA on SST-5 & SNLI
  • 43. Query-reduction network bAbI QA dataset : require reasoning, but synthetic!
  • 46. Query-reduction network Top: Avg Error on bAbI QA Bottom: Avg Error on bAbI dialog from: https://seominjoon.github.io/assets/slides/1705.naver.slides.pdf
  • 48. Transfer learning in QA
      Datasets (task, size): - SQuAD: span-level QA, 100K - SQuAD-T: sentence-level QA, 100K - WikiQA: sentence-level QA, 2k - SemEval-2016: sentence-level QA, 2k - SICK: RTE, 10k
      RTE (Recognizing Textual Entailment) - Determine if the hypothesis is entailed by / contradicts / is neutral to the premise - a.k.a. NLI (Natural Language Inference)
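SQuAD-T is listed as a sentence-level counterpart of SQuAD with the same size. One way such a dataset could be derived from span annotations is to split each context into sentences and mark the sentence that contains the answer span. The sketch below assumes that construction; the naive regex sentence splitter, the function names, and the second sentence of the example context are all illustrative and not taken from the talk.

```python
import re


def split_sentences(context):
    """Naive splitter returning (character offset, sentence text) pairs."""
    return [(m.start(), m.group())
            for m in re.finditer(r"[^.!?]+[.!?]*", context)
            if m.group().strip()]


def to_sentence_level(context, answer_start, answer_text):
    """Turn one span-level example into (sentence, contains_answer) pairs."""
    answer_end = answer_start + len(answer_text)
    examples = []
    for offset, sentence in split_sentences(context):
        # A sentence is positive if the whole answer span lies inside it.
        contains = offset <= answer_start and answer_end <= offset + len(sentence)
        examples.append((sentence.strip(), contains))
    return examples


if __name__ == "__main__":
    context = ("Southern California, often abbreviated SoCal, is a geographic "
               "and cultural region. This second sentence is a made-up filler.")
    for sentence, label in to_sentence_level(context, context.index("SoCal"), "SoCal"):
        print(label, sentence)
```

The output has the same shape as the WikiQA example (one Yes/No label per candidate sentence), which is what allows the same classification head to be trained on either dataset.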
  • 49. Neural Speed Reading Make neural model faster to deal with large context Coming Soon!