Get To The Point:
Summarization with Pointer-Generator Networks

Abigail See*, Peter J. Liu+, Christopher Manning*
*Stanford NLP  +Google Brain
1st August 2017
Two approaches to summarization

Extractive Summarization
Select parts (typically sentences) of the original text to form a summary.
● Easier
● Too restrictive (no paraphrasing)
● Most past work is extractive

Abstractive Summarization
Generate novel sentences using natural language generation techniques.
● More difficult
● More flexible and human
● Necessary for future progress
CNN / Daily Mail dataset
● Long news articles (average ~800 words)
● Multi-sentence summaries (usually 3 or 4 sentences, average 56 words)
● Summary contains information from throughout the article
Sequence-to-sequence + attention model

[Figure: an encoder RNN reads the source text "Germany emerge victorious in 2-0 win against Argentina on Saturday ...". On each decoder step, the decoder hidden state attends over the encoder hidden states, giving an attention distribution over source positions; the context vector is the attention-weighted sum of the encoder hidden states. Context vector and decoder state together produce a vocabulary distribution, from which the next word (e.g. "beat") is chosen. Starting from <START>, the decoder emits the summary word by word, feeding each emitted word back in as the next input, e.g. "Germany beat Argentina 2-0 <STOP>".]
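The attention step can be sketched numerically. This is a minimal illustration, not the model from the talk: the model uses Bahdanau-style additive attention over LSTM states, while this sketch scores with a plain dot product, and all names here are illustrative.

```python
import math

def attention(decoder_state, encoder_states):
    """One step of (simplified, dot-product) attention."""
    # Alignment score between the decoder state and each encoder hidden state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    # Softmax over scores -> attention distribution over source positions.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn_dist = [e / total for e in exps]
    # Context vector = attention-weighted sum of encoder hidden states.
    context = [sum(a * h[i] for a, h in zip(attn_dist, encoder_states))
               for i in range(len(encoder_states[0]))]
    return attn_dist, context

# Toy 2-d states: the decoder state is most similar to source positions 0 and 2.
attn, ctx = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The attention distribution sums to 1 and concentrates on source positions whose encoder states align with the current decoder state; the context vector then summarizes "what has been read" for this step.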
Two Problems

Problem 1: The summaries sometimes reproduce factual details inaccurately.
e.g. Germany beat Argentina 3-2 (an incorrect rare or out-of-vocabulary word)

Problem 2: The summaries sometimes repeat themselves.
e.g. Germany beat Germany beat Germany beat…
Get to the point!

Solution: Use a pointer to copy words [1][2].

[Figure: for the source "Germany emerge victorious in 2-0 win against Argentina on Saturday ...", the summary "Germany beat Argentina 2-0" is produced by pointing to copy "Germany", "Argentina" and "2-0" from the source, while generating "beat" from the vocabulary.]

Best of both worlds: extraction + abstraction

[1] Incorporating copying mechanism in sequence-to-sequence learning. Gu et al., 2016.
[2] Language as a latent variable: Discrete generative models for sentence compression. Miao and Blunsom, 2016.
Pointer-generator network

[Figure: the seq2seq + attention model, extended so that on each decoder step the attention distribution and the vocabulary distribution are mixed into a final distribution. The model can therefore either generate a word from the vocabulary or copy a word such as "Argentina" or "2-0" directly from the source text, including out-of-vocabulary words.]
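The mixing step follows the paper's formula P(w) = p_gen · P_vocab(w) + (1 − p_gen) · Σ_{i : w_i = w} a_i, where p_gen is a learned generation probability. The sketch below uses plain dicts and lists in place of tensors; the function name and toy numbers are illustrative, not from the released code.

```python
def final_distribution(p_gen, vocab_dist, attn_dist, source_words):
    """Mix generation and copying into the final word distribution.

    In-vocabulary words get p_gen * P_vocab(w); every source word also
    gets (1 - p_gen) * (total attention on its positions), so OOV
    source words become reachable through the copy term alone.
    """
    final = {w: p_gen * p for w, p in vocab_dist.items()}
    for a, w in zip(attn_dist, source_words):
        final[w] = final.get(w, 0.0) + (1.0 - p_gen) * a
    return final

vocab = {"beat": 0.5, "win": 0.3, "a": 0.2}   # toy vocabulary distribution
source = ["germany", "beat", "argentina"]     # "argentina" is out of vocabulary
attn = [0.2, 0.3, 0.5]                        # toy attention distribution
dist = final_distribution(0.8, vocab, attn, source)
# "argentina" now has nonzero probability purely from the pointer.
```

Because both input distributions sum to 1, the mixture also sums to 1; a word like "beat" that is both in the vocabulary and in the source accumulates probability from both terms.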
Improvements

Before: UNK UNK was expelled from the dubai open chess tournament
After:  gaioz nigalidze was expelled from the dubai open chess tournament

Before: the 2015 rio olympic games
After:  the 2016 rio olympic games
Two Problems

Problem 1: The summaries sometimes reproduce factual details inaccurately.
e.g. Germany beat Argentina 3-2
Solution: Use a pointer to copy words.

Problem 2: The summaries sometimes repeat themselves.
e.g. Germany beat Germany beat Germany beat…
Solution: Penalize repeatedly attending to same parts of the source text.
Reducing repetition with coverage

Coverage = cumulative attention = what has been covered so far [4][5][6]

1. Use coverage as extra input to attention mechanism.
2. Penalize attending to things that have already been covered ("don't attend here").

Result: repetition rate reduced to a level similar to human summaries.
However, summaries are still mostly extractive.

[Figure: the final coverage vector visualized over the source text.]

[4] Modeling coverage for neural machine translation. Tu et al., 2016.
[5] Coverage embedding models for neural machine translation. Mi et al., 2016.
[6] Distraction-based neural networks for modeling documents. Chen et al., 2016.
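Both coverage ideas fit in a few lines. The penalty below is the paper's coverage loss, covloss_t = Σ_i min(a_i^t, c_i^t); the helper names and toy distributions are illustrative.

```python
def coverage_update(coverage, attn_dist):
    """Coverage = running sum of all attention distributions so far."""
    return [c + a for c, a in zip(coverage, attn_dist)]

def coverage_loss(attn_dist, coverage):
    """Penalty sum_i min(a_i, c_i): attending again to an already-covered
    source position is penalized, discouraging repetition."""
    return sum(min(a, c) for a, c in zip(attn_dist, coverage))

cov = [0.0, 0.0, 0.0]                    # nothing covered yet
cov = coverage_update(cov, [0.1, 0.8, 0.1])  # step 1 attends mostly to position 1

repeat = coverage_loss([0.1, 0.8, 0.1], cov)  # re-attends to position 1: high penalty
fresh  = coverage_loss([0.8, 0.1, 0.1], cov)  # attends elsewhere: low penalty
```

In the model, coverage is also fed into the attention mechanism as an extra input (point 1 above), so the network can learn to steer attention away from covered positions rather than only being penalized after the fact.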
Results

                                          ROUGE-1  ROUGE-2  ROUGE-L
Nallapati et al. 2016                       35.5     13.3     32.7   ← previous best abstractive result
Ours (seq2seq baseline)                     31.3     11.8     28.8
Ours (pointer-generator)                    36.4     15.7     33.4   ← our improvements
Ours (pointer-generator + coverage)         39.5     17.3     36.4
Paulus et al. 2017 (hybrid RL approach)     39.9     15.8     36.9   ← worse ROUGE; better human eval
Paulus et al. 2017 (RL-only approach)       41.2     15.8     39.1   ← better ROUGE; worse human eval (?)

ROUGE compares the machine-generated summary to the human-written reference summary and counts co-occurrence of 1-grams, 2-grams, and longest common subsequence.
The difficulty of evaluating summarization

● Summarization is subjective
  ○ There are many correct ways to summarize
● ROUGE is based on strict comparison to a reference summary
  ○ Intolerant to rephrasing
  ○ Rewards extractive strategies
● Taking the first 3 sentences as the summary gives higher ROUGE than (almost) any published system
  ○ Partially due to news article structure
First sentences not always a good summary

[Example: a news article about robots tested in Japanese companies. The opening sentences are irrelevant to the main point; our system starts further down.]

Irrelevant opening:
"A crowd gathers near the entrance of Tokyo's upscale Mitsukoshi Department Store, which traces its roots to a kimono shop in the late 17th century. Fitting with the store's history, the new greeter wears a traditional Japanese kimono while delivering information to the growing crowd, whose expressions vary from amusement to bewilderment. It's hard to imagine the store's founders in the late 1600's could have imagined this kind of employee."

Our system starts here:
"That's because the greeter is not a human -- it's a robot. Aiko Chihira is an android manufactured by Toshiba, designed to look and move like a real person. ..."
What next?

[Figure: a map. "Extractive methods" sit in SAFETY. Between them and "Human-level summarization" lie the SWAMP OF BASIC ERRORS (repetition, copying errors, nonsense) and MOUNT ABSTRACTION (paraphrasing, long-text understanding). RNNs carry us across the swamp and part-way up the mountain; reaching the top may need more high-level understanding, more scalability, and better metrics.]
Thank you!
Blog post: www.abigailsee.com
Code: github.com/abisee/pointer-generator

Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Micromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersMicromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersChitralekhaTherkar
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docxPoojaSen20
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 

Recently uploaded (20)

Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Micromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of PowdersMicromeritics - Fundamental and Derived Properties of Powders
Micromeritics - Fundamental and Derived Properties of Powders
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 

Abigail See - 2017 - Get To The Point: Summarization with Pointer-Generator Networks

  • 1. Get To The Point: Summarization with Pointer-Generator Networks. Abigail See* , Peter J. Liu+ , Christopher Manning* (*Stanford NLP, +Google Brain). 1st August 2017.
  • 2. Two approaches to summarization. Extractive summarization: select parts (typically sentences) of the original text to form a summary — easier; too restrictive (no paraphrasing); most past work is extractive. Abstractive summarization: generate novel sentences using natural language generation techniques — more difficult; more flexible and human; necessary for future progress.
  • 3. CNN / Daily Mail dataset: long news articles (average ~800 words); multi-sentence summaries (usually 3 or 4 sentences, average 56 words); the summary contains information from throughout the article.
  • 4. Sequence-to-sequence + attention model. [Diagram: encoder hidden states over the source text "Germany emerge victorious in 2-0 win against Argentina on Saturday ..."; decoder hidden states over the partial summary; the attention distribution and the context vector (a weighted sum of encoder states) feed the vocabulary distribution, which produces "beat".]
  • 5. Sequence-to-sequence + attention model. [Diagram: decoding continues; "Germany" is fed back into the decoder and "beat" is generated.]
  • 6. Sequence-to-sequence + attention model. [Diagram: the complete decoded summary "Germany beat Argentina 2-0 <STOP>".]
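The attention step on slides 4–6 can be sketched in a few lines of NumPy. This is a minimal standalone illustration of additive (Bahdanau-style) attention with randomly initialized weights, not the authors' implementation; the variable names `W_h`, `W_s`, `v` follow the paper's notation for the attention parameters.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_step(encoder_states, decoder_state, W_h, W_s, v):
    """One decoder step of additive attention.

    encoder_states: (src_len, hidden); decoder_state: (hidden,).
    Returns the attention distribution over source positions and the
    context vector (the attention-weighted sum of encoder states).
    """
    # e_i = v^T tanh(W_h h_i + W_s s_t) for each source position i
    scores = np.tanh(encoder_states @ W_h + decoder_state @ W_s) @ v
    attn = softmax(scores)           # attention distribution
    context = attn @ encoder_states  # context vector
    return attn, context

rng = np.random.default_rng(0)
src_len, hidden = 9, 4
enc = rng.normal(size=(src_len, hidden))   # one state per source token
dec = rng.normal(size=(hidden,))           # current decoder state
W_h = rng.normal(size=(hidden, hidden))
W_s = rng.normal(size=(hidden, hidden))
v = rng.normal(size=(hidden,))

attn, context = attention_step(enc, dec, W_h, W_s, v)
```

The context vector is then concatenated with the decoder state to produce the vocabulary distribution shown in the diagram.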
  • 7. Two problems. Problem 1: the summaries sometimes reproduce factual details inaccurately, e.g. "Germany beat Argentina 3-2" (an incorrect rare or out-of-vocabulary word). Problem 2: the summaries sometimes repeat themselves, e.g. "Germany beat Germany beat Germany beat…".
  • 8. Solution to Problem 1: use a pointer to copy words.
  • 9. Get to the point! Best of both worlds: extraction + abstraction. [Diagram: from the source text "Germany emerge victorious in 2-0 win against Argentina on Saturday ...", the summary "Germany beat Argentina 2-0" is built by pointing to copy "Germany", "Argentina", and "2-0", and generating "beat".] [1] Incorporating copying mechanism in sequence-to-sequence learning. Gu et al., 2016. [2] Language as a latent variable: Discrete generative models for sentence compression. Miao and Blunsom, 2016.
  • 10. Pointer-generator network. [Diagram: as before, encoder hidden states over the source and decoder hidden states over the partial summary produce an attention distribution and context vector; the vocabulary distribution and the attention distribution are combined into a final distribution, from which words such as "Argentina" and "2-0" can be produced by copying from the source.]
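The mixture on slide 10 can be written down directly. This is a toy standalone sketch of the final-distribution computation, P(w) = p_gen · P_vocab(w) + (1 − p_gen) · Σ_{i : w_i = w} a_i; the function name and the extended-vocabulary indexing scheme here are illustrative, not the paper's code.

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, extended_vocab_size):
    """Mix the generator's vocabulary distribution with the pointer's
    attention distribution. Source OOV words get ids >= len(vocab_dist)
    in the extended vocabulary, so they can only be produced by copying."""
    final = np.zeros(extended_vocab_size)
    final[: len(vocab_dist)] = p_gen * vocab_dist
    for i, w in enumerate(src_ids):
        # scatter-add copy probability onto each source word's id
        final[w] += (1.0 - p_gen) * attn_dist[i]
    return final

# Toy example: a 5-word vocabulary; the third source token is OOV (id 5).
vocab_dist = np.array([0.1, 0.2, 0.3, 0.2, 0.2])
attn_dist = np.array([0.5, 0.2, 0.3])
src_ids = [1, 3, 5]
p = final_distribution(0.8, vocab_dist, attn_dist, src_ids, 6)
```

Because the OOV word receives probability only through the attention term, the model can emit words like "gaioz nigalidze" that are outside its fixed vocabulary.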
  • 11. Improvements. Before: "UNK UNK was expelled from the dubai open chess tournament" → After: "gaioz nigalidze was expelled from the dubai open chess tournament". Before: "the 2015 rio olympic games" → After: "the 2016 rio olympic games".
  • 12. Two problems (recap). Problem 1: the summaries sometimes reproduce factual details inaccurately, e.g. "Germany beat Argentina 3-2". Solution: use a pointer to copy words.
  • 13. Problem 2: the summaries sometimes repeat themselves, e.g. "Germany beat Germany beat Germany beat…". Solution: penalize repeatedly attending to the same parts of the source text.
  • 14. Reducing repetition with coverage. Coverage = cumulative attention = what has been covered so far. [4] Modeling coverage for neural machine translation. Tu et al., 2016. [5] Coverage embedding models for neural machine translation. Mi et al., 2016. [6] Distraction-based neural networks for modeling documents. Chen et al., 2016.
  • 15. Step 1: use coverage as extra input to the attention mechanism.
  • 16. Step 2: penalize attending to things that have already been covered ("don't attend here").
  • 17. Result: the repetition rate is reduced to a level similar to human summaries.
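The coverage penalty on slides 14–17 is easy to state concretely: the coverage vector at step t is the sum of all previous attention distributions, and the loss adds the elementwise minimum of the current attention and the coverage. A minimal sketch (the real loss is added to the training objective with a weighting hyperparameter, omitted here):

```python
import numpy as np

def coverage_loss(attn_steps):
    """Sum over decoder steps of sum_i min(a_t[i], c_t[i]), where the
    coverage c_t is the cumulative attention from steps before t.
    Re-attending to already-covered source positions is penalized."""
    coverage = np.zeros_like(attn_steps[0])
    loss = 0.0
    for attn in attn_steps:
        loss += np.minimum(attn, coverage).sum()
        coverage += attn  # update coverage with this step's attention
    return loss

# Repeatedly attending to position 0 (the repetition failure mode)
# incurs a much larger penalty than attention spread over the source.
repetitive = [np.array([0.9, 0.1, 0.0])] * 3
spread = [np.array([0.9, 0.1, 0.0]),
          np.array([0.0, 0.8, 0.2]),
          np.array([0.1, 0.0, 0.9])]
```

Minimizing this term pushes the model to attend to each source position at most once over the whole summary, which is why repetition drops.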
  • 18. Summaries are still mostly extractive. [Diagram: the final coverage is concentrated on a small portion of the source text.]
  • 19. Results. ROUGE compares the machine-generated summary to the human-written reference summary and counts co-occurrence of 1-grams, 2-grams, and the longest common subsequence. Previous best abstractive result: Nallapati et al. 2016 — ROUGE-1 35.5, ROUGE-2 13.3, ROUGE-L 32.7.
  • 20. Results (our improvements, ROUGE-1 / ROUGE-2 / ROUGE-L). Ours (seq2seq baseline): 31.3 / 11.8 / 28.8. Ours (pointer-generator): 36.4 / 15.7 / 33.4. Ours (pointer-generator + coverage): 39.5 / 17.3 / 36.4.
  • 21. Results (comparison). Paulus et al. 2017 (hybrid RL approach): 39.9 / 15.8 / 36.9 — worse ROUGE; better human eval. Paulus et al. 2017 (RL-only approach): 41.2 / 15.8 / 39.1 — better ROUGE; worse human eval.
  • 22. Results (full table, ROUGE-1 / ROUGE-2 / ROUGE-L). Nallapati et al. 2016: 35.5 / 13.3 / 32.7. Ours (seq2seq baseline): 31.3 / 11.8 / 28.8. Ours (pointer-generator): 36.4 / 15.7 / 33.4. Ours (pointer-generator + coverage): 39.5 / 17.3 / 36.4. Paulus et al. 2017 (hybrid RL): 39.9 / 15.8 / 36.9. Paulus et al. 2017 (RL-only): 41.2 / 15.8 / 39.1. [The "?" on the slide marks how our model compares in human eval as an open question.]
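The ROUGE-N computation behind these tables can be sketched in a few lines. This is a minimal recall-only illustration, not the official metric: real ROUGE scoring (e.g. the `rouge-score` package) adds stemming, F-measure variants, and ROUGE-L's longest-common-subsequence computation.

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of the reference's n-grams that appear in the candidate
    (counted with multiplicity via multiset intersection)."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())
    return overlap / max(sum(ref.values()), 1)

ref = "germany beat argentina 2-0"
exact = rouge_n_recall("germany beat argentina 2-0", ref, n=2)
shuffled = rouge_n_recall("argentina beat germany 2-0", ref, n=2)
```

Note that the shuffled candidate contains every reference unigram yet scores zero on bigrams — ROUGE rewards reproducing the reference's exact word order, which is part of why it favors extractive strategies.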
  • 23. The difficulty of evaluating summarization. Summarization is subjective: there are many correct ways to summarize.
  • 24. ROUGE is based on strict comparison to a reference summary: it is intolerant to rephrasing and rewards extractive strategies.
  • 25. Taking the first 3 sentences as the summary achieves higher ROUGE than (almost) any published system — partially due to news article structure.
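The lead-3 baseline mentioned on slide 25 is trivial to implement, which is exactly what makes its strong ROUGE scores so telling. A naive sketch (a real implementation would use a proper sentence tokenizer rather than this regex split):

```python
import re

def lead_3(article):
    """Take the first three sentences of the article as the summary.
    Sentences are split naively at whitespace following ., !, or ?."""
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return " ".join(sentences[:3])

article = ("Germany beat Argentina 2-0 on Saturday. The match was in Berlin. "
           "Fans celebrated all night. Police reported no incidents.")
summary = lead_3(article)
```

Because news articles front-load their key facts, this zero-learning baseline already overlaps heavily with reference summaries, inflating its ROUGE relative to abstractive systems.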
  • 26. First sentences are not always a good summary. Example article (headline: robots tested in Japan companies; the slide marks the opening as "irrelevant" and notes "our system starts here" at the final sentence): "A crowd gathers near the entrance of Tokyo's upscale Mitsukoshi Department Store, which traces its roots to a kimono shop in the late 17th century. Fitting with the store's history, the new greeter wears a traditional Japanese kimono while delivering information to the growing crowd, whose expressions vary from amusement to bewilderment. It's hard to imagine the store's founders in the late 1600's could have imagined this kind of employee. That's because the greeter is not a human -- it's a robot. Aiko Chihira is an android manufactured by Toshiba, designed to look and move like a real person. ..."
  • 30. [Diagram: the path to human-level summarization climbs "Mount Abstraction" via long-text understanding and paraphrasing, past the "Swamp of Basic Errors" (repetition, copying errors, nonsense); extractive methods stay in the "Safety" zone at the bottom.]
  • 31. [Diagram, continued: RNNs have climbed partway up Mount Abstraction; reaching the summit may require more high-level understanding, more scalability, and better metrics.]
  • 32. Thank you! Blog post: www.abigailsee.com Code: github.com/abisee/pointer-generator