 Performance comparison
 Performance over long text input
The proposed models showed robustness in handling long articles
 Evaluation in the real world supports the validity of the dataset / approach
- Manual validation
- Surveys of AMT workers
Detecting Incongruity Between News Headline and Body Text
via a Deep Hierarchical Encoder
Seunghyun Yoon*, Kunwoo Park*, Joongbo Shin, Hongjun Lim, Seungpil Won, Meeyoung Cha and Kyomin Jung
Seoul National University, KAIST
Full paper in the emerging track on Artificial Intelligence for Social Impact (oral presentation on 29 Jan, 14:00-15:30)
Research Problem
 Detecting incongruity between news headline and body text
(i.e., cases where a news headline does not correctly represent the story, as in
advertisements, clickbait, fake news, hijacked stories, etc.)
 Why is it important?
- People are less likely to read the whole article and often read only the headline
- News headlines play an important role in forming readers' first impressions,
which persist even after reading the whole article
Most news is shared on online platforms based on headlines
 Previous efforts
- Public competitions aiming to facilitate research on developing algorithms
(e.g., Fake News Challenge in 2017)
- Feature-based approaches to detecting ambiguous headlines (Wei et al. 2017)
No sophisticated model could be proposed or fully evaluated
due to the lack of an available research corpus
 Proposed approaches
- Attentive Hierarchical Dual Encoder (AHDE): encodes the full news article
from the word level to the paragraph level with an attention mechanism
- Independent Paragraph (IP) Method: splits the body text into paragraphs and
learns the relationship between each paragraph and the headline
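The IP method can be illustrated with a minimal sketch: split the body into paragraphs, score each paragraph against the headline, and combine the per-paragraph scores. Everything named here is hypothetical. `word_overlap_score` is a toy stand-in for a learned model, the blank-line paragraph split is an assumption, and taking the maximum as the article-level score is an assumption that the poster does not confirm.

```python
def split_paragraphs(body_text):
    """Split a body text into paragraphs on blank lines (assumed delimiter)."""
    return [p.strip() for p in body_text.split("\n\n") if p.strip()]


def word_overlap_score(headline, paragraph):
    """Toy stand-in for a learned model: Jaccard overlap of lowercase tokens."""
    h = set(headline.lower().split())
    p = set(paragraph.lower().split())
    if not h or not p:
        return 0.0
    return len(h & p) / len(h | p)


def incongruity_by_paragraph(headline, body_text, score_fn=word_overlap_score):
    """Score every (headline, paragraph) pair independently.

    Returns (article_score, per_paragraph_scores). The article score is the
    maximum paragraph incongruity, which is an assumed aggregation rule.
    """
    paragraphs = split_paragraphs(body_text)
    scores = [1.0 - score_fn(headline, p) for p in paragraphs]
    article_score = max(scores) if scores else 0.0
    return article_score, scores
```

In a real system `score_fn` would be replaced by a trained headline/paragraph model; the point of the sketch is only that each paragraph is judged on its own.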
Experimental Results
Dataset Generation
 Data source
- Main dataset: a crawl of a near-complete set of news articles published in South
Korea (period: January 2017 – October 2017)
- Supplementary dataset: 136K news articles in English (Horne et al. 2018)
 Two types of dataset
- Whole-corpus: Inject paragraphs that are negatively sampled from other
articles into an original article so that the headline becomes incongruent
- Paragraph-corpus: Transform a pair of headline and body text into multiple
sub-pairs of headline and paragraph
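The two corpus types above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' generation pipeline: the function names, the random insertion positions, the number of injected paragraphs, and the per-paragraph labeling rule (an injected paragraph counts as incongruent) are all hypothetical.

```python
import random


def inject_paragraphs(body, other_bodies, n_inject=1, rng=None):
    """Whole-corpus style sample: insert paragraphs drawn from other articles.

    `body` and each item of `other_bodies` are lists of paragraph strings.
    Returns a list of (paragraph, is_injected) tuples.
    """
    rng = rng or random.Random()
    pool = [p for other in other_bodies for p in other]
    injected = rng.sample(pool, n_inject)
    result = [(p, False) for p in body]
    for paragraph in injected:
        # Random position; inserting keeps the original paragraphs in order.
        position = rng.randint(0, len(result))
        result.insert(position, (paragraph, True))
    return result


def to_paragraph_pairs(headline, labeled_body):
    """Paragraph-corpus style samples: one (headline, paragraph, label) per paragraph.

    The label is 1 for an injected paragraph and 0 otherwise (assumed rule).
    """
    return [(headline, p, int(is_injected)) for p, is_injected in labeled_body]
```

A usage example: `to_paragraph_pairs("H", inject_paragraphs(["a1", "a2"], [["b1"]]))` yields three sub-pairs, exactly one of which is labeled 1.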
Methodology
 Baseline approaches
- XGBoost: the winning model in Fake News Challenge (FNC-1)
- Recurrent Dual Encoder (RDE): independently learns sequential patterns of
headline and body text via dual RNNs
- Convolutional Dual Encoder (CDE): independently learns textual patterns of
headline and body text via dual CNNs
Passage encoding [equations lost in extraction]
Attention [equations lost in extraction; surviving terms: tanh, ∑]
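The surviving fragments (a tanh term and summations) are consistent with additive attention over paragraph encodings, so the sketch below shows one common form of that technique. It is an assumption, not a reconstruction of the poster's equations: the score `v . tanh(W_p p_i + W_h h)`, the parameter names `w_p`, `w_h`, `v`, and the softmax normalization are all illustrative choices.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def attention_pool(paragraphs, headline, w_p, w_h, v):
    """Additive attention over paragraph vectors, conditioned on the headline.

    score_i = v . tanh(W_p p_i + W_h h); weights = softmax(scores);
    pooled = sum_i weights_i * p_i. Returns (pooled, weights).
    """
    h_proj = matvec(w_h, headline)
    scores = []
    for p in paragraphs:
        p_proj = matvec(w_p, p)
        hidden = [math.tanh(a + b) for a, b in zip(p_proj, h_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(paragraphs[0])
    pooled = [sum(w * p[d] for w, p in zip(weights, paragraphs)) for d in range(dim)]
    return pooled, weights
```

The pooled vector plays the role of a single body representation in which paragraphs more relevant to the headline carry more weight.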
Without IP: AHDE was the best
With IP: AHDE was (still) the best
Enjoy more advantages compared to hierarchical models
Summary of contributions
- RELEASE a million-scale dataset for research on misinformation
- PROPOSE two neural networks that efficiently learn the textual relationship
between headline and body text
- SHOW that the models trained on the released corpus achieve decent performance
on the real-world dataset
The code, data, and paper are available at http://david-yoon.github.io
