International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 4 Issue 2, February 2020 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD30014 | Volume – 4 | Issue – 2 | January-February 2020 Page 512
Fake News Detection using Machine Learning
Prasanth. K1, Praveen. N1, Vijay. S1, Auxilia Osvin Nancy. V2
1B.Tech Student, IT, SAEC, Chennai, Tamil Nadu, India
2Assistant Professor, IT, SAEC, Chennai, Tamil Nadu, India
ABSTRACT
Recently, fake news has been causing many problems for our society. As a
result, many researchers have been working on identifying fake news. Most of
the fake news detection systems utilize the linguistic features of the news.
However, they have difficulty detecting highly ambiguous fake news, which
can be identified only after understanding its meaning and the latest related
information. In this paper, to resolve this problem, we present a new Korean
fake news detection system using a Fact DB, which is built and updated by
direct human judgement after collecting obvious facts. Our system receives a
proposition and searches for semantically related articles in the Fact DB in
order to verify whether the given proposition is true by comparing the
proposition with the related articles. To achieve this, we utilize a
deep learning model, Bilateral Multi-Perspective Matching for Natural
Language Sentences (BiMPM), which has demonstrated good performance on
the sentence matching task. However, BiMPM has some limitations:
the longer the input sentence is, the lower its performance, and it
has difficulty making an accurate judgement when an unlearned word or
relation between words appears. To overcome these limitations, we
propose a new matching technique that exploits article abstraction as well
as an entity matching set in addition to BiMPM. In our experiments, we show
that our system improves the overall performance of fake news detection.
KEYWORDS: Fake news detection, Sentence matching, Natural Language
Processing, Deep learning, BiLSTM model, Machine Learning
How to cite this paper: Prasanth. K | Praveen. N | Vijay. S | Auxilia Osvin
Nancy. V "Fake News Detection using Machine Learning" Published in
International Journal of Trend in Scientific Research and Development
(ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-2, February 2020, pp.512-514,
URL: www.ijtsrd.com/papers/ijtsrd30014.pdf
Copyright © 2019 by author(s) and International Journal of Trend in Scientific
Research and Development Journal. This is an Open Access article distributed
under the terms of the Creative Commons Attribution License (CC BY 4.0)
(http://creativecommons.org/licenses/by/4.0)
INTRODUCTION
Fake news has been causing many problems for our society.
Fake news is news with which the writer deliberately intends to mislead
readers in order to advance his/her political or economic interests [1].
With the generation of a huge volume of internet news and social media,
it has become much more difficult to identify fake news personally.
Recently, many researchers have worked on fake news detection systems
that automatically determine whether any opinion claimed in an
article contains fake content [2]. Broadly, this research takes two forms:
methods that connect the linguistic patterns of news to deception, and
methods that verify deception by utilizing external knowledge [3]. The
first approach can quickly verify fake news at a low cost.
However, in order to detect clever fake news, it is necessary
to grasp the semantic content of the article rather than
partial patterns, and to verify it against external facts updated
by humans. Therefore, we search the Fact DB for articles related to the
input proposition, and develop a fake news detection system that verifies
whether the retrieved articles and the proposition are semantically related.
Related Works:
Recently, as deep learning in the NLP field has
developed, various types of sentence matching
techniques have been introduced. We review the related
research on sentence matching techniques, dividing
the works into unsupervised learning based and supervised
learning based works.
A. Unsupervised Learning:
One of the most important elements in sentence
matching is how a word is represented as a data
structure. The traditional way of representing words is the one-hot
encoding vector. However, this method requires a large number of
dimensions to represent a single word, and cannot express the
relations between words. To overcome these shortcomings,
the word-to-vector (word2vec) [5] method was proposed,
which maps significant information into a vector of fixed
dimensions. word2vec learns weights that
increase the probability that nearby words appear
around a given main word, and uses the corresponding weights as the word
vector. As an extension of word2vec, sentence-to-vector (sent2vec) has
also been proposed.
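The contrast between one-hot vectors and dense vectors can be illustrated with a minimal sketch. The toy vocabulary and the two-dimensional dense vectors below are hand-picked assumptions for illustration only; they are not learned by word2vec and do not come from the paper.

```python
import math

def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy vocabulary (assumption): one dimension per word in the one-hot scheme.
vocab = ["king", "queen", "apple"]
one_hot_vecs = {w: one_hot(i, len(vocab)) for i, w in enumerate(vocab)}

# Hand-picked dense vectors (assumption), standing in for learned embeddings.
dense_vecs = {"king": [0.9, 0.1], "queen": [0.8, 0.2], "apple": [0.1, 0.9]}

# Any two distinct one-hot vectors are orthogonal, so no relation is expressed.
sim_one_hot = cosine(one_hot_vecs["king"], one_hot_vecs["queen"])

# Dense vectors can place related words closer than unrelated ones.
sim_dense_related = cosine(dense_vecs["king"], dense_vecs["queen"])
sim_dense_unrelated = cosine(dense_vecs["king"], dense_vecs["apple"])
```

The one-hot dimension grows with the vocabulary size, while the dense dimension stays fixed regardless of how many words are added.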
B. Supervised Learning:
Recently, research on machine comprehension has
developed with the attention mechanism and BiLSTM. LSTM
resolved the vanishing gradient problem of the Recurrent Neural
Network (RNN) by adding to the cell a layer that forgets past
information and remembers current information.
Since LSTM handles sequential inputs, it is often used for
encoding and decoding sentences. However, as the input
sequence becomes longer, the model loses information
and tends to remember only the later part of the input.
Therefore, researchers have worked on improving the performance
of the existing LSTM through the attention mechanism, which
selectively recalls important information. They also
showed that using BiLSTM together with attention can improve performance.
The Bi-Directional Attention Flow (BiDAF) [7] minimizes the
loss of information by applying the attention mechanism at
each time step of the LSTM. In particular, BiMPM [8] applied
BiLSTM and the attention mechanism to sentence matching.
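The idea of selectively recalling important information can be sketched with generic dot-product attention. This is an illustrative assumption, not the specific formulation used in BiDAF or BiMPM; the hidden states and the query below are toy values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, states):
    """Dot-product attention: weight each state by its similarity to the query."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    # The context vector is the attention-weighted sum of all states.
    context = [sum(w * state[d] for w, state in zip(weights, states))
               for d in range(dim)]
    return weights, context

# Toy hidden states for three time steps (assumption), e.g. encoder outputs.
states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [1.0, 0.0]
weights, context = attend(query, states)
```

Because every time step contributes to the context vector in proportion to its weight, information from early steps is not lost simply because the sequence is long.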
Proposed System:
This section discusses the proposed solution for fake news
detection, which combines fake news detection with sentiment analysis.
The proposed solution is shown in Fig. 1. It consists of the following
steps:
Step 1 : A merged data set was prepared from different
data sets, namely the Politifact, Kaggle and Emergent
datasets.
Step 2 : Different text preprocessing techniques were applied, such as
bigrams (series of two words taken from a given
text), trigrams (continuous series of three words
taken from a given text), CountVectorizer (count of
terms in a vector/text) and the term frequency-inverse
document frequency (tf-idf) vectorizer.
Step 3 : We used the tf-idf vectorizer on a Twitter dataset
along with cosine similarity to build our vocabulary.
Then a Naive Bayes classifier was used to predict the
sentiment of the news statements of the test data set (Merged
data set), as shown in Fig. 2.
Step 4 : We added additional columns to the Merged data set:
tf-idf scores, sentiments and cosine similarity scores.
Step 5 : The training model was built using Naive Bayes and
Random Forest (train-test ratio: 3:1).
Step 6 : Performance is evaluated and compared using
accuracy.
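Steps 5 and 6 can be sketched as follows, assuming a multinomial Naive Bayes classifier with add-one smoothing written in plain Python. The labelled statements are hypothetical toy data, the split is a simple unshuffled 3:1 cut, and Random Forest is omitted for brevity; none of these details are specified in the paper.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train_nb(samples):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, samples):
    correct = sum(predict_nb(model, text) == label for text, label in samples)
    return correct / len(samples)

# Hypothetical toy statements (assumption), not taken from any real data set.
data = [
    ("scientists confirm vaccine is safe", "real"),
    ("shocking miracle cure doctors hate", "fake"),
    ("government publishes annual budget report", "real"),
    ("aliens secretly control the shocking election", "fake"),
    ("university study confirm climate report", "real"),
    ("miracle pill melts fat secretly overnight", "fake"),
    ("official report confirm budget is safe", "real"),
    ("shocking miracle secretly revealed", "fake"),
]

split = len(data) * 3 // 4          # 3:1 train-test ratio
train, test = data[:split], data[split:]
model = train_nb(train)
test_accuracy = accuracy(model, test)
```

On real data the samples would normally be shuffled before splitting, and accuracy would be reported for each classifier so that the two can be compared.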
Steps 2 to 4 of the proposed solution serve as
preprocessing. The tf-idf vectorizer, together with cosine
similarity, is used to tokenize a collection of text
documents and to build a vocabulary of known
words. New documents are then encoded using that
vocabulary. The encoded vector has the length of
the entire vocabulary (bag of words), with an integer count
of the number of times each word appears in the
document.
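The preprocessing described above can be sketched in plain Python: n-gram extraction, a bag-of-words count vector over a fixed vocabulary, tf-idf weighting and cosine similarity. The three example sentences are hypothetical, and the smoothed idf formula is one common variant chosen here as an assumption; the paper does not state which variant was used.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    return {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}

def count_vector(doc, vocab):
    """Bag-of-words vector: one integer count per vocabulary word."""
    counts = Counter(t for t in doc if t in vocab)
    return [counts[tok] for tok in vocab]  # dict keeps the sorted insertion order

def tfidf_vectors(docs, vocab):
    """Weight counts by a smoothed inverse document frequency (assumed variant)."""
    n_docs = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    return [[c * idf[t] for c, t in zip(count_vector(d, vocab), vocab)]
            for d in docs]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical example statements (assumption).
docs = [s.split() for s in [
    "the senator denied the claim",
    "the senator confirmed the claim",
    "heavy rain flooded the city",
]]

vocab = build_vocabulary(docs)
counts = count_vector(docs[0], vocab)
bigrams = ngrams(docs[0], 2)
vectors = tfidf_vectors(docs, vocab)
sim_close = cosine(vectors[0], vectors[1])
sim_far = cosine(vectors[0], vectors[2])
```

Here `counts` has one entry per vocabulary word, and the first two statements score a higher cosine similarity with each other than with the third because they share more terms.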
System Evaluation:
We evaluate the performance of the proposed system in this
section. Given an article relevant to the input proposition,
the evaluation verifies the ability to determine whether the
semantic content of the input proposition can be found in the
relevant article. We train BiMPM, which is the foundation
of our system, and the experiments measure how much the
performance improves when the previously proposed modules are
added. We first build a data set to train
BiMPM to output true or false when given a proposition and a short
article consisting of three or four sentences. In
constructing the dataset, the following policy is
applied:
1. Extract one sentence from a short article, and use it as
an input proposition.
2. Generate true data through thesaurus-based variations,
changes of word order, and omission of
some content.
3. Distort some information such as numbers, nouns, and
verbs, or omit words, to generate false data.
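The policy above can be sketched as a small generator of labelled (article, proposition, label) triples. The synonym table, the rule of distorting only the first number, and the example article are all hypothetical simplifications; the paper does not describe its actual generation procedure in this detail.

```python
import re

# Hypothetical thesaurus (assumption): maps a word to a meaning-preserving synonym.
SYNONYMS = {"said": "stated", "big": "large"}

def make_true_variant(sentence):
    """Paraphrase by synonym substitution so the meaning is preserved."""
    return " ".join(SYNONYMS.get(w, w) for w in sentence.split())

def make_false_variant(sentence):
    """Distort the first number in the sentence so the claim no longer holds."""
    def bump(match):
        return str(int(match.group()) + 1)
    return re.sub(r"\d+", bump, sentence, count=1)

def build_examples(article_sentences):
    """Create (article, proposition, label) triples from a short article."""
    article = " ".join(article_sentences)
    examples = []
    for sentence in article_sentences:
        examples.append((article, make_true_variant(sentence), True))
        distorted = make_false_variant(sentence)
        if distorted != sentence:  # keep only sentences that were really changed
            examples.append((article, distorted, False))
    return examples

# Hypothetical two-sentence article (assumption).
article = ["The mayor said 3 schools will open", "The big bridge closed on Monday"]
examples = build_examples(article)
```

Word-order changes, omissions, and distortion of nouns or verbs would need additional rules or linguistic tools and are left out of this sketch.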
Fig.2. Training accuracy of test set for each epoch.
The sentences used in the new test set are longer and
contain new words that are not in the previous dataset.
Results are plotted with the True Positive Rate (TPR) on the y-axis and
the False Positive Rate (FPR) on the x-axis, i.e., as an ROC curve.
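For reference, TPR and FPR at each decision threshold can be computed as in the following minimal sketch. The labels and scores are toy values chosen for illustration, not results from the paper.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, using each distinct score as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Toy data (assumption): 1 means the proposition is true, 0 means false.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
```

A curve that rises quickly toward the top-left corner, with an area close to 1, indicates that true propositions tend to receive higher scores than false ones.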
Conclusion:
In this paper, we have proposed a fake news detection
system using machine learning, which is built and updated by
direct human judgement. Our system receives a
proposition as an input to verify, and searches for related
articles so that it can verify whether the articles retrieved for the
entered proposition semantically agree with the
proposition. To achieve this, we utilized BiMPM, which is a
deep learning model for sentence matching, together with machine
learning. However, even though BiMPM has shown good
performance on various datasets, it has some limitations:
the longer the input sentence is, the lower
its performance, and it has difficulty making an accurate
judgement when an unlearned word or relation between
words appears. To overcome these limitations, we have
presented a new matching technique that makes use of
article abstraction as well as an entity matching set in addition to
BiMPM. In our experiments, we have shown that our system
improves the overall performance of fake news detection.
References:
[1] H. Allcott and M. Gentzkow, “Social media and fake
news in the 2016 election,” Journal of Economic
Perspectives, vol. 31, no. 2, pp. 211–36, 2017.
[2] N. J. Conroy, V. L. Rubin, and Y. Chen, “Automatic
deception detection: Methods for finding fake news,” in
Proceedings of the 78th ASIS&T Annual Meeting:
Information Science with Impact: Research in and for
the Community, p. 82, American Society for
Information Science, 2015.
[3] V. L. Rubin, Y. Chen, and N. J. Conroy, “Deception
detection for news: three types of fakes,” in Proceedings
of the 78th ASIS&T Annual Meeting: Information
Science with Impact: Research in and for the
Community, p. 83, American Society for Information
Science, 2015.
[4] S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning, “A
large annotated corpus for learning natural language
inference,” arXiv preprint arXiv:1508.05326, 2015.
[5] M. Potthast, S. Köpsel, B. Stein, and M. Hagen, “Click
bait detection,” in Advances in Information Retrieval.
Springer, 2016.
[6] M. Pagliardini, P. Gupta, and M. Jaggi, “Unsupervised
learning of sentence embeddings using compositional
n-gram features,” arXiv preprint arXiv:1703.02507,
2017.
[7] V. Piek et al., “Newsreader: How semantic web helps
natural language Recognizing click bait as false news,”
in ACM MDD, 2015.
[8] W. Y. Wang, “Liar, liar pants on fire: A new benchmark
dataset for fake news detection,” arXiv preprint
arXiv:1705.00648, 2017.
[9] Z. Wang, W. Hamza, and R. Florian, “Bilateral multi-
perspective matching for natural language sentences,”
arXiv preprint arXiv:1702.03814, 2017.
[10] K. Shu, D. Mahudeswaran, and H. Liu, “Fake News
Tracker: a tool for fake news collection, detection, and
visualization,” Computational and Mathematical
Organization Theory, pp. 1-2, 2018.

 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 

Fake News Detection using Machine Learning

International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 4 Issue 2, February 2020 | Available Online: www.ijtsrd.com | e-ISSN: 2456-6470
Unique Paper ID: IJTSRD30014 | Page 512

Fake News Detection using Machine Learning
Prasanth. K1, Praveen. N1, Vijay. S1, Auxilia Osvin Nancy. V2
1B.Tech Student, IT, SAEC, Chennai, Tamil Nadu, India
2Assistant Professor, IT, SAEC, Chennai, Tamil Nadu, India

ABSTRACT
Recently, fake news has been causing many problems for our society. As a result, many researchers have been working on identifying fake news. Most fake news detection systems rely on the linguistic features of the news. However, they have difficulty detecting highly ambiguous fake news, which can be identified only after understanding its meaning and checking the latest related information. In this paper, to resolve this problem, we present a new Korean fake news detection system using a fact DB which is built and updated by direct human judgement after collecting obvious facts. Our system receives a proposition and searches for semantically related articles in the fact DB in order to verify whether the given proposition is true by comparing it with those articles. To achieve this, we utilize a deep learning model, Bilateral Multi-Perspective Matching for Natural Language Sentences (BiMPM), which has demonstrated good performance on the sentence matching task. However, BiMPM has some limitations: the longer the input sentence is, the lower its performance, and it has difficulty making an accurate judgement when an unlearned word or relation between words appears. To overcome these limitations, we propose a new matching technique which exploits article abstraction as well as an entity matching set in addition to BiMPM. In our experiment, we show that our system improves the overall performance of fake news detection.
KEYWORDS: Fake news detection, Sentence matching, Natural Language Processing, Deep learning, BiLSTM model, Machine Learning

How to cite this paper: Prasanth. K | Praveen. N | Vijay. S | Auxilia Osvin Nancy. V "Fake News Detection using Machine Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-2, February 2020, pp.512-514, URL: www.ijtsrd.com/papers/ijtsrd30014.pdf

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

INTRODUCTION
Fake news has been causing many problems for our society. Fake news is news that the writer intentionally makes misleading in order to serve his or her political or economic interests [1]. With the generation of a huge volume of internet news and social media content, it has become much more difficult to identify fake news personally. Recently, many researchers have worked on fake news detection systems which automatically determine whether any opinion claimed in an article contains fake content [2]. Broadly, this research follows two approaches: one connects the linguistic patterns of news to deception, and the other verifies deception by utilizing external knowledge [3]. The first approach can quickly verify fake news at a low cost. However, in order to detect clever fake news, it is necessary to grasp the semantic content of the article rather than partial patterns, and to verify it against external facts updated by humans. Therefore, we search the Fact DB for articles related to the input proposition, and develop a fake news detection system that verifies whether the found articles and the proposition are semantically related.

Related Works:
Recently, as deep learning in the NLP field has developed, various sentence matching techniques have been introduced. We review the related research on sentence matching, dividing it into unsupervised and supervised learning based work.

A. Unsupervised Learning:
One of the most important elements in sentence matching is the way a word is expressed as a data structure. The conventional way of expressing words is the one-hot encoding vector. However, this method requires a large number of dimensions to express a single word and cannot express the relation between words. To overcome these shortcomings, the word-to-vector (word2vec) [5] method was proposed, which maps significant information into a vector of fixed dimensions. word2vec learns weights that increase the probability of the nearby words appearing around a given center word, and uses those weights as the word's vector. As an extension of word2vec, sentence-to-vector (sent2vec) [6] has also been proposed.

B. Supervised Learning:
Recently, research on machine comprehension has developed around the attention mechanism and BiLSTM. LSTM resolved the vanishing gradient problem of the Recurrent Neural Network (RNN) by adding gates to the cell that decide which past information to forget and which current information to remember. Since LSTM handles sequential inputs, it is often used for encoding and decoding sentences. However, as the input sequence becomes longer, the model loses information and tends to remember mainly the later part of the sequence. Therefore, scholars worked on improving the performance of the existing LSTM through the attention mechanism, which selectively recalls important information. They also showed that using BiLSTM together with attention can improve performance. The Bi-Directional Attention Flow (BiDAF) [7] minimizes the loss of information by applying the attention mechanism at each time step of the LSTM. In particular, BiMPM [9] applied BiLSTM and the attention mechanism to sentence matching.

Proposed System:
This section discusses the proposed solution for fake news detection, which combines fake news detection with sentiment analysis. The proposed solution is shown in Fig. 1. It consists of the following steps:
Step 1: A merged data set was prepared from different data sets, namely the Politifact, Kaggle and Emergent datasets.
Step 2: Different text preprocessing techniques were applied: bigrams (series of two consecutive words taken from a given text), trigrams (series of three consecutive words taken from a given text), CountVectorizer (count of terms in a text), and the term frequency-inverse document frequency (tf-idf) vectorizer.
Step 3: We used the tf-idf vectorizer on a Twitter dataset along with cosine similarity to build our vocabulary. Then a Naive Bayes classifier was used to predict the sentiment of each news statement in the test data set (the merged data set), as shown in Fig. 2.
Step 4: We added additional columns to the merged data set: tf-idf scores, sentiments and cosine similarity scores.
Step 5: The training model was built using Naive Bayes and Random Forest (train-test ratio 3:1).
Step 6: Performance was evaluated and compared using accuracy.

Steps 2 to 4 form the preprocessing stage of the proposed solution. It uses the tf-idf vectorizer with cosine similarity to tokenize a collection of text documents and build a vocabulary of known words. We then encode new documents using that vocabulary. The encoded vector has the length of the entire vocabulary (bag of words) and holds, for each word, an integer count of the number of times it appeared in the document.
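
The preprocessing in Steps 2 to 4 can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the toy corpus, the whitespace tokenizer, the function names and the smoothed idf formula are all assumptions chosen to keep the arithmetic visible. It builds a vocabulary from a corpus, encodes a new statement against that vocabulary, and scores it against each document with cosine similarity.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Contiguous n-grams: n=2 gives bigrams, n=3 gives trigrams."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fit_idf(docs):
    """Build the vocabulary with a smoothed inverse document frequency per term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    # Assumed smoothing: ln((1 + N) / (1 + df)) + 1
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Encode text as a sparse {term: weight} vector over the fitted vocabulary."""
    counts = Counter(t for t in tokenize(text) if t in idf)  # unseen words are dropped
    return {t: c * idf[t] for t, c in counts.items()}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus and statement, for illustration only.
corpus = [
    "the senate passed the budget bill on friday",
    "scientists discovered a new species of frog",
    "the city council approved the new budget",
]
statement = "senate passed budget bill"

idf = fit_idf(corpus)
query = tfidf_vector(statement, idf)
scores = [cosine_similarity(query, tfidf_vector(doc, idf)) for doc in corpus]
best = max(range(len(corpus)), key=scores.__getitem__)

print(ngrams(tokenize(statement), 2))  # bigrams of the statement
print(best, scores)                    # index of the most similar document
```

In practice, scikit-learn's `TfidfVectorizer` and `cosine_similarity` provide the same functionality; the scores computed here could then be added as extra columns before training a classifier, as in Steps 4 and 5.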

System Evaluation:
We evaluate the performance of the proposed system in this section. Given an article relevant to the input proposition, the evaluation tests the ability to determine whether the semantic content of the input proposition can be found in that article. We train BiMPM, which is the foundation of our system, and the experiments measure how much the performance improves when the proposed modules are added. We first build a data set to train BiMPM to output true or false when given a proposition and a short article consisting of three or four sentences. The data set is constructed according to the following policy:
1. Extract one sentence from a short article, and use it as an input proposition.
2. Generate true data through synonym substitution, changes of word order, and omission of some content.
3. Generate false data by distorting some information such as numbers, nouns, and verbs, or by omitting words.

Fig. 2. Training accuracy of test set for each epoch.

The sentences used in the new test set are longer and contain new words that are not in the previous dataset. Performance is plotted with the True Positive Rate (TPR) on the y-axis and the False Positive Rate (FPR) on the x-axis.

Conclusion:
In this paper, we have proposed a fake news detection system using machine learning, which is built and updated by direct human judgement. Our system receives a proposition to verify as input and searches for related articles, so that it can verify whether the articles found for the proposition semantically agree with it. To achieve this, we utilized BiMPM, a deep learning model for sentence matching, together with machine learning. However, even though BiMPM has shown good performance on various datasets, it has some limitations: the longer the input sentence is, the lower its performance, and it has difficulty making an accurate judgement when an unlearned word or relation between words appears. To overcome these limitations, we have presented a new matching technique which makes use of article abstraction as well as an entity matching set in addition to BiMPM. In our experiment, we have shown that our system improved the overall performance of fake news detection.

Reference:
[1] H. Allcott and M. Gentzkow, “Social media and fake news in the 2016 election,” Journal of Economic Perspectives, vol. 31, no. 2, pp. 211–36, 2017.
[2] N. J. Conroy, V. L. Rubin, and Y. Chen, “Automatic deception detection: Methods for finding fake news,” in Proceedings of the 78th ASIS&T Annual Meeting: Information Science with Impact: Research in and for the Community, p. 82, American Society for Information Science, 2015.
[3] V. L. Rubin, Y. Chen, and N. J. Conroy, “Deception detection for news: three types of fakes,” in Proceedings of the 78th ASIS&T Annual Meeting: Information Science with Impact: Research in and for the Community, p. 83, American Society for Information Science, 2015.
[4] S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning, “A large annotated corpus for learning natural language inference,” arXiv preprint arXiv:1508.05326, 2015.
[5] M. Potthast, S. Köpsel, B. Stein, and M. Hagen, “Clickbait detection,” in Advances in Information Retrieval. Springer, 2016.
[6] M. Pagliardini, P. Gupta, and M. Jaggi, “Unsupervised learning of sentence embeddings using compositional n-gram features,” arXiv preprint arXiv:1703.02507, 2017.
[7] V. Piek et al., “Newsreader: How semantic web helps natural language Recognizing click bait as false news,” in ACM MDD, 2015.
[8] W. Y. Wang, “Liar, liar pants on fire: A new benchmark dataset for fake news detection,” arXiv preprint arXiv:1705.00648, 2017.
[9] Z. Wang, W. Hamza, and R. Florian, “Bilateral multi-perspective matching for natural language sentences,” arXiv preprint arXiv:1702.03814, 2017.
[10] K. Shu, D. Mahudeswaran, and H. Liu, “Fake News Tracker: a tool for fake news collection, detection, and visualization,” Computational and Mathematical Organization Theory, pp. 1-2, 2018.