International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 4408
Suicide Text Detection using Machine Learning
Tanmay Khopkar1, Akshay Khane2, Prathamesh Dhumal3, Mrs. Snehal R. Shinde4
1, 2, 3Department of Computer Engineering, Pillai HOC College of Engineering and Technology, Rasayani,
Maharashtra, India.
4Assistant Professor Snehal R. Shinde, Department of Computer Engineering, Pillai HOC College of Engineering and
Technology, Rasayani, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - A social media network such as Twitter is a
communication channel that enables its users to broadcast
their thoughts through brief text updates. Different people
have different thoughts, which they often post on Twitter. Close
to 800,000 people commit suicide every year, which is 1 person
every 40 seconds. Vulnerable users often post about
their depressive situations on social media such as Twitter before
they act. Raw but useful data is available on social media, and the
activity of these accounts can be observed by mining the
data using different Machine Learning techniques.
A Convolutional Neural Network has been used to classify the
text in this raw data. This result, to the best of our
knowledge, has the best performance for classifying data from
social media.
Key Words: Text Classification, Machine Learning, CNN,
Social Media, Suicide.
1. INTRODUCTION
Social media is a type of online communication platform
that allows its users to share different kinds of media publicly.
The media can be in the form of photos, videos, text
messages, etc. Several social media platforms
are very popular.
Twitter is one of the social media platforms that is used by
millions of people. Initially, Twitter only allowed sharing of
texts called “Tweets”, but it now also allows photos and
videos to be shared. Twitter can be used in many ways by its
users, e.g. advertising products, sharing thoughts, and sharing
information about a campaign or an event. Twitter is
generally used by its users for sharing their thoughts. People
usually share their personal, political, and professional thoughts
on the platform.
As internet access has spread, the number of people using
the internet has increased drastically over time. As a result,
the number of people using social media, in our case
Twitter, has also increased, and so has the content on social
media. People share their opinions, feelings, thoughts, etc. on
Twitter.
Feelings or emotions shared by users can be depressive
or joyful. The publisher of a depressive text may just be
sharing his or her feelings, or it may be a cry for help. In an
extreme case, the user can be suicidal, i.e. likely to commit
suicide. Our aim is to detect such extreme cases of suicidal
individuals using the tweets they post on Twitter.
2. RELATED WORK
A large number of studies have been done on recognising the
suicidal tendency of an individual in many fields, such as medicine,
using the resting-state heart rate, and psychology,
using the thoughts and feelings of the individual.
Machine learning has also been used to research and
determine the suicidal tendency of an individual using the
content that individual posts on social media. The text
classification method of machine learning is used to
determine the suicidal tendency. Besides N-gram features,
knowledge-based features, and syntactic features, models for
cyber suicide detection have also used regression analysis, ANN,
and CRF. Mulholland and Quinn [6] extracted vocabulary and
verbal features to classify sentences as suicidal or
non-suicidal. Huang et al. [7] built a psychological lexicon
dictionary and used an SVM classifier. Chattopadhyay [11]
proposed a mathematical feed-forward multilayer neural
network. Delgado-Gomez et al. [9] and Pestian et al. [8]
compared the performance of different multivariate
complexity techniques.
3. METHODOLOGY
3.1 Pre-Processing
Data pre-processing is a process that converts raw,
real-life data into an understandable format. Data in its
raw form is often incomplete and hard to interpret. So, in
order to process the data and get the results that we desire,
we need to perform data pre-processing on the raw data set
so that meaningful results can be obtained.
The dataset that we have contains 5 columns, namely id,
date, flag, user and tweet. Among these columns we are
only concerned with the tweet column, so we drop all the
columns except the tweet column. The tweets are not present in
plain-text format: they contain special characters,
hyperlinks, numbers and punctuation. We need to remove
all of these and convert the contents of the column into a
normalised form.
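
The cleaning step described above could look roughly like the following minimal sketch. It is not the authors' code: the sample rows, the regular expressions, the lowercasing step and the helper names `clean_tweet` and `load_tweets` are all assumptions made for illustration. Only the five column names come from the text.

```python
import csv
import io
import re

# Hypothetical two-row sample in the five-column layout described above.
RAW_CSV = """id,date,flag,user,tweet
1,Mon Apr 06 2009,NO_QUERY,user_a,"Feeling so low today... http://example.com/x #sad 123"
2,Tue Apr 07 2009,NO_QUERY,user_b,"Great day with @friend!!! 10/10"
"""

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # hyperlinks
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, symbols
SPACE_RE = re.compile(r"\s+")                   # runs of whitespace


def clean_tweet(text):
    """Lowercase a tweet and strip hyperlinks, numbers and punctuation."""
    text = text.lower()                  # assumption: lowercase as normalisation
    text = URL_RE.sub(" ", text)         # remove links before stripping symbols
    text = NON_LETTER_RE.sub(" ", text)  # keep letters and whitespace only
    return SPACE_RE.sub(" ", text).strip()


def load_tweets(csv_text):
    """Keep only the tweet column and return the cleaned tweets."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [clean_tweet(row["tweet"]) for row in reader]


cleaned = load_tweets(RAW_CSV)
print(cleaned)
```

Links are removed before the other symbols so that fragments of a URL are not left behind as stray words.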
Fig -1: Data Pre-processing
After removing all the special characters, hyperlinks,
numbers and punctuation, we remove the connectors,
or stopwords, from the tweets. The next step is tokenization,
i.e. after the connectors have been removed, we split
the string, the tweet in our case, into a list of tokens. Among the
tokens that we now have, there may be multiple tokens that
are different forms of the same word, so we use the Porter
stemmer from the nltk library. After this, we separate the
tokens into positive and negative words and visualize them
using a word cloud. From the dataset, we visualize the
words in 3 different groups, namely
a) Words that are most commonly used in the
tweets,
b) Positive words that are most commonly used in
the tweets, and
c) Negative words that are most commonly used
in the tweets.
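
As a rough illustration of the stopword removal, tokenization and three word groups described above, the sketch below counts word frequencies, which is the information a word cloud displays. The stopword set, the positive and negative word lists, the sample tweets and the function names are hypothetical and deliberately tiny. Stemming is left out to keep the sketch dependency-free; with nltk installed, `PorterStemmer().stem(token)` could be applied to each token.

```python
from collections import Counter

# Hypothetical, deliberately tiny word lists for illustration only.
STOPWORDS = {"i", "a", "the", "is", "to", "and", "so", "my", "of", "with"}
POSITIVE = {"happy", "great", "love"}
NEGATIVE = {"sad", "alone", "hopeless"}


def tokenize(cleaned_tweet):
    """Split a cleaned tweet on whitespace and drop stopwords."""
    return [tok for tok in cleaned_tweet.split() if tok not in STOPWORDS]


def word_groups(cleaned_tweets):
    """Count (a) all words, (b) positive words and (c) negative words."""
    all_words, positive, negative = Counter(), Counter(), Counter()
    for tweet in cleaned_tweets:
        for tok in tokenize(tweet):
            all_words[tok] += 1
            if tok in POSITIVE:
                positive[tok] += 1
            elif tok in NEGATIVE:
                negative[tok] += 1
    return all_words, positive, negative


tweets = ["i feel so sad and alone", "a great day with my love", "sad and hopeless"]
all_words, positive, negative = word_groups(tweets)
print(all_words.most_common(3))
print(positive.most_common(3))
print(negative.most_common(3))
```

The three counters correspond to groups a), b) and c); a word cloud library would size each word by these counts.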
3.2 Classification
Classification is the process of grouping objects
with the same properties into classes; in our case, determining
whether a tweet is suicidal or non-suicidal based on its
positive and negative words. Among the classifiers available,
we use a Convolutional Neural Network (CNN) to classify
the tweets as suicidal or non-suicidal.
Fig -2: Convolutional Neural Network (CNN) Classifier
Neural word embeddings have become popular for
modelling individual words and the interactions between
them. They serve various purposes, such as text
classification. They use a distributed word representation, where
a single word isn't represented by a single neuron but
by numerous neurons. This helps the
distributed representation system learn about the general
concepts of language and not just independent word
representations. The most commonly used word embedding
system, Word2Vec, was developed by Google and
uses two neural network architectures, namely
a) Continuous Bag of Words (CBoW) and,
b) Skip-Gram (SG)
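
The difference between the two architectures can be seen in the training examples each one builds from a sentence: CBoW predicts a word from its surrounding context, while Skip-Gram predicts each context word from the centre word. The sketch below only generates these training pairs and does not train any vectors. The function names and the window size of 1 are assumptions for illustration.

```python
def context_window(tokens, i, window):
    """Return the tokens within `window` positions of index i, excluding i."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens, window=1):
    """CBoW: (context words) -> centre word."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_examples(tokens, window=1):
    """Skip-Gram: centre word -> one context word per example."""
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]


tokens = ["feel", "sad", "alone"]
print(cbow_examples(tokens))
print(skipgram_examples(tokens))
```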
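Google's Word2Vec converts words into
vectors, which are learned by training a shallow neural network
on a text corpus rather than computed from a fixed formula.
Words outside the training vocabulary have no learned vector in
standard Word2Vec, so they typically need a fallback such as a
zero or random vector. After the
vectorization of the text is done, the two main processes for
classifying the sentence, the tweet in our case, are: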
a) Convolutional Filter or Kernel - The process in which
a filter is applied to the vectorised values of the
words using a sliding window of a pre-defined size,
generating new vectors.
b) Pooling - The process of sampling the new vectors
created by applying the convolutional filter and
combining them by taking the maximum or the
average value.
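
The two steps above can be sketched in plain Python as follows. This is an illustrative toy, not the authors' network: the 2-dimensional word vectors, the single hand-picked kernel, and the function names are assumptions. A real CNN learns many kernels during training.

```python
def conv1d(embeddings, kernel, bias=0.0):
    """Slide `kernel` (window x dim) over `embeddings` (length x dim).

    Each window position yields one number: the sum of element-wise
    products between the kernel and the word vectors it covers.
    """
    window = len(kernel)
    outputs = []
    for start in range(len(embeddings) - window + 1):
        total = bias
        for k_row, e_row in zip(kernel, embeddings[start:start + window]):
            total += sum(k * e for k, e in zip(k_row, e_row))
        outputs.append(total)
    return outputs


def max_pool(values):
    """Combine the filter outputs by taking the maximum."""
    return max(values)


def avg_pool(values):
    """Combine the filter outputs by taking the average."""
    return sum(values) / len(values)


# Hypothetical 4-word tweet with 2-dimensional word vectors.
embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 1.0], [1.0, -1.0]]  # sliding window of size 2

feature_map = conv1d(embeddings, kernel)
print(feature_map, max_pool(feature_map), avg_pool(feature_map))
```

Pooling reduces a feature map whose length depends on the tweet length to a single value per filter, so tweets of different lengths produce inputs of the same size.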
Once the vectorization of the words and the sampling of
the vectors is done, the vector values are given as
inputs to the neurons of the Convolutional Neural Network
(CNN). They are appropriately weighted and given to the
neurons for processing. The Rectified Linear Unit (ReLU) is used
as the activation function of the neurons. Based on the inputs,
the weights assigned to the inputs and the activation
function assigned to the neurons, the network segregates the
inputs into classes.
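
A minimal sketch of this final stage is shown below: pooled features are weighted, passed through ReLU neurons, and mapped to one of the two classes. The weights, the layer sizes, the sigmoid output neuron and the 0.5 decision threshold are all assumptions chosen for illustration; the paper does not specify them, and in practice the weights are learned during training.

```python
import math


def relu(x):
    """Rectified Linear Unit: pass positive values, zero out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squash a score into (0, 1); assumed here for the output neuron."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One layer of neurons: weighted sum of inputs plus bias, then activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hypothetical pooled features and hand-picked weights.
pooled = [4.0, 1.0]
hidden_w = [[0.5, -1.0], [-1.0, 0.5]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 1.0]]
out_b = [-0.5]

hidden = dense(pooled, hidden_w, hidden_b, relu)
prob = dense(hidden, out_w, out_b, sigmoid)[0]
label = "suicidal" if prob >= 0.5 else "non-suicidal"
print(hidden, round(prob, 4), label)
```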
4. CONCLUSIONS
Our system was built on real-time Twitter data.
Among the many widely used text classification models, the
CNN suicide detection model showed better
performance in this study. This model shows better optimization on
iterative solutions. The system primarily focuses on
suicidal people, mining the database system so that tweets
related to suicide are detected.
FUTURE SCOPE
When our system detects suicidal text in real time, we can
extend the program so that it finds
the geographical location of that particular tweet. There are
many NGOs and anti-suicide organizations that work against
suicide.
We can inform these NGOs about such accounts so that they can take
further appropriate action. This system can be
implemented not only on Twitter but also on other social media
platforms like Facebook, Instagram, and WhatsApp.
REFERENCES
[1] Glen Coppersmith, Mark Dredze, Craig Harman, and
Kristy Hollingshead. 2015a. From ADHD to SAD:
Analyzing the Language of Mental Health on Twitter
through Self-Reported Diagnoses. In Computational
Linguistics and Clinical Psychology, pages 1–10.
[2] American Psychiatric Association. 2013. Diagnostic and
statistical manual of mental disorders (5th ed.).
American Psychiatric Publishing, Washington.
[3] Depression and Other Common Mental Disorders: Global
Health Estimates. World Health Organization, 2017.
[4] Basic structure about CNN, Wikipedia,
https://en.wikipedia.org/wiki/Convolutional_neural_network.
[5] E. Okhapkina, V. Okhapkin, and O. Kazarin, “Adaptation
of information retrieval methods for identifying of
destructive informational influence in social networks,”
in 2017 31st International Conference on Advanced
Information Networking and Applications Workshops
(WAINA), pp. 87–92, Taipei, Taiwan, 2017, IEEE.
[6] M. Mulholland and J. Quinn, “Suicidal tendencies: the
automatic classification of suicidal and non-suicidal lyricists
using NLP,” in Proceedings of the Sixth International Joint
Conference on Natural Language Processing, pp. 680–684,
Nagoya, Japan, 2013.
[7] X. Huang, L. Zhang, D. Chiu, T. Liu, X. Li, and T. Zhu,
“Detecting suicidal ideation in Chinese microblogs with
psychological lexicons,” in 2014 IEEE 11th Intl Conf on
Ubiquitous Intelligence and Computing and 2014 IEEE
11th Intl Conf on Autonomic and Trusted Computing
and 2014 IEEE 14th Intl Conf on Scalable Computing and
Communications and Its Associated Workshops, pp.
844–849, Bali, Indonesia, 2014, IEEE.
[8] J. Pestian, H. Nasrallah, P. Matykiewicz, A. Bennett, and
A. Leenaars, “Suicide note classification using natural
language processing: a content analysis,” Biomedical
Informatics Insights, vol. 3, pp. 19–28, 2010.
[9] D. Delgado-Gomez, H. Blasco-Fontecilla, F. Sukno, M.
Socorro Ramos-Plasencia, and E. Baca-Garcia, “Suicide
attempters classification: toward predictive models of
suicidal behavior,” Neurocomputing, vol. 92, pp. 3–8, 2012.
[10] Shaoxiong Ji, Celina Ping Yu, Sai-fu Fung, Shirui Pan, and
Guodong Long, “Supervised Learning for
Suicidal Ideation Detection in Online User Content,”
research article.
[11] S. Chattopadhyay, “A study on suicidal risk analysis,”
in 2007 9th International Conference on e-Health
Networking, Application and Services, pp. 74–78, Taipei,
Taiwan, 2007, IEEE.

DC MACHINE-Motoring and generation, Armature circuit equation
 

IRJET - Suicidal Text Detection using Machine Learning

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 4408
Suicide Text Detection using Machine Learning
Tanmay Khopkar1, Akshay Khane2, Prathamesh Dhumal3, Mrs. Snehal R. Shinde4
1, 2, 3Department of Computer Engineering, Pillai HOC College of Engineering and Technology, Rasayani, Maharashtra, India.
4Assistant Professor Snehal R. Shinde, Department of Computer Engineering, Pillai HOC College of Engineering and Technology, Rasayani, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Social media networks such as Twitter are communication channels that enable their users to broadcast their thoughts through brief text updates. Different people have different thoughts, which they often post on Twitter. Close to 800,000 people commit suicide every year, which is 1 person every 40 seconds. Vulnerable users often post about their depressive situations on Twitter before they act. Raw but useful data is available on social media, and the activity of these accounts can be observed by mining the data using different Machine Learning techniques. A Convolutional Neural Network has been used to classify the text in this raw data. This result, to the best of our knowledge, has the best performance for classifying data from social media.
Key Words: Text Classification, Machine Learning, CNN, Social Media, Suicide.
1. INTRODUCTION
Social media is a type of online intercommunication platform that allows its users to share different kinds of media publicly. The media can be in the form of photos, videos, text messages, etc. There are different types of social media platforms that are very popular.
Twitter is one of the social media platforms that is used by millions of people.
Initially, Twitter only allowed sharing of texts called "Tweets", but it now also allows photos and videos to be shared. Twitter can be used in many ways by its users, e.g. advertising products, sharing thoughts, or sharing information about a campaign or an event. Twitter is generally used by its users for sharing their thoughts. People usually share their personal, political and professional thoughts on the platform. As internet access has spread, the number of people using the internet has increased drastically over time. As a result, the number of people using social media, in our case Twitter, has also increased, and so has the content on social media. People share their opinions, feelings, thoughts, etc. on Twitter.
Feelings or emotions shared by the users can be depressive or joyful. The publisher of a depressive text may just be sharing his or her feelings, or it may be a cry for help. In an extreme case, the user can be suicidal, i.e. likely to commit suicide. Our aim is to detect such extreme cases of suicidal individuals using the tweets they post on Twitter.
2. RELATED WORK
A large number of studies have been done on recognising the suicidal tendency of an individual in many fields, such as medicine, using the resting-state heart rate, and psychology, using the thoughts and feelings of the individual. Machine learning has also been used to research and determine the suicidal tendency of an individual using the content that individual posts on social media. The text classification method of machine learning is used to determine the suicidal tendency. Besides N-gram features, knowledge-based features and syntactic features, models for cyber suicide detection have also used regression analysis, ANN and CRF. Mulholland and Quinn extracted vocabulary and verbal features to classify sentences as suicidal or non-suicidal. Huang et al. built a psychological lexicon dictionary and used an SVM classifier.
[11] Chattopadhyay proposed a mathematical feed-forward multilayer neural network. [9] Delgado-Gomez et al. and [8] Pestian et al. compared the performance of different multivariate complexity techniques.
3. METHODOLOGY
3.1 Pre-Processing
Data pre-processing converts raw, real-life data into an understandable format. Data in its raw form is often incomplete and hard to interpret. So, in order to process the data and get the results that we desire, we need to perform data pre-processing on the raw data set so that actual results can be obtained.
The dataset that we have contains 5 columns, namely id, date, flag, user and tweet. Among these columns we are only concerned with the tweet column, so we drop all the columns except the tweet column. The tweets are not present in text-only format: they contain special characters, hyperlinks, numbers and punctuation. We need to remove all of these and convert the contents of the column into a normalised form.
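The cleaning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `clean_tweet` and `keep_tweet_column`, the regular expressions and the sample record are all hypothetical, since the paper does not list the exact rules it applied.

```python
import re

# Hypothetical patterns; the paper does not specify its exact cleaning rules.
URL_RE = re.compile(r"(https?://\S+|www\.\S+)")
NON_LETTER_RE = re.compile(r"[^a-z\s]")
SPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip hyperlinks, numbers, punctuation and symbols."""
    text = text.lower()
    # Remove hyperlinks first, while their punctuation is still intact.
    text = URL_RE.sub(" ", text)
    # Drop everything that is not an ASCII letter or whitespace
    # (this also removes emoji and accented characters).
    text = NON_LETTER_RE.sub(" ", text)
    # Collapse the runs of spaces left behind by the substitutions.
    return SPACE_RE.sub(" ", text).strip()


def keep_tweet_column(rows):
    """Discard every column except 'tweet' and return the cleaned tweets."""
    return [clean_tweet(row["tweet"]) for row in rows]


# Made-up record with the five columns named in the text.
rows = [
    {"id": 1, "date": "2020-03-01", "flag": "none", "user": "someone",
     "tweet": "Feeling SO alone today... http://t.co/xyz #sad 24/7"},
]
print(keep_tweet_column(rows))
```

Removing hyperlinks before punctuation matters in this sketch: once the `://` and dots are stripped, a URL can no longer be recognised and its fragments would remain in the text.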
Fig -1: Data Pre-processing
After removing all the special characters, hyperlinks, numbers and punctuation, we remove the connectors, or stopwords, from the tweets. The next step is tokenization, i.e. after the connectors have been removed, we split the string (the tweet, in our case) into a list of tokens. Among these tokens there may be several that are different forms of the same word, so we use the Porter stemmer from the nltk library. After this, we separate the tokens into positive and negative words and visualize them using the Wordcloud. From the dataset, we visualize the words in 3 different groups, namely a) words that are most commonly used in the tweets, b) positive words that are most commonly used in the tweets, and c) negative words that are most commonly used in the tweets.
3.2 Classification
Classification is the process of grouping objects with the same properties into classes; in our case, determining whether a tweet is suicidal or non-suicidal based on its positive and negative words. Among all of the classifiers available, we use a Convolutional Neural Network (CNN) for classifying the tweets as suicidal or non-suicidal.
Fig -2: Convolutional Neural Network (CNN) Classifier
Neural word embeddings have become popular for modelling individual words and the interactions between them. They serve various purposes, such as text classification. They use a distributed word representation, in which a word is represented not by a single neuron but by numerous neurons. This helps the distributed representation system learn the general concepts of language and not just independent word representations.
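To make the idea of a distributed representation concrete, the sketch below stores each word as a small vector of numbers and compares words by cosine similarity. The vectors are hand-picked for illustration only and are an assumption of this example; real embeddings are learned from data and typically have hundreds of dimensions.

```python
import math

# Hand-picked 4-dimensional vectors, purely illustrative (not learned values).
EMBEDDINGS = {
    "sad":     [0.9, 0.1, 0.0, 0.3],
    "unhappy": [0.8, 0.2, 0.1, 0.4],
    "joyful":  [-0.7, 0.3, 0.1, 0.4],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Words with related meanings point in similar directions.
print(cosine_similarity(EMBEDDINGS["sad"], EMBEDDINGS["unhappy"]))
print(cosine_similarity(EMBEDDINGS["sad"], EMBEDDINGS["joyful"]))
```

Because meaning is spread over all components of the vector, related words end up close together, which is what lets a classifier generalise beyond the exact words it saw during training.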
The most commonly used word embedding system, Word2Vec, was developed by Google and uses two neural network architectures, namely a) Continuous Bag of Words (CBoW) and b) Skip-Gram (SG). Word2Vec converts words into vectors whose values are learned by training a shallow neural network on a text corpus. With pre-trained vectors, it can vectorize even words that do not occur in our dataset, as long as they appear in the corpus the embeddings were trained on.
After the vectorization of the text is done, the two main processes for classifying the sentence (the tweet, in our case) are:
a) Convolutional filter or kernel - a filter is applied to the vectorised values of the words using a sliding window of a pre-defined size, and new vectors are generated.
b) Pooling - the new vectors created by the convolutional filter are sampled and combined by taking the maximum or the average value.
Once the vectorization of the words and the sampling of the vectors are done, the vector values are given as inputs to the neurons of the Convolutional Neural Network (CNN). The inputs are appropriately weighted and given to the neurons for processing. The Rectified Linear Unit (ReLU) is used as the activation function of the neurons. Based on the inputs, the weights assigned to the inputs and the activation function assigned to the neurons, the network segregates the inputs into classes.
4. CONCLUSIONS
Our system was developed on real-time Twitter data. Among the many widely used text classification models, based on this study, the CNN suicide detection model shows better performance. This model shows better optimization on iterative solutions. The system primarily focuses on suicidal people, mining the database system so that tweets related to suicide get detected.
FUTURE SCOPE
When our system detects suicidal text in real time, the program can be extended to find the geographical location of that particular tweet. Many NGOs and anti-suicide organizations work to prevent suicide.
We can inform these NGOs about such accounts so that they can take further appropriate action. This system can be implemented not only on Twitter but also on other social media platforms like Facebook, Instagram and WhatsApp.
REFERENCES
[1] Glen Coppersmith, Mark Dredze, Craig Harman, and Kristy Hollingshead. 2015a. From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses. In Computational Linguistics and Clinical Psychology, pages 1–10.
[2] American Psychiatric Association. 2013. Diagnostic and Statistical Manual of Mental Disorders, 5th edition. American Psychiatric Publishing, Washington.
[3] Depression and Other Common Mental Disorders: Global Health Estimates. World Health Organization, 2017.
[4] Basic structure about CNN, Wikipedia, https://en.wikipedia.org/wiki/Convolutional_neural_network.
[5] E. Okhapkina, V. Okhapkin, and O. Kazarin, “Adaptation of information retrieval methods for identifying of destructive informational influence in social networks,” in 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 87–92, Taipei, Taiwan, 2017, IEEE.
[6] M. Mulholland and J. Quinn, “Suicidal tendencies: the automatic classification of suicidal and non-suicidal lyricists using NLP,” in Proceedings of the Sixth International Joint Conference on Natural Language Processing, pp. 680–684, Nagoya, Japan, 2013.
[7] X. Huang, L. Zhang, D. Chiu, T. Liu, X. Li, and T. Zhu, “Detecting suicidal ideation in Chinese microblogs with psychological lexicons,” in 2014 IEEE 11th Intl Conf on Ubiquitous Intelligence and Computing and 2014 IEEE 11th Intl Conf on Autonomic and Trusted Computing and 2014 IEEE 14th Intl Conf on Scalable Computing and Communications and Its Associated Workshops, pp. 844–849, Bali, Indonesia, 2014, IEEE.
[8] J. Pestian, H. Nasrallah, P. Matykiewicz, A. Bennett, and A. Leenaars, “Suicide note classification using natural language processing: a content analysis,” Biomedical Informatics Insights, vol. 3, pp. 19–28, 2010.
[9] D. Delgado-Gomez, H. Blasco-Fontecilla, F. Sukno, M. Socorro Ramos-Plasencia, and E. Baca-Garcia, “Suicide attempters classification: toward predictive models of suicidal behavior,” Neurocomputing, vol. 92, pp. 3–8, 2012.
[10] Research article based on “Supervised Learning for Suicidal Ideation Detection in Online User Content” by Shaoxiong Ji, Celina Ping Yu, Sai-fu Fung, Shirui Pan, and Guodong Long.
[11] S. Chattopadhyay, “A study on suicidal risk analysis,” in 2007 9th International Conference on e-Health Networking, Application and Services, pp. 74–78, Taipei, Taiwan, 2007, IEEE.