Topic Modelling in Social Media
Group 28 Project 06
Group Members
● Prateek Mehta (201203006)
prateek.mehta@students.iiit.ac.in
● Saurabh Kanaujia (201305551)
saurabh.kanaujia@students.iiit.ac.in
● Nikita Nataraj (201101079)
nikita.nataraj@students.iiit.ac.in
Aim
● Apply topic-modelling techniques to social media.
● Main focus: reduce the cost of computing an LDA model over a
social network, using a technique that scales.
● Represent and compute the topics of the whole network
efficiently.
Introduction
Topical categorization of blogs, documents, or other objects that can be
tagged with text improves the experience for end users. When the set of
documents is very large and varies significantly from user to user, computing
a single global topic model, or an individual topic model for every user, can
become very expensive in large-scale internet settings. To implement topic
modelling, we used LDA. Latent Dirichlet allocation (LDA) is an unsupervised,
probabilistic text-clustering algorithm. It defines a generative model that
describes how documents are generated given a set of topics and the words in
those topics. We chose LDA because it is convenient for modelling a more
human-like corpus, in other words social media.
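The generative story that LDA assumes can be sketched in a few lines of Python. This is only an illustration of the sampling process, not the project's training code: the toy topics, the `alpha` values, and the helper names `sample_dirichlet` and `generate_document` are all assumptions made for the example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, length, rng):
    """Generate one document: draw a topic mixture, then a topic and a word per position."""
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    words = []
    for _ in range(length):
        k = rng.choices(range(len(topics)), weights=theta)[0]  # topic for this position
        vocab, probs = zip(*topics[k].items())
        words.append(rng.choices(vocab, weights=probs)[0])     # word drawn from that topic
    return theta, words

# Hypothetical toy topics (word -> probability); not taken from the project data.
TOPICS = [
    {"match": 0.5, "goal": 0.3, "team": 0.2},
    {"election": 0.4, "vote": 0.4, "party": 0.2},
]

rng = random.Random(0)
theta, doc = generate_document(TOPICS, alpha=[0.5, 0.5], length=8, rng=rng)
print(theta, doc)
```

Fitting LDA runs this story in reverse: given only the words, it infers the topics and the per-document mixtures.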
Possible Approaches
1. Fit an LDA model for each user in the network (very costly).
2. Find the top-K influential users and fit an LDA model for
each of them.
3. Detect communities and apply the LDA model
across communities.
We tried to implement Approaches 2 and 3.
Approach No. 3 Drawbacks
● The community detection is based on bi-directional
follower-followee relationships. Only 22-23% of Twitter
users have such a relationship, where they follow each
other.
● Finding communities based on uni-directional
follower-followee relationships was neither feasible
nor scalable in our implementation.
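To make the first drawback concrete, the share of users with at least one reciprocated follow can be measured directly from an edge list. The sketch below assumes the follower graph is available as `(follower, followee)` pairs; the function name and the toy data are hypothetical.

```python
def mutual_follow_users(follows):
    """Return the users that have at least one reciprocated follow edge."""
    edges = set(follows)
    return {a for (a, b) in edges if a != b and (b, a) in edges}

# Hypothetical toy edge list of (follower, followee) pairs.
FOLLOWS = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "a"), ("d", "c")]

users = {u for edge in FOLLOWS for u in edge}
mutual = mutual_follow_users(FOLLOWS)
share = len(mutual) / len(users)
print(sorted(mutual), share)
```

Users outside this set are invisible to community detection that relies only on mutual follows.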
Approach No. 2
Phase 1: Finding Influential Users
● Top-k users were found using the PageRank algorithm from the
GraphChi API.
● Fetched their tweets and the URLs embedded in them.
Metadata, tags, and IDs were also fetched.
● Crawled the URLs and summarized their content.
● The tweet documents plus the URL summaries were used as
training data.
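The ranking step can be illustrated with a small power-iteration PageRank in plain Python. This is not the GraphChi implementation the project used, only a sketch of the same idea on a toy graph. The damping factor, the iteration count, and the edge list are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (follower, followee) edges."""
    nodes = sorted({u for edge in edges for u in edge})
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread evenly over all users.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank

def top_k(rank, k):
    """Return the k highest-ranked users, best first."""
    return sorted(rank, key=rank.get, reverse=True)[:k]

# Hypothetical toy follower graph: (follower, followee).
EDGES = [("b", "a"), ("c", "a"), ("d", "a"), ("d", "c"), ("a", "c")]

ranks = pagerank(EDGES)
print(top_k(ranks, 2))
```

An edge points from follower to followee, so rank flows towards users who are followed by other well-ranked users.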
Approach No. 2
Phase 1: Diagram
Approach No. 2
Phase 2: User Similarity
● The user's tweets and URLs are fetched. Each URL is summarized
to 15-20 sentences.
● The Jaccard index is calculated to match the user with one of
the top users.
● The maximum Jaccard index determines the match: the user adopts
the topic distribution of the corresponding top user.
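The matching step above can be sketched as follows. The Jaccard index of two sets A and B is |A ∩ B| / |A ∪ B|. The slides do not say exactly which sets are compared, so this example assumes each user is represented by the set of tokens in their tweets and URL summaries. The helper names and the toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def closest_top_user(user_tokens, top_user_tokens):
    """Return the top user whose token set has the highest Jaccard index with the user."""
    return max(top_user_tokens,
               key=lambda name: jaccard(user_tokens, top_user_tokens[name]))

# Hypothetical token sets for two top users and one ordinary user.
TOP_USERS = {
    "sports_fan": {"match", "goal", "team", "league"},
    "politics_fan": {"election", "vote", "party", "policy"},
}
user = {"goal", "team", "stadium"}

best = closest_top_user(user, TOP_USERS)
print(best, jaccard(user, TOP_USERS[best]))
```

The ordinary user then reuses the topic distribution already computed for `best`, so no separate LDA model has to be fitted for that user.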
Approach No. 2
Phase 2: Diagram
Conclusion
Of the three proposed approaches, we settled on the
second one, in which we identify the top 100 users
and create an LDA model for each.
