Detecting Fake Engagement on Instagram
Indira Sen
linkedin/in/indira-sen-8a6068140
@drealcharbar fb.com/indira.sen.31
Dr. Ponnurangam Kumaraguru (chair)
1
Thesis Committee
- Dr. Anwitaman Datta, NTU Singapore
- Mr. Nitendra Rajput, InfoEdge
- Dr. Ponnurangam Kumaraguru, IIIT Delhi (Chair)
2
Likes on Instagram
3,363 likes
3
Likes on Instagram
1,008 likes
4
Why is Engagement Important on Instagram?
5
Why Fake Likes?
- ‘Influencers’ are compensated based on engagement: likes and comments
- This creates an incentive to artificially inflate engagement metrics by purchasing likes through like markets or like-back networks
- Inflated like counts fool potential brands or advertisers into hiring ‘unworthy’ influencers
6
Motivation
- Influencer marketing: a $1B industry
- Fake influencers landed deals over $500
7
Core Thesis Question
- How do we automatically detect fraudulent likes on Instagram?
Organic Likes
- Likers who engage with content
- Genuine reach
Inorganic Likes
- Likers bought from marketplaces
- Artificial reach
- Understanding properties of genuine liking behaviour B = {b_1, b_2, …, b_n}
- Reducing the effect of likes which do not match B
8
Thesis Outline
- Research Aim
- Data Collection
- Analysis of Fake Likes
- Machine Learning Classifier to Detect Fake Likes
- Estimating Reach of Users
- Conclusion
9
What is a Like Instance?
- Given a poster S whose post p has been liked by liker L,
we define a like instance as the tuple (L, p, S)
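A minimal sketch of how the tuple (L, p, S) might be represented in code. The class name, field names and identifiers are assumptions for illustration; the slides only define the tuple itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LikeInstance:
    """One like, modelled as the tuple (L, p, S) defined above."""
    liker: str   # L: account that gave the like
    post: str    # p: identifier of the liked post
    poster: str  # S: account that published the post


# Hypothetical identifiers, for illustration only.
instance = LikeInstance(liker="liker_01", post="post_42", poster="poster_07")

# frozen=True makes instances hashable, so repeated records collapse in a set.
unique = {instance, LikeInstance("liker_01", "post_42", "poster_07")}
```

Keeping the instance immutable makes it safe to use as a dictionary key when features are later attached to each like.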
10
Research Aim
- Find the features of liker L, post p and poster S that determine the probability of liker L genuinely liking that particular post p.
- Identify the true reach of a poster by determining the fake likes received on the posted content.
11
Possible Reasons for Genuine Liking
- Homepage: followees’ posts
- Explore: Instagram’s recommendations
- Likes of followees
12
Possible Reasons for Genuine Liking
Explore:
- Based on photos you liked
- Based on people you follow
- Similar to accounts you interact with
13
Possible Reasons For Genuine Liking
- Poster is a followee
- Poster is a followee of a followee
- Topical interests in common
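The first two reasons above are network relationships and could be checked roughly as follows. This is a sketch under the assumption that the follow graph is available as a dictionary mapping each account to the set of accounts it follows; the function names and the toy graph are hypothetical.

```python
from typing import Dict, Set

# Assumed representation: account -> set of accounts it follows.
FollowGraph = Dict[str, Set[str]]


def is_followee(graph: FollowGraph, liker: str, poster: str) -> bool:
    """True if the liker follows the poster directly."""
    return poster in graph.get(liker, set())


def is_followee_of_followee(graph: FollowGraph, liker: str, poster: str) -> bool:
    """True if some account the liker follows in turn follows the poster."""
    return any(poster in graph.get(mid, set()) for mid in graph.get(liker, set()))


# Toy graph for illustration: alice follows bob, bob follows carol.
graph = {"alice": {"bob"}, "bob": {"carol"}, "carol": set()}
```

Here `is_followee(graph, "alice", "bob")` holds, and carol is a followee of a followee of alice.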
14
How to get Fake Likes
- Marketplaces
- Like Back collusion networks
- Link Farming hashtags
- Bots
15
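Architecture Diagram
[Figure: like instances labeled "Fake Likes" and "Other Likes" form the training data. Features are drawn from (1) liker meta and last 18 posts, (2) poster meta and last 18 posts, and (3) post meta. The machine learning model then labels random unknown likes as "Fake" or "Not Fake". The diagram also carries the labels α and 1 - α.]
16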
Data Collection: Fake Likes
[Figure: data collection pipeline. Purchased fake likes lead to a honeypot; likes given by honeypot victims form "Fake Likes 1", and likes on videos with views = 0 form "Fake Likes 2". Instagram featured users are snowball-sampled to 1M users; a random sample of 500 who are not honeypot victims yields "Other Likes".]
17
Data Collection: Fake Likes
- Honeypots to trap fake likers bought through a service
- If a user falls for a honeypot, we monitor their liking behaviour
18
Data Collection: Other Likes
- Randomly sample 500 users from 1M users who are not
honeypot victims
        #Likes   #Posts   #Likers   #Posters
Fake    10,417   8,408    500       7,715
Other   11,810   11,644   500       7,631
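The sampling step for the "Other" class can be sketched as below. The user identifiers, the size of the stand-in population and the fixed seed are assumptions; only the sample size of 500 and the exclusion of honeypot victims come from the slide.

```python
import random


def sample_other_likers(users, honeypot_victims, k=500, seed=0):
    """Draw k users uniformly at random from those who are not honeypot victims."""
    candidates = sorted(set(users) - set(honeypot_victims))
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return rng.sample(candidates, k)


# Small stand-in for the 1M snowball sample, with made-up victim ids.
users = [f"user_{i}" for i in range(10_000)]
victims = {f"user_{i}" for i in range(0, 10_000, 7)}
others = sample_other_likers(users, victims)
```

`random.Random.sample` draws without replacement, so no liker appears twice in the result.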
21
Understanding Fake Likes
- Hypotheses indicative of fake liking behaviour
- Validate with the two-sample Kolmogorov-Smirnov (KS) test
- Network effect:
  - Liker is follower of poster
  - Liker is follower of a follower of poster
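A rough sketch of the two-sample KS test used for validation, written in plain Python. The feature values are invented, and the p-value uses the asymptotic Kolmogorov series, which is only an approximation for small samples; the slides do not say which implementation was used.

```python
import bisect
import math


def ks_two_sample(a, b):
    """Return (D, p) for the two-sample KS test; p is the asymptotic approximation."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D is the largest gap between the two empirical CDFs,
    # evaluated at every observed value.
    d = max(
        abs(bisect.bisect_right(a, x) / n - bisect.bisect_right(b, x) / m)
        for x in a + b
    )
    lam = d * math.sqrt(n * m / (n + m))
    if lam == 0:
        return d, 1.0
    # Kolmogorov series: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * (k * lam) ** 2) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Made-up values of one feature for fake and other like instances.
fake_feature = [i / 100 for i in range(50)]         # 0.00 .. 0.49
other_feature = [0.5 + i / 100 for i in range(50)]  # 0.50 .. 0.99
d_stat, p_value = ks_two_sample(fake_feature, other_feature)
```

A small p-value suggests the feature is distributed differently in the two classes. In practice a library routine such as `scipy.stats.ks_2samp` would normally be preferred over a hand-rolled version.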
23
Liker is Follower of Poster
- Green edges: liker relationship
- Red edges: liker-follower relationship
- Other likes have a higher proportion of follower-likers
[Figure: two network graphs, panels "Other Likes" and "Fake Likes"]
24
Network Effects
- 90% of fake like instances have only 0.25 of followee likes
[Figure annotations: 90%, 56%]
25
Interest Overlap
- A user will like a post if she shares topical interests with
the post
- Affinity: the lower the affinity, the higher the overlap
26
Extracting Topics
- Bio, post text and post image
- Wikification and Densecap for images
27
Extracting Topics
[Figure: panels "Image topics" and "Post caption topics"]
28
Interest Overlap
- A user will like a post if she shares topical interests with
the post
- Affinity is non-commutative
29
Affinity
- Affinity outperforms Jaccard distance in terms of discernibility
- Post image topics are strong indicators of genuine liking
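For reference, the Jaccard distance baseline mentioned above can be computed on topic sets as in this sketch. The topic sets are invented, and the affinity metric itself is not defined in these slides, so it is not reproduced here.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; 0 means identical sets, 1 means no overlap."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical topic sets for a liker and a post.
liker_topics = {"travel", "food", "photography"}
post_topics = {"food", "photography", "coffee"}
distance = jaccard_distance(liker_topics, post_topics)  # 2 shared of 4 total -> 0.5
```

Jaccard distance is symmetric in its two arguments, which is one way it differs from a non-commutative measure.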
30
- Our metric captures semantic relationships between entities, unlike traditional distance metrics
- 90% of other likes have an average affinity of 0.5
- 90% of fake likes have an average affinity of 0.74
[Figure annotations: 0.74, 0.5]
31
Other Features
- Celebrities tend to get more likes (engagement)
- Genuine likers will keep coming back - repeated likers
- Link Farming hashtags: #like4like, #l4l, #like2follow
- Topical hashtags
- Posting activity of liker (Badri et al., CIKM’16) and poster
- Profile picture of liker: egghead profiles (cheap to create)
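One of these features, the presence of link-farming hashtags, could be extracted from a caption roughly as follows. The tag list is taken from the slide; the feature names, and the choice to count every other hashtag as topical, are assumptions of this sketch.

```python
import re

LINK_FARMING_TAGS = {"like4like", "l4l", "like2follow"}  # tags named on the slide
HASHTAG = re.compile(r"#(\w+)")


def hashtag_features(caption):
    """Count link-farming and other hashtags in a post caption."""
    tags = [t.lower() for t in HASHTAG.findall(caption)]
    farming = [t for t in tags if t in LINK_FARMING_TAGS]
    return {
        "n_hashtags": len(tags),
        "n_link_farming": len(farming),
        # Assumption: anything that is not a link-farming tag counts as topical.
        "n_topical": len(tags) - len(farming),
    }


features = hashtag_features("Sunset at the beach #travel #L4L #like4like")
```

Lower-casing before the lookup means `#L4L` and `#l4l` are treated as the same tag.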
32
Automatic Detection of Fake Likes
- Using features described and a set of ML classifiers
- Fake likes : Other likes ratio → 1:2
- SVM RBF kernel gives best performance
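For readers unfamiliar with the RBF kernel, the similarity it computes between two feature vectors is sketched below. The feature vectors and the value of gamma are illustrative assumptions; the slides do not report the hyperparameters used.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-dimensional feature vectors.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])  # identical points: kernel value 1
far = rbf_kernel([0.0, 0.0], [1.0, 1.0])   # squared distance 2, so exp(-1)
```

The kernel value falls from 1 towards 0 as two like instances move apart in feature space, which is what lets an SVM draw a non-linear decision boundary.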
33
Classification Model
- Performance

        Precision   Recall   F1-score
0       0.93        0.96     0.945
1       0.895       0.825    0.86
total   0.92        0.925    0.92

- Manually inspecting 100 false negatives shows that 70 of them had high topical overlap
- Liker interest set was small: affinity metric limitation
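As a quick consistency check, the per-class F1-scores in the table follow from the reported precision and recall, since F1 is their harmonic mean. The function name is an assumption; the numbers are the ones on the slide.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_class_0 = f1_score(0.93, 0.96)    # about 0.945
f1_class_1 = f1_score(0.895, 0.825)  # about 0.859, reported as 0.86
```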
34
In the Wild Experiment
- A random sample of 134,669 like instances
- Categorize posts into: food, fashion, outdoors, merchandise, people, gadgets, pets, captioned
- We find 8,557 fake likes
- Manually analyze 100 of these and find 78 to be fake
35
Reach Estimation
- Enable advertisers to make better decisions
- Reduce the effect of fake likes a poster may have received
- Measure deviation in reach
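The slides do not give a formula for deviation in reach. One possible reading, shown here purely as an assumption, is the share of a poster's likes that the classifier flags as fake, so that the adjusted reach is the like count with those likes removed. The function name and the example numbers are hypothetical.

```python
def reach_deviation(total_likes, predicted_fake_likes):
    """Assumed definition: fraction of observed likes that are predicted fake."""
    if total_likes <= 0:
        return 0.0
    adjusted_reach = total_likes - predicted_fake_likes  # likes left after removing fakes
    return (total_likes - adjusted_reach) / total_likes


# Hypothetical poster with 200 likes, 50 of them predicted fake.
deviation = reach_deviation(200, 50)
```

Under this reading a deviation of 0 means the observed like count can be taken at face value.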
37
Who receives fake likes?
- Users posting about merchandise, outdoors (including
travel posts) and people (posts containing faces) have
highest deviation from the projected reach.
38
Who receives fake likes?
- Merchandise, outdoors (including travel posts) and people
- Most posters do not have high deviation, while some users have very high deviation
39
Do Popular Users have more Fake Likes?
- No; users with lower follower counts, who may be trying to gain a following, have higher deviation
- ‘Micro Influencers’ have higher deviation
40
Conclusion
- Automated method to detect fake like instances
- Performs well to identify unseen fake likes on Instagram.
- Find true reach of a user
- Helps advertisers and brands identify users with genuine,
meaningful reach
41
Challenges, Limitations and Future Work
- Availability of labeled data, approximations using
honeypot
- Data collection constraints, integrate network features
- Improve affinity, improve precision (dynamic features)
- Fine-grained topical recommendations for brands and advertisers
42
Acknowledgement
- Anupama Aggarwal, PhD Scholar, IIIT Delhi
- Committee members
- Srishti Gupta, Divyansh Agarwal, Neha Jawalkar, Sonu
Gupta, Kushagra Bhargava
- Siddharth Singh, Shiven Mian
- Members of Precog
- Family and friends
43
References
- https://instamacro.com/
- http://nymag.com/selectall/2017/08/fake-instagram-account-earns-sponsored-influencer-money.html
- http://www.independent.co.uk/life-style/gadgets-and-tech/social-media-experiment-fake-instagram-accounts-make-money-influencer-star-blogger-mediakix-a7887836.html
44
Thanks!
Any questions?
You can find me at:
indira15021@iiitd.ac.in
pk@iiitd.ac.in
45

Detecting Fake Engagement on Instagram

  • 1. Detecting Fake Engagement on Instagram Indira Sen linkedin/in/indira-sen-8 a6068140 @drealcharbar fb.com/indira.sen.31 Dr. Ponnurangam Kumaraguru (chair) 1
  • 2. Thesis Committee - Dr. Anwitaman Datta, NTU Singapore - Mr. Nitendra Rajput, InfoEdge - Dr. Ponnurangam Kumaraguru, IIIT Delhi (Chair) 2
  • 5. Why is Engagement Important on Instagram? 5
  • 6. Why Fake Likes? - ‘Influencers’ compensated on engagement: likes and comments - Incentive to artificially inflate engagement metrics by purchasing likes, like markets or like back networks - Inflated like count fool potential brand or advertisers into hiring ‘unworthy’ Influencers 6
  • 7. Motivation 7 - Influencer Marketing - $1B industry - Fake influencers landed deals over $500
  • 8. - How do we automatically detect fraudulent likes on Instagram? Core Thesis Question Organic Likes - Likers who engage with content - Genuine reach Inorganic Likes - Likers bought from marketplaces - Artificial reach - Understanding properties of genuine liking behaviour B : {b1 , b2 , …, bn } - Reducing the effect of likes which do not match B 8
  • 9. Thesis Outline - Research Aim - Data Collection - Analysis of Fake Likes - Machine Learning Classifier to Detect Fake Likes - Estimating Reach of Users - Conclusion 9
  • 10. What is a Like Instance? - Given a poster S whose post p has been liked by liker L, we define a like instance as the tuple (L, p, S) 10
  • 11. Research Aim - Find out the features of liker L, post p and S, to determine the probability of liker L genuinely liking that particular post p. - Identify true reach of poster by determining fake likes received on the posted content. 11
  • 12. Possible Reasons for Genuine Liking Homepage: followees’ posts Explore: Instagram’s Recommendations Likes of followees 12
  • 13. Possible Reasons for Genuine Liking Based on photos you liked Based on people you follow Similar to accounts you interact with Explore 13
  • 14. Possible Reasons For Genuine Liking - Poster is a followee - Poster is a followee of a followee - Topical interests in common 14
  • 15. How to get Fake Likes - Marketplaces - Like Back collusion networks - Link Farming hashtags - Bots 15
  • 16. Architecture Diagram 1) Liker meta and last 18 posts 2) Poster meta and last 18 posts 3) Post meta Fake Likes Other Likes Training Data Machine Learning Model Random unknown Likes Fake Not Fake Features Features 16 1 - α α
  • 17. Data Collection: Fake Likes Purchased Fake Likes Fake Likes 1: Likes given by Honeypot victims Likes on videos with views = 0 Honeypot Fake Likes 2 victim? Instagram Featured users Snowball Sample to 1M Random sample of 500 Honeypot Other Likes not victim? 17 Instagram Featured users Snowball Sample to 1M Random sample of 500 Honeypot Other Likes not victim? Data Collection: Fake Likes Purchased Fake Likes Fake Likes 1: Likes given by Honeypot victims Likes on videos with views = 0 Honeypot Fake Likes 2 victim? 17
  • 18. Data Collection: Fake Likes - Honeypots to trap fake likers bought through a service - If user falls for honeypot then we monitor their liking behaviour Honeypot 18
  • 19. Instagram Featured users Snowball Sample to 1M Random sample of 500 Honeypot Other Likes not victim? Data Collection: Fake Likes Purchased Fake Likes Fake Likes 1: Likes given by Honeypot victims Likes on videos with views = 0 Honeypot Fake Likes 2 victim? 19
  • 20. Data Collection: Other Likes Purchased Fake Likers Fake Likes 1: Likes given by Honeypot victims Likes on videos with views = 0 Honeypot Fake Likes 2 victim? Instagram Featured users Snowball Sample to 1M Random sample of 500 Honeypot Other Likes not victim? 20
  • 21. Data Collection: Other Likes - Randomly sample 500 users from 1M users who are not honeypot victims #Likes #Posts #Likers #Posters Fake 10,417 8,408 500 7,715 Other 11,810 11,644 500 7,631 21
  • 22. Thesis Outline - Research Aim - Data Collection - Analysis of Fake Likes - Machine Learning Classifier to Detect Fake Likes - Estimating Reach of Users - Conclusion 22
  • 23. Understanding Fake Likes - Hypotheses indicative of fake liking behaviour - Validated with a two-sample KS test - Network effect: liker is a follower of the poster, or liker is a follower of a follower of the poster
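The two-sample KS test named on this slide compares the empirical distributions of one feature across fake and other likes. A minimal sketch of the test statistic in plain Python follows; the feature values are made up for illustration and are not data from the thesis. In practice a library routine such as `scipy.stats.ks_2samp` would also return a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical per-like feature values (illustrative only).
fake_values = [0.0, 0.0, 0.1, 0.2, 0.25]
other_values = [0.3, 0.5, 0.6, 0.8, 0.9]
d = ks_statistic(fake_values, other_values)
print(f"KS statistic D = {d}")
```

A statistic near 1 means the two groups barely overlap on that feature; a statistic near 0 means the feature does not separate them.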
  • 24. Liker is Follower of Poster - Green edges: liker relationship - Red edges: liker-follower relationship - Other likes have a higher proportion of follower-likers - Figure panels: Other Likes, Fake Likes
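The two network hypotheses can be expressed as boolean features on a like instance. The sketch below assumes a hypothetical in-memory mapping `followees` from each user to the set of accounts that user follows; the slides do not describe how the follow graph is actually stored or queried.

```python
def network_features(liker, poster, followees):
    """Return the two network features for one like instance.

    `followees` is a hypothetical dict: user -> set of accounts that user follows.
    """
    liker_follows = followees.get(liker, set())
    # Liker follows the poster directly.
    is_follower = poster in liker_follows
    # Liker follows someone who, in turn, follows the poster.
    is_follower_of_follower = any(
        poster in followees.get(middle, set()) for middle in liker_follows
    )
    return {
        "follower": is_follower,
        "follower_of_follower": is_follower_of_follower,
    }


# Toy follow graph (illustrative only).
followees = {"ann": {"bob"}, "bob": {"cat"}, "dan": set()}
print(network_features("ann", "bob", followees))
print(network_features("ann", "cat", followees))
print(network_features("dan", "cat", followees))
```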
  • 25. Network Effects - 90% of fake like instances have only 0.25 of followee likes - Figure labels: 90%, 56%
  • 26. Interest Overlap - A user will like a post if she shares topical interests with the post - Affinity: the lower the affinity, the higher the overlap
  • 27. Extracting Topics - Bio, post text and post image - Wikification and DenseCap for images
  • 28. Extracting Topics - Bio, post text and post image - Wikification and DenseCap for images - Figure labels: image topics, post caption topics
  • 29. Interest Overlap - A user will like a post if she shares topical interests with the post - Affinity is non-commutative
  • 30. Affinity - Affinity outperforms Jaccard distance in terms of discernibility - Post image topics are strong indicators of genuine liking
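For reference, the Jaccard distance used as the baseline here can be sketched as below. The topic sets are invented for illustration; the affinity metric itself is not defined in these slides, so it is not reproduced.

```python
def jaccard_distance(topics_a, topics_b):
    """Jaccard distance between two topic sets: 1 - |A ∩ B| / |A ∪ B|."""
    a, b = set(topics_a), set(topics_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical topic sets (illustrative only).
liker_topics = {"food", "travel", "coffee"}
post_topics = {"coffee", "latte art"}
print(jaccard_distance(liker_topics, post_topics))
```

Jaccard distance counts only exact topic matches, so related topics such as "coffee" and "latte art" contribute nothing, and it is symmetric in its two arguments, whereas the slides describe affinity as non-commutative.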
  • 31. Affinity - Our metric captures semantic relationships between entities better than traditional distance metrics - 90% of other likes have an average affinity of 0.5 - 90% of fake likes have an average affinity of 0.74
  • 32. Other Features - Celebrities tend to get more likes (engagement) - Genuine likers keep coming back: repeated likers - Link-farming hashtags: #like4like, #l4l, #like2follow - Topical hashtags - Posting activity of liker (Badri et al., CIKM’16) and poster - Profile picture of liker: egghead profiles (cheap to create)
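The link-farming hashtag feature could be extracted roughly as follows. This is a sketch using only the three hashtags listed on the slide; the regular expression and the function name `link_farming_tags` are illustrative choices, not the thesis implementation.

```python
import re

# Only the three hashtags named on the slide; a real list would likely be longer.
LINK_FARMING_TAGS = {"like4like", "l4l", "like2follow"}
HASHTAG_RE = re.compile(r"#(\w+)")


def link_farming_tags(caption):
    """Return the link-farming hashtags found in a caption, case-insensitively."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(caption)}
    return tags & LINK_FARMING_TAGS


print(link_farming_tags("Sunset! #L4L #travel #like4like"))
```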
  • 33. Automatic Detection of Fake Likes - Uses the features described and a set of ML classifiers - Ratio of fake likes to other likes: 1:2 - SVM with RBF kernel gives the best performance
  • 34. Classification Model - Performance - We manually looked at 100 false negatives and found that 70 of them had high topical overlap - Liker interest set was small: a limitation of the affinity metric
      Class   Precision   Recall   F1-score
      0       0.93        0.96     0.945
      1       0.895       0.825    0.86
      total   0.92        0.925    0.92
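As a sanity check, the per-class F1 scores can be recomputed from the reported precision and recall with the standard harmonic-mean formula. The sketch below uses only the numbers in the table.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, copied from the table.
reported = {"0": (0.93, 0.96, 0.945), "1": (0.895, 0.825, 0.86)}
for label, (p, r, f1) in reported.items():
    print(f"class {label}: recomputed {f1_score(p, r):.4f}, reported {f1}")
```

Both reported F1 values agree with the formula to within rounding.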
  • 35. In the Wild Experiment - 134,669 random like instances - Posts categorized into: food, fashion, outdoors, merchandise, people, gadgets, pets, captioned - We find 8,557 fake likes - We manually analyze 100 of these and find 78 to be fake
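The counts on this slide give two quick derived figures: the share of like instances flagged as fake, and the precision in the manually checked sample. The sketch below also adds a rough 95% interval using the normal approximation, which is an assumption made here to show how much uncertainty a sample of 100 carries; the slides do not report an interval.

```python
import math

total_likes = 134_669   # like instances in the experiment
flagged_fake = 8_557    # instances the classifier flagged as fake
checked = 100           # flagged instances inspected manually
confirmed = 78          # of those, judged to be fake

flagged_share = flagged_fake / total_likes
sample_precision = confirmed / checked

# Normal-approximation 95% interval for a proportion (an added assumption).
half_width = 1.96 * math.sqrt(sample_precision * (1 - sample_precision) / checked)
low, high = sample_precision - half_width, sample_precision + half_width

print(f"flagged share: {flagged_share:.4f}")
print(f"sample precision: {sample_precision:.2f} ({low:.3f} to {high:.3f})")
```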
  • 36. Thesis Outline - Research Aim - Data Collection - Analysis of Fake Likes - Machine Learning Classifier to Detect Fake Likes - Estimating Reach of Users - Conclusion
  • 37. Reach Estimation - Enable advertisers to make better decisions - Reduce the effect of fake likes a poster may have received - Measure deviation in reach
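The slides do not define how deviation in reach is computed. One plausible reading, used purely as an assumption here, is the fraction of a poster's received likes that the classifier flags as fake. The sketch below aggregates hypothetical per-like predictions under that reading.

```python
from collections import defaultdict


def reach_deviation_by_poster(predictions):
    """Fraction of each poster's likes flagged as fake.

    `predictions` is a hypothetical iterable of (poster, is_fake) pairs, one
    per like instance. The definition of deviation is an assumption made for
    illustration, not the formula used in the thesis.
    """
    totals = defaultdict(int)
    fakes = defaultdict(int)
    for poster, is_fake in predictions:
        totals[poster] += 1
        fakes[poster] += int(is_fake)
    return {poster: fakes[poster] / totals[poster] for poster in totals}


# Toy classifier output (illustrative only).
predictions = [("a", True), ("a", False), ("b", False)]
print(reach_deviation_by_poster(predictions))
```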
  • 38. Who Receives Fake Likes? - Users posting about merchandise, outdoors (including travel posts) and people (posts containing faces) have the highest deviation from the projected reach
  • 39. Who Receives Fake Likes? - Figure categories: merchandise, outdoors (including travel posts) and people - Most posters do not have high deviation, while some users have very high deviation
  • 40. Do Popular Users Have More Fake Likes? - No; users with lower follower counts, who may be trying to gain a following, have higher deviation - ‘Micro influencers’ have higher deviation
  • 41. Conclusion - Automated method to detect fake like instances - Performs well at identifying unseen fake likes on Instagram - Finds the true reach of a user - Helps advertisers and brands identify users with genuine, meaningful reach
  • 42. Challenges, Limitations and Future Work - Availability of labeled data; approximations using honeypots - Data collection constraints; integrate network features - Improve affinity; improve precision (dynamic features) - Fine-grained topical recommendations for brands and advertisers
  • 43. Acknowledgement - Anupama Aggarwal, PhD Scholar, IIIT Delhi - Committee members - Srishti Gupta, Divyansh Agarwal, Neha Jawalkar, Sonu Gupta, Kushagra Bhargava - Siddharth Singh, Shiven Mian - Members of Precog - Family and friends
  • 44. References - https://instamacro.com/ - http://nymag.com/selectall/2017/08/fake-instagram-account-earns-sponsored-influencer-money.html - http://www.independent.co.uk/life-style/gadgets-and-tech/social-media-experiment-fake-instagram-accounts-make-money-influencer-star-blogger-mediakix-a7887836.html
  • 45. Thanks! Any questions? You can find me at: indira15021@iiitd.ac.in, pk@iiitd.ac.in