On the Origins of Memes by Means of Fringe
Web Communities
Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro,
Michael Sirivianos, Gianluca Stringhini, Guillermo Suarez-Tangil
WARNING
IMAGERY IN THIS TALK IS
UNCENSORED AND MIGHT BE
OFFENSIVE
Talk Outline
Who we are and what
is our main line of work
Cross Platform Analysis
of Memes
Personal attacks
Who are we?
International Data-driven Research for Advanced Modeling and Analysis Lab
(IDRAMA Lab)
IDRAMA Lab Overview
• Team of international researchers and academics
• Backgrounds ranging from computer security and cryptography to social
network analysis and physics
• Geographically distributed
IDRAMA Goal
Understand and mitigate emerging socio-technological issues on the Web
IDRAMA Approach
• Large-scale data-driven approach
• Analyzing billions of posts over several years
• Cross-Platform
• Twitter, Reddit, 4chan, Gab, etc.
• Quantitative Analysis
Online platforms do not exist in a vacuum
Looking at a single platform at a time is not enough to
capture online dynamics
We lack tools to effectively trace how
information spreads across different platforms
Memes are fun!
Not always though…
Hateful/Racist Memes
Memes in politics
Memes have become a popular,
and seemingly effective, method to
transmit ideology.
Memes have been weaponized
But what do
we really
know about
memes?
• How can we track meme propagation
across Web communities?
• Can we characterize Web
communities through their memes?
• Can we measure the influence of Web
communities with respect to the memes
they share?
Memes processing pipeline
Inputs:
• Meme Annotation Sites: Know Your Meme, generic annotation sites
• Web Communities posting Memes: 4chan, Twitter, Reddit, Gab, generic Web communities
Steps:
• 1. pHash Extraction
• 2. pHash-based Pairwise Distance Calculation
• 3. Clustering
• 4. Screenshot Classifier
• 5. Cluster Annotation
• 6. Association of Images to Clusters
• 7. Analysis and Influence Estimation
Data passed between steps:
• Images and annotated images
• pHashes of some or all Web communities' images
• Pairwise comparisons of pHashes
• Clusters of images
• pHashes of annotated images; pHashes of non-screenshot annotated images
• Annotated clusters
• pHashes (all Web Communities)
• Occurrences of memes in all Web Communities
Let’s see our data sources…
Know Your Meme (KYM)
• Crowdsourced encyclopedia for Memes
• Provides useful metadata
• E.g., origin, descriptive tags, description, examples, image galleries
• Built a custom crawler
• Obtained data for 15K KYM entries
• Downloaded every image for each entry (706K)
Datasets
                        Twitter  Reddit  /pol/  Gab    KYM
# of posts              1.4B     1.0B    48M    12M    15K
# of posts with images  242M     62M     13M    955K   15K
# of images             114M     40M     4M     235K   706K
Perceptual hashing extraction
Perceptual hashing (pHash)
• Generates a hash for each image
• Visually similar images have minor differences in
their hashes
• Reduces dimensionality of the images
• Ran the pHash algorithm on
• All images from KYM (706K)
• All images from Twitter, Reddit, /pol/, and Gab
(159.5M)
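To make the idea concrete, here is a minimal sketch of a DCT-based perceptual hash. It is not the pHash implementation used in the talk: it assumes the image has already been converted to a small square grayscale grid, and the function names, the 8x8 hash size, and the randomly generated test images are all illustrative assumptions.

```python
import math
import random


def phash(pixels, hash_size=8):
    """Simplified DCT-based perceptual hash of a square grayscale grid.

    Assumption: `pixels` is a list of equal-length rows, already resized.
    """
    n = len(pixels)
    # basis[u][x] is the DCT-II basis value cos((2x + 1) * u * pi / (2n)).
    basis = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
             for u in range(hash_size)]
    # Keep only the low-frequency hash_size x hash_size block of coefficients.
    coeffs = []
    for u in range(hash_size):
        for v in range(hash_size):
            coeffs.append(sum(pixels[x][y] * basis[u][x] * basis[v][y]
                              for x in range(n) for y in range(n)))
    # One bit per coefficient: is it above the (upper) median?
    median = sorted(coeffs)[len(coeffs) // 2]
    value = 0
    for c in coeffs:
        value = (value << 1) | (1 if c > median else 0)
    return value


def hamming(a, b):
    """Number of bit positions in which two integer hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical test images: a random grid, a brightened copy, an unrelated grid.
rng = random.Random(0)
image = [[rng.randint(0, 255) for _ in range(32)] for _ in range(32)]
brighter = [[p + 10 for p in row] for row in image]
unrelated = [[rng.randint(0, 255) for _ in range(32)] for _ in range(32)]

d_similar = hamming(phash(image), phash(brighter))
d_unrelated = hamming(phash(image), phash(unrelated))
print(d_similar, d_unrelated)
```

A uniform brightness change only affects the lowest-frequency coefficient, so the brightened copy should hash almost identically, while the unrelated grid should differ in many bits.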
Creating clusters of images/memes
Pairwise comparisons and clustering
• Calculated the pairwise distance between all
pHashes from /pol/, The_Donald, and Gab
• Distance metric: Hamming distance
• Used TensorFlow and GPUs to speed up the
process
• Performed clustering using the DBSCAN algorithm
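The two steps above can be sketched in a few lines. This is a small pure-Python illustration, not the TensorFlow/GPU implementation mentioned in the talk; the `eps` and `min_samples` values and the toy hashes are assumptions chosen only to make the example readable.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def pairwise_distances(hashes):
    """Full matrix of Hamming distances (quadratic; fine for a toy example)."""
    return [[hamming(a, b) for b in hashes] for a in hashes]


def dbscan(dist, eps, min_samples):
    """DBSCAN over a precomputed distance matrix.

    Returns one label per point: 0, 1, ... for clusters and -1 for noise.
    A point counts itself among its neighbors.
    """
    n = len(dist)
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Hypothetical hashes: two groups of near-duplicates and one outlier.
hashes = [0b0000, 0b0001, 0b0011, 0xFF00, 0xFF01, 0xFF03, 0xF0F0F0]
labels = dbscan(pairwise_distances(hashes), eps=2, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Images left with label -1 belong to no cluster, which suits this setting because many posted images are one-off pictures rather than variants of a meme.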
Example clusters
• Nut Button Meme
• Goofy’s Time Meme
Annotating clusters
• Calculated medoid of each cluster
• “Representative” image in cluster
• Compared all medoids with all KYM images
• We have a hit if the Hamming distance is at most a
predefined threshold
• Assign the representative label according to:
• Number of hits
• Average distance between all hits
• Performed small-scale evaluation of
annotations
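A minimal sketch of the annotation step, under stated assumptions: the threshold value, the entry names, and the toy hashes are hypothetical, and the way the two criteria are combined (most hits first, lower average distance as a tie-breaker) is one plausible reading of the bullets above rather than the confirmed rule.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def medoid(hashes):
    """The cluster member with the smallest total distance to all members."""
    return min(hashes, key=lambda h: sum(hamming(h, other) for other in hashes))


def annotate(cluster_hashes, kym_hashes, threshold):
    """Pick a KYM entry for a cluster, or None if nothing matches.

    kym_hashes maps an entry name to the hashes of its gallery images.
    """
    representative = medoid(cluster_hashes)
    hits = {}
    for entry, gallery in kym_hashes.items():
        distances = [hamming(representative, h) for h in gallery]
        close = [d for d in distances if d <= threshold]
        if close:
            hits[entry] = close
    if not hits:
        return None
    # Assumed ranking: more hits wins; ties go to the lower average distance.
    return max(hits, key=lambda e: (len(hits[e]), -sum(hits[e]) / len(hits[e])))


# Hypothetical cluster and KYM galleries.
cluster = [0b0000, 0b0001, 0b0011]
kym = {
    "entry_a": [0b0001, 0b0101],
    "entry_b": [0b0111],
    "entry_c": [0xFFFF],
}
label = annotate(cluster, kym, threshold=2)
print(medoid(cluster), label)  # 1 entry_a
```

Comparing only the medoid against the KYM images keeps the number of comparisons proportional to the number of clusters instead of the number of images.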
Finding all memes and analyzing final dataset
Top memes per Web community
Studying specific groups of memes
• Focus on racist and political memes
• Use KYM tags to find relevant memes
• “politics,” “2016 us presidential election,” “trump,” and
“clinton” tags
• “racism,” “racist,” or “antisemitism” tags
• Obtained 117 racist memes and 556 political memes
from the KYM dataset
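The tag-based selection can be sketched as a simple set intersection. The tag lists come from the slide; the entry names and their tags are hypothetical, and treating an entry as relevant when it carries at least one listed tag is an assumption.

```python
POLITICAL_TAGS = {"politics", "2016 us presidential election", "trump", "clinton"}
RACIST_TAGS = {"racism", "racist", "antisemitism"}


def select_entries(entries, wanted_tags):
    """Names of entries carrying at least one wanted tag (case-insensitive)."""
    return sorted(name for name, tags in entries.items()
                  if wanted_tags & {t.lower() for t in tags})


# Hypothetical KYM entries mapped to their descriptive tags.
entries = {
    "example-entry-1": ["Politics", "trump"],
    "example-entry-2": ["racist"],
    "example-entry-3": ["cats"],
}
political = select_entries(entries, POLITICAL_TAGS)
racist = select_entries(entries, RACIST_TAGS)
print(political, racist)  # ['example-entry-1'] ['example-entry-2']
```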
How are memes shared over time?
[Figure: two panels, Political Memes and Racist Memes. Annotations on the plots: 2nd US presidential debate; 2016 US elections; Gab activity increase in 2017; /pol/ constant share.]
How to quantify the influence?
• Hawkes processes
• Assume K processes
• Each with a rate of events (i.e., posting of a meme),
called the background rate
• An event can cause impulse responses in other
processes
• Increases the rates of other processes for a period of
time
• Enables us to assess the root cause of events
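As a rough illustration of the bullets above, the rate of one process at time t can be written as its background rate plus a decaying contribution from every earlier event. The exponential kernel, the `decay` parameter, and all numbers below are assumptions made for the sketch; the code only evaluates a rate and does not fit a model.

```python
import math


def intensity(target, t, events, background, weights, decay=1.0):
    """Event rate of process `target` at time t.

    events: list of (time, source_process) pairs.
    weights[src][dst]: expected number of extra events on dst caused by
    one event on src (the strength of the impulse response).
    """
    rate = background[target]
    for event_time, source in events:
        if event_time < t:
            elapsed = t - event_time
            # Assumed kernel: exponential decay that integrates to 1.
            rate += weights[source][target] * decay * math.exp(-decay * elapsed)
    return rate


# Hypothetical three-process setup: an event on A raises the rate of B only.
background = {"A": 0.1, "B": 0.2, "C": 0.05}
weights = {
    "A": {"A": 0.0, "B": 0.5, "C": 0.0},
    "B": {"A": 0.0, "B": 0.0, "C": 0.0},
    "C": {"A": 0.0, "B": 0.0, "C": 0.0},
}
events = [(1.0, "A")]

rate_b_before = intensity("B", 0.5, events, background, weights)
rate_b_after = intensity("B", 2.0, events, background, weights)
rate_c_after = intensity("C", 2.0, events, background, weights)
print(rate_b_before, rate_b_after, rate_c_after)
```

Because each event's contribution to a rate is explicit, an event on B shortly after the event on A can be attributed partly to B's background rate and partly to A, which is what makes root-cause assessment possible.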
Hawkes processes example
[Figure: timelines for three processes A, B, and C, each with its own background rate (Background Rate A, Background Rate B, Background Rate C), built up step by step as events 1 through 4 occur.]
For our
purposes…
• Hawkes model with 5 processes
• One for each platform/community (/pol/,
The_Donald, Reddit, Twitter, Gab)
• Distinct model for each cluster; fit each
model with Gibbs sampling
• Calculate the influence and efficiency of each
community
Communities’ influence (racist memes)
/pol/ is most
influential in terms
of spreading racist
memes
Communities’ efficiency (racist memes)
If we look at the
influence normalized
to the number of
memes posted,
The_Donald is the most
efficient in terms of
disseminating memes
Summary
• Proposed meme processing pipeline
• Code and datasets available on GitHub
(https://github.com/memespaper/memes_pipeline)
• Important differences between the memes posted on
Web communities
• Quantified influence among Web communities
Now For Some
“Fun”
• As researchers, our goal is to share
what we learn
• We write papers; sometimes
people even read them!
• Unfortunately, this can attract some
unwanted attention…
Nature did a really nice interview with Gianluca about our work.
We were pretty excited!
“Wow! We get to share our work with a general audience!!!”
From what we could
determine, this image was
produced by Daily Stormer
users…
A literal, non-satirical neo-
Nazi community
2/25/19
Email received in response to our work on
anti-Semitism
Remarks
• These types of problems are not easy for researchers
• It’s disturbing content; stressful and emotionally draining
• We are putting ourselves at personal risk of attack
• As researchers studying these fringe Web communities, we should be
prepared for this kind of personal attack
• We should regularly check in with colleagues/students working on these
communities to ensure that they do not sink into the cesspool
Acknowledgments

On the Origins of Memes by Means of Fringe Web Communities - Invited talk at Sigmetrics

  • 1. On the Origins of Memes by Means of Fringe Web Communities Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, Guillermo Suarez-Tangil
  • 2. WARNING IMAGERY IN THIS TALK IS UNCENSORED AND MIGHT BE OFFENSIVE
  • 3. Talk Outline Who we are and what is our main line of work Cross Platform Analysis of Memes Personal attacks
  • 4. Who are we? International Data- driven Research for Advanced Modeling and Analysis Lab (IDRAMA Lab)
  • 5. IDRAMA Lab Overview • Team of international researchers and academics • Various backgrounds ranging from Computer Security, Cryptography to Social Network Analysis and Physics • Geographically distributed
  • 6. IDRAMA Goal Understand and mitigate emerging socio-technological issues on the Web
  • 7. IDRAMA Approach • Large-scale data-driven approach • Analyzing billions of posts over several years • Cross-Platform • Twitter, Reddit, 4chan, Gab, etc. • Quantitative Analysis
  • 8. Online platforms do not exist in a vacuum Looking at a single platform at a time is not enough to capture online dynamics We lack tools to effectively trace how information spreads across different platforms 8
  • 9.
  • 10.
  • 17. Memes in politics Memes have become a popular, and seemingly effective, method to transmit ideology. Memes have been weaponized
  • 18. But what do we really know about memes? • How can we track meme propagation across Web communities? • Can we characterize Web communities through their memes? • Can we measure the influence of Web communities with respect to memes they share?
  • 19. Memes processing pipeline 3.Clustering 1.pH ashExtraction 2.pH ash-basedPairw ise DistanceCalculation pHashes of some or all Web communities' images Clusters of images 5.ClusterAnnotation Pairwise Comparisons of pHashes annotated images 6.Associationof Images toClusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7.Analysis and Influ e nceEs timation Occurrences of Memes in all Web Communities 4.Screenshot Classifie r annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 20. Let’s see our data sources… 3.Clustering 1.pH ashExtraction 2.pH ash-basedPairw ise DistanceCalculation pHashes of some or all Web communities' images Clusters of images 5.ClusterAnnotation Pairwise Comparisons of pHashes annotated images 6.Associationof Images toClusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7.Analysis and Influ e nceEs timation Occurrences of Memes in all Web Communities 4.Screenshot Classifie r annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 21. Know Your Meme (KYM) • Crowdsourced encyclopedia for Memes • Provides useful metadata • E.g., origin, descriptive tags, description, examples, image galleries • Built custom crawler • Obtained data for 15K KYM entries • Download every image per entry (706K)
  • 22. Datasets # of posts 1.4B 1.0B 48M 12M 15K # of posts with images 242M 62M 13M 955K 15K # of Images 114M 40M 4M 235K 706K
  • 23. Perceptual hashing extraction 3.Clustering 1.pH ashExtraction 2.pH ash-basedPairw ise DistanceCalculation pHashes of some or all Web communities' images Clusters of images 5.ClusterAnnotation Pairwise Comparisons of pHashes annotated images 6.Associationof Images toClusters Annotated Clusters pHashes of  annotated images pHashes (all Web Communities) 7.Analysis and Influ e nceEs timation Occurrences of Memes in all Web Communities 4.Screenshot Classifie r annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities 4chan Twitter Reddit Gab Web Communities posting Memes images
  • 24. Perceptual hashing (pHash) • Generates a hash for each image • Visually similar images have minor differences in their hashes • Reduces dimensionality of the images • Run the pHash algorithm for • All images from KYM (706K) • All images from Twitter, Reddit, /pol/, and Gab (159.5M)
  • 25. Creating clusters of images/memes (pipeline diagram; steps 2 and 3: pHash-based pairwise distance calculation and clustering)
  • 26. Pairwise comparisons and clustering • Calculated all pairwise comparisons between all pHashes from /pol/, The_Donald, and Gab • Used TensorFlow and GPUs to speed up the process • Hamming distance • Performed clustering using the DBSCAN algorithm
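The two steps above can be sketched on a handful of toy hashes: a Hamming distance between integer hashes, followed by a small brute-force DBSCAN. This is a CPU-only illustration; the 8-bit hashes and the `eps` and `min_samples` values are made up for the example and are not the parameters used in the talk.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def dbscan(hashes, eps, min_samples):
    """Minimal DBSCAN over integer hashes using Hamming distance.

    Returns one label per hash: 0, 1, ... for clusters and -1 for noise.
    """
    n = len(hashes)
    # Brute-force neighbor lists; each point counts as its own neighbor.
    neighbors = [[j for j in range(n) if hamming(hashes[i], hashes[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point previously marked as noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Toy 8-bit "hashes": two tight groups and one outlier.
toy_hashes = [0b00000000, 0b00000001, 0b00000011,
              0b11110000, 0b11110001, 0b11100000,
              0b10101010]
labels = dbscan(toy_hashes, eps=2, min_samples=3)
print(labels)
```

At the scale in the talk, the brute-force neighbor search is the expensive part, which is why the pairwise distances were computed on GPUs.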
  • 27. Example clusters Nut Button Meme Goofy’s Time Meme
  • 28. Annotating clusters (pipeline diagram; step 5: cluster annotation)
  • 29. Annotating clusters • Calculated medoid of each cluster • “Representative” image in cluster • Compared all medoids with all KYM images • We have a hit if the Hamming distance is <= a predefined threshold • Assign the representative label according to: • Number of hits • Average distance between all hits • Performed small-scale evaluation of annotations
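A rough sketch of the annotation step, under stated assumptions: the medoid is the cluster member with the smallest total Hamming distance to the others, and the label goes to the KYM entry with the most hits, with ties broken by the lower average distance. The slide lists both criteria but not how they are combined, so that ordering is an assumption, as are the function names and toy entries.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def medoid(hashes):
    """Cluster member with the smallest total distance to all members."""
    return min(hashes, key=lambda h: sum(hamming(h, other) for other in hashes))


def annotate(cluster_hashes, kym_hashes, threshold):
    """Pick a label for a cluster, or None if no KYM image is close enough.

    `kym_hashes` maps an entry name to the hashes of its gallery images
    (a hypothetical layout chosen for this sketch).
    """
    m = medoid(cluster_hashes)
    hits = {}
    for entry, hashes in kym_hashes.items():
        close = [d for d in (hamming(m, h) for h in hashes) if d <= threshold]
        if close:
            hits[entry] = close
    if not hits:
        return None
    # Assumed rule: most hits first, then lowest average distance.
    return min(hits, key=lambda e: (-len(hits[e]), sum(hits[e]) / len(hits[e])))


# Toy data: entry names and hashes are made up.
cluster = [0b0000, 0b0001, 0b0011]
kym = {
    "entry_a": [0b0001, 0b0101],
    "entry_b": [0b0000],
    "entry_c": [0b11110000],
}
print(medoid(cluster), annotate(cluster, kym, threshold=1))
```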
  • 30. Finding all memes and analyzing final dataset (pipeline diagram; steps 6 and 7: association of images to clusters, then analysis and influence estimation)
  • 31. Top memes per Web community
  • 32. Studying specific groups of memes • Focus on racist and political memes • Use KYM tags to find relevant memes • “politics,” “2016 us presidential election,” “trump,” and “clinton” tags • “racism,” “racist,” or “antisemitism” tags • Obtain 117 racist memes and 556 political memes from KYM dataset
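Selecting meme groups by tag could look roughly like this. The tag sets come from the slide; the entry dictionary, the function name, and the choice to match an entry when it carries any one of the tags are assumptions made for illustration.

```python
POLITICAL_TAGS = {"politics", "2016 us presidential election", "trump", "clinton"}
RACIST_TAGS = {"racism", "racist", "antisemitism"}


def select_entries(entries, wanted_tags):
    """Names of entries that carry at least one wanted tag (case-insensitive)."""
    return [name for name, tags in entries.items()
            if {t.lower() for t in tags} & wanted_tags]


# Hypothetical KYM-style entries: names and tags are made up.
entries = {
    "entry_1": ["Politics", "debate"],
    "entry_2": ["cats", "reaction"],
    "entry_3": ["racism"],
}
print(select_entries(entries, POLITICAL_TAGS))
print(select_entries(entries, RACIST_TAGS))
```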
  • 33. How are memes shared over time? Political Memes Racist Memes
  • 34. How are memes shared over time? Political Memes Racist Memes 2nd US presidential debate
  • 35. How are memes shared over time? Political Memes Racist Memes 2016 US elections 2nd US presidential debate
  • 36. How are memes shared over time? Political Memes Racist Memes 2016 US elections Gab activity increase 2017 2nd US presidential debate
  • 37. How are memes shared over time? Political Memes Racist Memes 2016 US elections Gab activity increase 2017 /pol/ constant share 2nd US presidential debate
  • 38. How are memes shared over time? Political Memes Racist Memes 2016 US elections Gab activity increase 2017 /pol/ constant share Gab activity increase in 2017 2nd US presidential debate
  • 39. How to quantify the influence? • Hawkes processes • Assume K processes • Each with a rate of events (i.e., posting of a meme), called the background rate • An event can cause impulse responses in other processes • Increases the rates of other processes for a period of time • Enables us to assess the root cause of events
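To make the background rate and impulse response concrete, here is a small sketch of a Hawkes intensity function. It assumes a continuous-time model with an exponential kernel. That is a common textbook choice, not necessarily the formulation fitted in the talk. All names and numbers are illustrative.

```python
import math


def intensity(k, t, events, background, weights, decay=1.0):
    """Event rate of process `k` at time `t`.

    events      list of (time, process) pairs observed so far
    background  background[k] is the baseline rate of process k
    weights     weights[j][k] is the expected number of extra events on k
                triggered by a single event on j (missing pairs count as 0)
    decay       speed at which an impulse fades (assumed exponential kernel)
    """
    rate = background[k]
    for t_i, j in events:
        if t_i < t:
            w = weights.get(j, {}).get(k, 0.0)
            rate += w * decay * math.exp(-decay * (t - t_i))
    return rate


# Three processes, as in the A/B/C example; the numbers are made up.
background = {"A": 0.1, "B": 0.2, "C": 0.3}
weights = {"A": {"B": 0.5}}   # events on A temporarily raise the rate of B
events = [(1.0, "A")]

print(intensity("B", 0.5, events, background, weights))  # before the event on A
print(intensity("B", 2.0, events, background, weights))  # raised by the impulse
print(intensity("B", 5.0, events, background, weights))  # impulse has mostly faded
```

An event on B shortly after the event on A is then more likely to be attributed to A than to B's own background rate, which is how root causes are assessed.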
  • 40–47. Hawkes processes example (figure shown in successive builds: three processes A, B, and C, each with its own background rate, and events 1 through 4)
  • 48. For our purposes… • Hawkes model with 5 processes • One for each platform/community (/pol/, The_Donald, Reddit, Twitter, Gab) • Distinct model for each cluster; fit each model with Gibbs sampling • Calculate the influence and efficiency of each community
  • 49. Communities’ influence (racist memes) /pol/ is most influential in terms of spreading racist memes
  • 50. Communities’ efficiency (racist memes) If we look at the influence normalized to the number of memes posted, The_Donald is the most efficient in terms of disseminating memes
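The difference between influence and efficiency can be sketched with toy counts. The definitions below are one plausible reading of the slides: influence as the share of events on a destination attributed to a source, and efficiency as the same attributed count normalized by how many events the source itself posted. The community names and all numbers are made up.

```python
def influence_pct(caused, totals, src, dst):
    """Percent of events on `dst` that are attributed to `src`."""
    return 100.0 * caused[src][dst] / totals[dst]


def efficiency_pct(caused, totals, src, dst):
    """Events on `dst` attributed to `src`, per 100 events posted on `src`."""
    return 100.0 * caused[src][dst] / totals[src]


# Hypothetical counts: a large community and a small one.
totals = {"community_x": 1000, "community_y": 100}
caused = {
    "community_x": {"community_y": 20},   # x caused 20 of y's events
    "community_y": {"community_x": 50},   # y caused 50 of x's events
}

print(influence_pct(caused, totals, "community_x", "community_y"))
print(influence_pct(caused, totals, "community_y", "community_x"))
print(efficiency_pct(caused, totals, "community_x", "community_y"))
print(efficiency_pct(caused, totals, "community_y", "community_x"))
```

In this toy setup the larger community has the higher raw influence, while the smaller one is far more efficient per meme posted, the same kind of contrast the two slides draw between /pol/ and The_Donald.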
  • 51. Summary • Proposed meme processing pipeline • Code and datasets available on GitHub (https://github.com/memespaper/memes_pipeline) • Important differences between the memes posted on Web communities • Quantified influence among Web communities
  • 52. Now For Some “Fun” • As researchers, our goal is to share what we learn • We write papers; sometimes people even read them! • Unfortunately, this can attract some unwanted attention…
  • 54. Nature did a really nice interview with Gianluca about our work. We were pretty excited! “Wow! We get to share our work with a general audience!!!”
  • 58. From what we could determine, this image was produced by Daily Stormer users… A literal, non-satirical neo-Nazi community
  • 61. 2/25/19 Email received in response to our work on anti-Semitism
  • 62. Remarks • These types of problems are not easy for researchers • It’s disturbing content; stressful and emotionally draining • We are putting ourselves at personal risk of attack • As researchers studying these fringe Web communities, we should be prepared for this kind of personal attack • We should regularly check in with colleagues/students working on these communities to ensure that they do not sink into the cesspool

Editor's Notes

  1. .
  2. Say “Jewish,” not “Jew”