Social Influence & Homophily
Nitish Upreti
nzu100@cse.psu.edu
OUTLINE
• Introduction and Review
• Motivation
• Related Work
• Problem Definition
• Statistics Background
• Methodology
• Where to go from here?
• Summary
PROBLEM DEFINITION
“Identifying and measuring
individual Homophily and Social
Influence effects on a dataset.”
Quick Review
• Social Influence: Our friendships and behavior
are affected by Social Influence (the pressure to
conform to our neighbors’ values).
• Selection: We have a tendency to be friends
with people who are like us.
• Homophily: A widely observed social
phenomenon which states that “we tend to be
similar to our friends”.
Quick Note before we start…
We will refer to Selection as Homophily
(Reason: Authors assume that if Homophily
effects are present, we tend to select individuals
with similar values)
MOTIVATION
Selection Vs social influence: Why do
we care?
• If Social Influence is a significant factor, then
targeting key individuals and trying to modify
undesirable behavior can be effective since we
are then viewing such behavior as a process of
influence spread.
• Otherwise, focusing on a few individuals will
at best change the behavior of a few
individuals.
REAL WORLD SCENARIO
• A firm selling products to consumers in a
social network.
• The firm knows that friends in the network
often make similar purchases.
• What is the reason behind this similarity?
• Is it because they have similar tastes, since,
after all, they are friends?
• Is it because one influences the other’s
decision, as they communicate frequently?
Credits: (Homophily or Influence? – Analysis of Purchase
Decisions in a Social Network Context Liye Ma, Alan Montgomery and Ramayya Krishnan )
How can the firm take advantage?
• If it is the taste similarity that drives the
similar decisions, the firm should directly
target friends of that customer by offering
discounts to them.
• If it is social influence that drives the
similarity, the firm should incentivize that
customer to promote the product or service to
her friends.
Credits: (Homophily or Influence? – Analysis of Purchase
Decisions in a Social Network Context Liye Ma, Alan Montgomery and Ramayya Krishnan )
SELF ANALYSIS
A Real World Problem worth Solving.
EXISTING WORK
• A lot of research has gone into understanding
“Homophily” and “Social Influence” in social
networks.
• We will quickly mention studies which involve
direct analysis of “Identifying and measuring
Homophily and social influence effects”.
• This problem area is one of the biggest
open-ended challenges for social scientists
(and will make a good class project as well :D).
SURVEY OF RELATED WORK
RELATED WORK - 1
• “Homophily or Influence? – Analysis of
Purchase Decisions in a Social Network
Context”
http://people.stern.nyu.edu/bakos/wise/papers/wise2009-5b2_paper.pdf
QUICK LOOK AT THE STUDY
• Phone call history dataset (3.7 Million) from
an Indian Telecom company over a 6 month
period for purchase records of monthly Caller
Ring Back Tones (CRBT) subscription.
• Social Influence & Homophily are studied.
• The study builds a “Hierarchical Bayesian model”
which simultaneously accounts for both
Homophily and social influence effects in
consumers’ decision process.
RELATED WORK - 2
• “Social selection and peer influence in an
online social network.”
http://www.irle.berkeley.edu/culture/conf2012/lewis_soc12.pdf
QUICK LOOK AT THE STUDY
• Employs Facebook activity of college students.
• Coevolution of friendship and tastes in music,
movies and books over a 4 year time period is
analyzed.
• “Stochastic actor-based” modeling is
employed to analyze the individual effects of
Social Influence & Homophily.
RELATED WORK - 3
• “Distinguishing influence-based contagion
from Homophily driven diffusion in dynamic
networks.”
http://www.pnas.org/content106/51/21544.full.pdf
QUICK LOOK AT THE STUDY
• Uses a longitudinal dataset that combines the
global network of daily instant messaging (IM)
traffic among 27.4 million users of Yahoo with
day-by-day adoption of a mobile service
application (Yahoo! Go).
• A framework based on “matched sample
estimation” is developed to distinguish
influence from Homophily.
ANALYSIS OF EXISTING APPROACHES
• Empirical Investigations
(Focuses on demonstrating the presence of
Homophily and Influence in real world data sets)
• Significance Tests for Relational and Social
network data
(Focuses mostly on static networks)
• Modeling Techniques for distinguishing
Homophily & Influence
(Accuracy is impacted by suitability of model)
TODAY’S FOCUS
“Randomization Tests for Distinguishing Social
Influence and Homophily Effects.”
https://www.cs.purdue.edu/homes/neville/papers/lafond-neville-www2010.pdf
INTRODUCTION
• In a social network, connected instances are
likely to have autocorrelated attribute values.
• “Two friends are more likely to share a
common political belief than two random
strangers.”
• The paper presents a randomization technique
for temporal network data that measures the
individual contributions of Homophily and
Social Influence (details coming soon!).
THE EXPERIMENT / SUPPORT
• A subset of data from a Facebook group at
Purdue.
• Time step from 2008 (t) to 2009 (t+1).
• Hypothesis tested on:
1. Semi-synthetic data with no Homophily & Social Influence.
2. Semi-synthetic data with a strong Homophily or Influence
effect.
3. Actual experiment on the real dataset.
• The approach was shown to be effective under
all three conditions.
PROBLEM DEFINITION
• Relational data is represented as an undirected,
attributed graph G = (V, E).
• Each node v ∈ V has a number of
attributes (X1, …, Xm).
• From time step t to t+1, both the attributes and
the relationships can change.
• Significant Influence: Attributes at t+1 depend
on the link structure at t.
• Significant Homophily: Link structure at t+1
depends on the attributes at t.
(Keep them in mind! We will come back to them)
BACKGROUND
• In statistics, an association is a relationship
between two statistically dependent quantities.
• ‘Relational Autocorrelation’: Statistical
dependency between values of the same
variable on related objects. (Abundant in our
dataset. Why?)
• In this work we use the Chi-Square statistic.
STATISTICS 101
CHI-SQUARE STATISTICS
• How likely is an observed distribution due to
chance?
• Observe 100 students to see “whether attending
class influences how students perform on exam?”
• Four categories:
– Students who attend class and pass.
– Students who attend class and do not pass.
– Students who do not attend class and pass.
– Students who do not attend class and do not pass.
• Null Hypothesis : There is no difference based on
attending classes.
CHI-SQUARE Continued….
• The test compares the observed data to a model that
distributes the data according to the expectation that
the variables are independent. The more the observed
data deviate from the model, the stronger the evidence
that the variables are dependent, and the stronger the
case for rejecting the null hypothesis.
• Degrees of freedom: The number of values in the final
calculation that are free to vary.
• Calculate the Chi Square value. (How?)
• Calculate the more interesting ‘p’ value (the
probability of seeing a Chi Square value at least this
large if the null hypothesis is true).
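(How?) A minimal sketch of the calculation for the class-attendance example. The four counts below are made-up numbers for illustration (the slides give none), the function name is hypothetical, and no continuity correction is applied. For a 2x2 table there is 1 degree of freedom, so the p value can be computed with the complementary error function.

```python
import math

def chi_square_2x2(table):
    """Return (chi2, p_value) for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square >= x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts for 100 students.
# Rows: attend / do not attend. Columns: pass / do not pass.
observed_counts = [[50, 10],
                   [20, 20]]

chi2, p = chi_square_2x2(observed_counts)
print(f"chi-square = {chi2:.3f}, p = {p:.5f}")
```

A small p value means counts like these would be unlikely if attendance and passing were independent, so we would reject the null hypothesis.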
Calculating Relational Autocorrelation
CORRELATION GAIN
gain(t, t+1) = C(X_{t+1}, G_{t+1}) - C(X_t, G_t)
(The gain could be due to Homophily or Social Influence)
HOMOPHILY Continued…
If a Homophily effect is present in the data, the
autocorrelation will increase when we consider
the link changes from time t to time t+1:
C(X_t, G_{t+1}) - C(X_t, G_t)
(The Chi-Square value is a single number that adds up all the
differences between our actual data and the data expected.)
SOCIAL INFLUENCE Continued…
If an influence effect is present in the data, the
autocorrelation will increase when we consider
the attribute changes from time t to time t+1:
C(X_{t+1}, G_t) - C(X_t, G_t)
(The Chi-Square value is a single number that adds up all the
differences between our actual data and the data expected.)
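The three differences can be sketched on a toy network. This is an illustration under assumptions, not the authors' code: C(X, G) is taken here to be the Chi-Square score of the table of attribute-value pairs found across edges, and the four-node graph, the attribute values and all function names are made up.

```python
from collections import Counter

def autocorrelation(attrs, edges):
    """Chi-square score of attribute-value pairs across linked nodes."""
    pairs = Counter()
    for u, v in edges:
        # Count each undirected edge in both directions so the table is symmetric.
        pairs[(attrs[u], attrs[v])] += 1
        pairs[(attrs[v], attrs[u])] += 1
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    marginal = Counter()
    for (a, _b), n in pairs.items():
        marginal[a] += n
    chi2 = 0.0
    for a in marginal:
        for b in marginal:
            expected = marginal[a] * marginal[b] / total
            chi2 += (pairs[(a, b)] - expected) ** 2 / expected
    return chi2

# Hypothetical snapshot at time t and at time t+1.
attrs_t = {1: "a", 2: "a", 3: "b", 4: "b"}
edges_t = [(1, 2), (1, 3), (2, 4), (3, 4)]
attrs_t1 = {1: "a", 2: "a", 3: "a", 4: "b"}   # node 3 changed its value
edges_t1 = [(1, 2), (1, 3), (3, 4)]           # edge (2, 4) was dropped

base = autocorrelation(attrs_t, edges_t)
total_gain = autocorrelation(attrs_t1, edges_t1) - base
homophily_gain = autocorrelation(attrs_t, edges_t1) - base   # links change only
influence_gain = autocorrelation(attrs_t1, edges_t) - base   # attributes change only

print(total_gain, homophily_gain, influence_gain)
```

Note that a Chi-Square score measures the strength of the dependence, not its direction, so this sketch alone cannot tell "friends are similar" apart from "friends are dissimilar".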
METHODOLOGY
(Randomization Tests)
RANDOMIZATION TESTS
• Provide a robust statistical technique for
hypothesis testing.
• Generate several pseudosamples (permutations
of the original data set).
• Correlation gain is calculated for each
pseudosample.
• The observed gain is then compared to the
distribution of pseudosample scores.
• An observed gain that lies far out in the tail of
this distribution is deemed significant.
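The steps above can be sketched generically. This is a simplified illustration, not the procedure from the paper: the score is the fraction of edges joining same-valued nodes (standing in for correlation gain), the pseudosamples are plain shuffles of the attribute values over the nodes, and the graph and all names are made up.

```python
import random

def same_value_fraction(attrs, edges):
    """Fraction of edges whose two endpoints share the same attribute value."""
    same = sum(1 for u, v in edges if attrs[u] == attrs[v])
    return same / len(edges)

def randomization_test(attrs, edges, score, n_samples=999, seed=0):
    """Return (observed score, empirical p value) under a label-shuffling null."""
    rng = random.Random(seed)
    observed = score(attrs, edges)
    nodes = list(attrs)
    values = [attrs[n] for n in nodes]
    at_least_as_large = 0
    for _ in range(n_samples):
        rng.shuffle(values)                      # one pseudosample
        pseudo_attrs = dict(zip(nodes, values))
        if score(pseudo_attrs, edges) >= observed:
            at_least_as_large += 1
    # The +1 terms count the observed data as one of the samples.
    return observed, (at_least_as_large + 1) / (n_samples + 1)

# Hypothetical network: two groups of 10 nodes, links only inside each group.
group_a, group_b = range(0, 10), range(10, 20)
attrs = {n: "a" for n in group_a}
attrs.update({n: "b" for n in group_b})
edges = [(u, v) for g in (group_a, group_b) for u in g for v in g if u < v]

observed, p_value = randomization_test(attrs, edges, same_value_fraction)
print(observed, p_value)
```

The p value is the share of pseudosamples that score at least as high as the real data, so a small value means the observed score is unlikely under the null hypothesis.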
ANALYSIS OF KEY ISSUES AND ASSUMPTIONS
(For Randomization Tests)
• Formulate an appropriate NULL Hypothesis.
• The data must be permuted in a way that
accurately reflects the null hypothesis.
SELF ANALYSIS
The approach is relevant and appropriate, as it
makes no assumptions about the underlying
model.
Also, both the attribute values and the links change
over time, which allows assessing both
Influence and Homophily.
NULL HYPOTHESIS
• H0H : Link changes are random and are not due
to attribute values in t.
• H0I : Attribute changes are random and are not
due to friends in t.
• H0F : Both attribute and link changes are
random.
POSSIBLE PERMUTATIONS
CHOICE BASED RANDOMIZATION
• For H0H we can maintain the edge addition in t+1
but randomize the choice of target node so that
each node has the same number of additions and
deletions.
• For H0I we can randomize the choice of attribute
value to replace in t+1, so that any similarity of
the value is destroyed.
• This is popularly referred to as “choice-based”
randomization, as we are randomizing the result
of choices (attribute/link changes).
CALCULATING CHOICE BASED RANDOMIZATION
• Non Trivial Problem.
• A greedy assignment is involved.
• Collect all the changes (edges & attributes).
• Sort the nodes and attributes from those with the
least number of random options to those with the
most options.
• This prevents violating the underlying NULL
hypothesis.
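A rough sketch of the greedy idea for the attribute side (H0I) only. It is a simplification under assumptions, not the authors' algorithm: the new values observed at t+1 are collected, the changed nodes are sorted from fewest to most admissible options, and each node draws a random value from the pool that differs from its value at t. The data and all names are made up.

```python
import random

def randomize_attribute_changes(attrs_t, attrs_t1, rng, max_tries=100):
    """Reassign the observed new values at random among the nodes that changed."""
    changed = [n for n in attrs_t if attrs_t1[n] != attrs_t[n]]
    new_values = [attrs_t1[n] for n in changed]

    def n_options(node):
        # A node may receive any pooled value except its own value at time t.
        return sum(1 for v in new_values if v != attrs_t[node])

    order = sorted(changed, key=n_options)       # most constrained nodes first
    for _ in range(max_tries):
        pool = list(new_values)
        result = dict(attrs_t1)                  # unchanged nodes stay as they are
        for node in order:
            choices = [i for i, v in enumerate(pool) if v != attrs_t[node]]
            if not choices:
                break                            # dead end, start over
            result[node] = pool.pop(rng.choice(choices))
        else:
            return result
    raise RuntimeError("no valid assignment found")

# Hypothetical attribute values at t and t+1.
attrs_t = {1: "a", 2: "a", 3: "b", 4: "b", 5: "c", 6: "a"}
attrs_t1 = {1: "b", 2: "c", 3: "a", 4: "b", 5: "a", 6: "a"}

randomized = randomize_attribute_changes(attrs_t, attrs_t1, random.Random(1))
print(randomized)
```

Every changed node still changes, and the same number of nodes take each new value overall, but which node receives which value no longer depends on its friends.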
SELF ANALYSIS
Where to go from here?
• Changing the granularity of time step to
investigate deeper.
• Investigating why certain groups had more of
Homophily or Social Influence?
• Apart from friendship, considering other
influential effects.
SUMMARY
• Successfully employed a randomization technique
for distinguishing Homophily and Social Influence.
• Tested the hypotheses on semi-synthetic and
real-world data sets.
• The strength of Influence and Homophily varied
from group to group, depending on group
properties.
PERSONAL TAKEAWAY

Take a Statistics Class !
THANK YOU!

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Social Influence & Homophily

  • 1. Social Influence & Homophily Nitish Upreti nzu100@cse.psu.edu
  • 2. OUTLINE • Introduction and Review • Motivation • Related Work • Problem Definition • Statistics Background • Methodology • Where to go from here? • Summary
  • 3. PROBLEM DEFINITION “Identifying and measuring individual Homophily and Social Influence effects on a dataset.”
  • 4. Quick Review • Social Influence: Our friendships and behavior are affected by Social Influence (we tend to conform to our neighbors' values). • Selection: We have a tendency to be friends with people who are like us. • Homophily: A widely observed social phenomenon which states that “we tend to be similar to our friends”.
  • 5. Quick Note before we start… We will refer to Selection as Homophily (Reason: Authors assume that if Homophily effects are present, we tend to select individuals with similar values)
  • 7. Selection Vs social influence: Why do we care? • If Social Influence is a significant factor, then targeting key individuals and trying to modify undesirable behavior can be effective since we are then viewing such behavior as a process of influence spread. • Otherwise, focusing on a few individuals will at best change the behavior of a few individuals.
  • 8. REAL WORLD SCENARIO • A firm selling products to consumers in a social network. • The firm knows that friends in the network often make similar purchases. • What is the reason behind this similarity? • Is it because they have similar tastes, since, after all, they are friends? • Is it because one influences the other’s decision, as they communicate frequently? Credits: (Homophily or Influence? – Analysis of Purchase Decisions in a Social Network Context Liye Ma, Alan Montgomery and Ramayya Krishnan )
  • 9. How can the firm take advantage? • If it is taste similarity that drives the similar decisions, the firm should directly target friends of that customer by offering discounts to them. • If it is social influence that drives the similarity, the firm should incentivize that customer to promote the product or service to her friends. Credits: (Homophily or Influence? – Analysis of Purchase Decisions in a Social Network Context Liye Ma, Alan Montgomery and Ramayya Krishnan )
  • 10. SELF ANALYSIS A Real World Problem worth Solving.
  • 11. EXISTING WORK • A lot of research has gone into understanding “Homophily” and “Social Influence” in social networks. • Quickly mention studies which involve direct analysis of “Identifying and measuring Homophily and social influence effects”. • This problem area serves as one of the biggest open ended challenges to Social Scientists. ( will make a good class project as well :D )
  • 13. RELATED WORK - 1 • “Homophily or Influence? – Analysis of Purchase Decisions in a Social Network Context” http://people.stern.nyu.edu/bakos/wise/papers/wise2009-5b2_paper.pdf
  • 14. QUICK LOOK AT THE STUDY • Phone call history dataset (3.7 Million) from an Indian Telecom company over a 6 month period for purchase records of monthly Caller Ring Back Tones (CRBT) subscription. • Social Influence & Homophily is studied. • Study builds a “Hierarchical Bayesian model” which simultaneously accounts for both Homophily and social influence effect in consumers’ decision process.
  • 15. RELATED WORK - 2 • “Social selection and peer influence in an online social network.” http://www.irle.berkeley.edu/culture/conf2012/lewis_soc12.pdf
  • 16. QUICK LOOK AT THE STUDY • Employs Facebook activity of college students. • The coevolution of friendship and tastes in music, movies and books over a 4 year time period is analyzed. • A “stochastic actor-based” model is employed to analyze the individual effects of Social Influence & Homophily.
  • 17. RELATED WORK - 3 • “Distinguishing influence-based contagion from Homophily driven diffusion in dynamic networks.” http://www.pnas.org/content106/51/21544.full.pdf
  • 18. QUICK LOOK AT THE STUDY • Studies a longitudinal dataset that combines the global network of daily instant messaging (IM) traffic among 27.4 million users of Yahoo with day-by-day adoption of a mobile service application (Yahoo! Go). • A framework based on “matched sample estimation” is developed to distinguish influence from Homophily.
  • 19. ANALYSIS OF EXISTING APPROACHES • Empirical Investigations (Focus on demonstrating the presence of Homophily and Influence in real world data sets) • Significance Tests for Relational and Social network data (Focus mostly on static networks) • Modeling Techniques for distinguishing Homophily & Influence (Accuracy is impacted by suitability of model)
  • 20. TODAY’S FOCUS “Randomization Tests for Distinguishing Social Influence and Homophily Effects.” https://www.cs.purdue.edu/homes/neville/papers/lafond-neville-www2010.pdf
  • 21. INTRODUCTION • In a social network, connected instances are likely to have autocorrelated attribute values. • “Two friends are more likely to share a common political belief than two random strangers.” • The paper presents a randomization technique for temporal network data that measures the individual contributions of Homophily and Social Influence (details coming soon!).
  • 22. THE EXPERIMENT / SUPPORT • A subset of data from a Facebook group at Purdue. • Time step from 2008 (t) to 2009 (t+1). • Hypothesis tested on: 1. Semi-synthetic data with no Homophily & Social Influence. 2. Semi-synthetic data with a strong Homophily or Influence effect. 3. Actual experiment on the real dataset. • Efficacy of the approach was demonstrated for all conditions.
  • 23. PROBLEM DEFINITION • Relational data is represented as an undirected, attributed graph G=(V,E). • Each node v in V has a number of attributes (X1, …, Xm). • From time step t to time step t+1, both the attributes and the relationships can change. • Significant Influence: Attributes in t+1 depend on link structure at t. • Significant Homophily: Link structure in t+1 depends on attributes at t. (Keep them in mind! We will come back to them)
  • 24. BACKGROUND • In Statistics, an association is a relationship between two statistically dependent quantities. • ‘Relational Autocorrelation’: Statistical dependency between values of the same variable on related objects. (Abundant in our dataset) Why? • In this work we use the Chi-Square statistic.
  • 26. CHI-SQUARE STATISTICS • How likely is an observed distribution due to chance? • Observe 100 students to see “whether attending class influences how students perform on exam?” • Four categories: – Students who attend class and pass. – Students who attend class and do not pass. – Students who do not attend class and pass. – Students who do not attend class and do not pass. • Null Hypothesis: There is no difference based on attending classes.
  • 27. CHI-SQUARE Continued… • The test compares the observed data to a model that distributes the data according to the expectation that the variables are independent. The further the observed data departs from that model, the stronger the evidence that the variables are dependent, and so the stronger the case for rejecting the null hypothesis. • Degrees of freedom: Values in the final calculation that are free to vary. • Calculate the Chi-Square value. (How?) • Calculate the more interesting ‘p’ value (the probability of seeing data at least this far from the independence model if the null hypothesis were true).
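The class-attendance example can be worked through in a few lines. This is a minimal sketch, not part of the original slides: the counts in the table are hypothetical (the slides give no numbers), the function name `chi_square_2x2` is made up for illustration, and the p-value shortcut via `math.erfc` is only valid for one degree of freedom, which is what a 2x2 table has.

```python
import math

def chi_square_2x2(table):
    """Chi-square statistic and p-value for a 2x2 contingency table.

    Assumes every row and column total is positive.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count if attendance and passing were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 >= stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts for 100 students: rows = attend / do not attend,
# columns = pass / do not pass.
stat, p_value = chi_square_2x2([[50, 10], [20, 20]])
print(f"chi-square = {stat:.3f}, p = {p_value:.5f}")
```

A small p-value here would mean that a table this lopsided is unlikely under independence, which is the sense in which the null hypothesis is rejected.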
  • 29. CORRELATION GAIN gain(t, t+1) = C(X_{t+1}, G_{t+1}) – C(X_t, G_t), where C(X, G) is the autocorrelation of attribute values X over the links of graph G. (The gain could be due to Homophily or Social Influence)
  • 31. HOMOPHILY Continued… If a Homophily effect is present in the data, the autocorrelation will increase when we consider the link changes from time t to time t+1: C(X_t, G_{t+1}) – C(X_t, G_t) (The Chi-Square value is a single number that adds up all the differences between our actual data and the data expected.)
  • 33. SOCIAL INFLUENCE Continued… If an influence effect is present in the data, the autocorrelation will increase when we consider the attribute changes from time t to time t+1: C(X_{t+1}, G_t) – C(X_t, G_t) (The Chi-Square value is a single number that adds up all the differences between our actual data and the data expected.)
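The three gain expressions (total, Homophily, Influence) differ only in which snapshot of the attributes and which snapshot of the graph are paired. The sketch below is one way to compute them, under assumptions that are not in the slides: a single categorical attribute per node, graphs given as edge lists, and autocorrelation measured as a chi-square statistic over the attribute values at the two ends of each link. The function names and the toy data are hypothetical. Note that a chi-square statistic measures dependence in either direction, so it is also large when linked nodes systematically differ.

```python
from collections import Counter

def chi_square_autocorrelation(attr, edges):
    """Chi-square statistic of attribute values across linked node pairs."""
    # Count each undirected edge in both directions so the table is symmetric.
    pairs = Counter()
    for u, v in edges:
        pairs[(attr[u], attr[v])] += 1
        pairs[(attr[v], attr[u])] += 1
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    values = sorted({a for a, _ in pairs})
    # Row totals equal column totals because the table is symmetric.
    marginal = {a: sum(pairs[(a, b)] for b in values) for a in values}
    stat = 0.0
    for a in values:
        for b in values:
            expected = marginal[a] * marginal[b] / total
            stat += (pairs[(a, b)] - expected) ** 2 / expected
    return stat

def correlation_gains(x_t, x_t1, g_t, g_t1):
    """Total, homophily and influence gains between time t and t+1."""
    c = chi_square_autocorrelation
    baseline = c(x_t, g_t)
    return {
        "total": c(x_t1, g_t1) - baseline,      # C(X_{t+1}, G_{t+1}) - C(X_t, G_t)
        "homophily": c(x_t, g_t1) - baseline,   # only the links change
        "influence": c(x_t1, g_t) - baseline,   # only the attributes change
    }

# Toy data: attributes stay fixed while links rewire toward similar nodes.
x_t = {0: "a", 1: "a", 2: "b", 3: "b"}
g_t = [(0, 1), (0, 2), (1, 3), (2, 3)]
g_t1 = [(0, 1), (2, 3)]
print(correlation_gains(x_t, x_t, g_t, g_t1))
```

In this toy case only the links change, so the whole gain shows up in the homophily term and the influence term is zero.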
  • 35. RANDOMIZATION TESTS • Provide a robust statistical technique for hypothesis testing. • Generate several pseudosamples (permutations of the original data set). • The correlation gain is calculated for each pseudosample. • The observed gain is then compared to the distribution of pseudosample scores. • An observed gain that falls in the extreme tail of that distribution is deemed significant.
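The general shape of a randomization test can be sketched as follows. This is an illustrative stand-in, not the procedure from the paper: the statistic is simply the number of links whose endpoints share a value (instead of the correlation gain), pseudosamples are produced by shuffling attribute values across nodes (instead of choice-based randomization), and all names and data are hypothetical.

```python
import random

def same_value_edges(attr, edges):
    """Number of edges whose two endpoints share the same attribute value."""
    return sum(1 for u, v in edges if attr[u] == attr[v])

def randomization_test(attr, edges, n_samples=999, seed=0):
    """Observed statistic and one-sided empirical p-value."""
    rng = random.Random(seed)
    observed = same_value_edges(attr, edges)
    nodes = list(attr)
    values = [attr[n] for n in nodes]
    at_least_as_extreme = 0
    for _ in range(n_samples):
        # Pseudosample: same graph, attribute values shuffled across nodes.
        rng.shuffle(values)
        pseudo_attr = dict(zip(nodes, values))
        if same_value_edges(pseudo_attr, edges) >= observed:
            at_least_as_extreme += 1
    # Add-one correction counts the observed data as one of the samples.
    p_value = (at_least_as_extreme + 1) / (n_samples + 1)
    return observed, p_value

# Toy data: two groups of five nodes, fully linked within each group.
attr = {n: ("a" if n < 5 else "b") for n in range(10)}
edges = [(u, v) for u in range(10) for v in range(u + 1, 10)
         if (u < 5) == (v < 5)]
observed, p_value = randomization_test(attr, edges)
print(observed, p_value)
```

The p-value is the fraction of pseudosamples that score at least as high as the real data, so a small value means the observed score sits in the tail of the null distribution.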
  • 36. ANALYSIS OF KEY ISSUES AND ASSUMPTIONS (For Randomization Tests) • Make an appropriate NULL Hypothesis. • The data is permuted in a way that accurately reflects the null hypothesis.
  • 37. SELF ANALYSIS The approach is relevant and appropriate because it makes no assumptions about the underlying model. Also, because both the attribute values and the links change over time, it can assess both Influence and Homophily.
  • 38. NULL HYPOTHESIS • H0H : Link changes are random and are not due to attribute values in t. • H0I : Attribute changes are random and are not due to friends in t. • H0F : Both attribute and link changes are random.
  • 40. CHOICE BASED RANDOMIZATION • For H0H we can maintain the edge additions in t+1 but randomize the choice of target node, so that each node keeps the same number of additions and deletions. • For H0I we can randomize the choice of attribute value to replace in t+1, so that any similarity of the value is destroyed. • This is popularly referred to as “choice-based” randomization, as we are randomizing the result of choices (attribute/link changes).
  • 41. CALCULATING CHOICE BASED RANDOMIZATION • Non-trivial problem. • A greedy assignment is involved. • Collect all the changes (edges & attributes). • Sort the nodes and attributes from those with the fewest random options to those with the most. • This avoids violating the underlying null hypothesis.
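The greedy ordering can be illustrated for the H0H case (randomizing the targets of edge additions). This is a heavily simplified sketch under assumptions not stated in the slides: only additions are handled (no deletions or attribute changes), the first endpoint of each added edge is treated as the node that made the choice, and the function name and toy data are hypothetical. It is not the paper's exact algorithm.

```python
import random

def randomize_edge_additions(nodes, edges_t, additions, rng):
    """Return one pseudosample of edge additions with randomized targets."""
    # Undirected edges are stored as frozensets so (u, v) equals (v, u).
    existing = {frozenset(e) for e in edges_t}
    # Number of additions made by each choosing node (assumed: first endpoint).
    counts = {}
    for u, _ in additions:
        counts[u] = counts.get(u, 0) + 1

    def options(u):
        # Valid random targets: any other node not already linked to u.
        return [v for v in nodes
                if v != u and frozenset((u, v)) not in existing]

    new_edges = []
    # Greedy order: nodes with the fewest available options choose first.
    for u in sorted(counts, key=lambda n: len(options(n))):
        for _ in range(counts[u]):
            candidates = options(u)
            if not candidates:
                raise ValueError(f"no valid target left for node {u!r}")
            v = rng.choice(candidates)
            existing.add(frozenset((u, v)))
            new_edges.append((u, v))
    return new_edges

nodes = list(range(6))
edges_t = [(0, 1), (1, 2)]
additions = [(0, 2), (0, 3), (4, 5)]   # observed additions between t and t+1
print(randomize_edge_additions(nodes, edges_t, additions, random.Random(0)))
```

Each choosing node keeps its observed number of additions, but the targets no longer depend on attribute values, which is what the null hypothesis H0H requires. Serving the most constrained nodes first reduces the chance that a node runs out of valid targets.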
  • 44. SELF ANALYSIS Where to go from here? • Changing the granularity of the time step to investigate deeper. • Investigating why certain groups showed more Homophily or Social Influence. • Apart from friendship, considering other influential effects.
  • 45. SUMMARY • Successfully employed a randomization technique for distinguishing Homophily and Social Influence. • Tested the hypothesis on semi-synthetic and real-world data sets. • The strength of Influence and Homophily varied across groups, depending on group properties.
  • 46. PERSONAL TAKEAWAY Take a Statistics Class !