SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Who Will Follow You
                    Back? Reciprocal
                    Relationship
1
1
                2
                    Prediction*  3
John Hopcroft, Tiancheng Lou, Jie Tang
Department of Computer Science, Cornell University,
2Institutefor Interdisciplinary Information Sciences, Tsinghua
University
3Department of Computer Science, Tsinghua University
reciprocal
                                                       parasocial

Motivation                                                               v2
                                                                                          v3
                                                             v1




   Two kinds of relationships in social network,                              v4

                                                                   v5
      one-way(called parasocial) relationship and,
      two-way(called reciprocal) relationship
                                                                                    v6
   Two-way(reciprocal) relationship
      usually developed from a one-way relationship
      more trustful.                                    after 3 days               prediction
   Try to understand(predict) the formation of two-
    way relationships
      micro-level dynamics of the social network.
                                                                          v2
                                                                                           v3
      underlying community structure?                        v1


      how users influence each other?
                                                                                v4

                                                                    v5




                                                                                     v6
Example : real friend relationship

   On Twitter : Who Will Follow You Back?

                                     30%
                                       ?

                100%
                  ?
                                      60%
                                        ?          JimmyQiao

   Ladygaga           1%
                       ?
                           Shiteng




              Obama                        Huwei
Several key challenges

   How to model the formation of
    two-way relationships?                                        y2= ?
                                                                  y2= ?
                                                                y2
                                                                y2
        SVM & LRC                                                                  y4 = 0
                                                                                     4


   How to combine many social                                                       y4
                                                                                      4


    theories into the prediction              y1= 1 y1
                                              y1= 1 y1

    model?                                                                y3
                                                                          y3
                                                                          y3 = ?
                                                                          y3 = ?

                          v2
                                         v3
               v1




                               v4

                     v5
                                                              (v1,, v2)
                                                              (v1 v2)
                                                                                   (v3, v4)
                                                                                     3   4

                                    v6            (v1,, v5)
                                                  (v1 v5)

                                                                     (v2,, v4)
                                                                     (v2 v4)
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Link prediction

   Unsupervised link prediction
        Scores & intution, such as preferential attachment [N01].
   Supervised link prediction
        supervised random walks [BL11].
        logistic regression model to predict positive and negative links [L10].
   Main differences:
        We predict a directed link instead of only handles undirected social networks.
        Our model is dynamic and learned from the evolution of the Twitter network.
Social behavior analysis

   Existing works on social behavior analysis:
      The difference of the social influence on difference topics and to model the topic-
       level social influence in social networks. [T09]
      How social actions evolve in a dynamic social network? [T10]
   Main differences:
        The proposed methods in previous work can be used here
        but the problem is fundamentally different.
Twitter study

   The twitter network.
        The topological and geographical properties. [J07]
        Twittersphere and some notable properties, such as a non-power-law follower
         distribution, and low reciprocity. [K10]
   The twitter users.
        Influential users.
        Tweeting behaviors of users.
   The tweets.
        Utilize the real-time nature to detect a target event. [S10]
        TwitterMonitor, to detect emerging topics. [M10]
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Factor graph model

   Problem definition
        Given a network at time t, i.e., Gt = (Vt, Et, Xt, Yt)
        Variables y are partially labeled.
        Goal : infer unknown variables.
   Factor graph model
        P(Y | X, G) = P(X, G|Y) P(Y) / P(X, G) = C0 P(X | Y) P(Y | G)
        In P(X | Y), assuming that the generative probability is conditionally independent,
        P(Y | X, G) = C0P(Y | G)ΠP(xi|yi)
        How to deal with P(Y | G) ?
Random Field

                              Given an undirected graph G=(V,
     Ya                  Yb      E) such that Y={Yv|v∈V}, if

                              the probability of Yv given X and
          Yc                     those random variables
                                 corresponding to nodes
                                 neighboring v in G. Then (X, Y)
                    Ye           is a conditional random field.
    Yd
                              undirected graphical model
               Yf                globally conditioned on X
Conditional random field(CRF)

   Model them in a Markov random field, by the Hammersley-Clifford theorem,
      P(xi|yi) = 1/Z1 * exp {Σαj fj (xij, yi)}
      P(Y|G) = 1/Z2 * exp {ΣcΣkμkhk(Yc)}
      Z1 and Z2 are normalization factors.
   So, the likelihood probability is :
      P(Y | X, G) = 1/Z0 * exp {Σαj fj (xij, yi) + ΣcΣkμkhk(Yc)}
      Z0 is normalization factors.
Maximize likelihood

      Objective function
         O(θ) = log Pθ(Y | X, G) = ΣiΣjαj fj (xij, yi) + ΣΣμk hk(Yc) – log Z

      Learning the model to
         estimate a parameter configuration θ= {α, μ} to maximize the objective
          function :
         that is, the goal is to compute θ* = argmax O(θ)
Learning algorithm

           θ*
    Goal : = argmax O(θ)                     L                                    (k )        (k)       ( Z ( x ( k ) ))'
                                                                     Fj ( y              ,x         )
                                                 j          k                                              Z ( x(k ) )
   Newton methold.
   The gradient of each μk with           Z ( x(k ) )                   exp             F ( y, x ( k ) )
                                                                     y
    regard to the objective
                                                                                  exp(              F ( y, x ( k ) )) F j ( y, x ( k ) )
    function.                              (Z ( x    (k )
                                                            ))   '
                                                                          y
        dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X,
                                   k         Z (x    (k )
                                                            )                                       exp       F ( y, x ( k ) )
         G)[hk(Yc)]
                                                                                               y


                                                                                         exp(           F ( y, x ( k ) )
                                                                                                                     (k )
                                                                                                                                ) F j ( y, x ( k ) )
                                                                              y              exp          F ( y, x          )
                                                                                         y

                                                                                     p ( y | x ( k ) ) F j ( y, x ( k ) )
                                                                              y

                                                                          E p (Y | x( k ) ) F j (Y , x ( k ) )
Learning algorithm

   One challenge : how to calculate the marginal distribution Pμ (yc|x, G).
                                                                      k


      Intractable to directly calculate the marginal distribution.
      Use approximation algorithm : Belief propagation.
Belief propagation

   Sum-product
     The algorithm works by passing real values functions called “messages” along
      the edges between the nodes.
     Extend edges to triangles
Belief propagation

   A “message” from a variable node v to a factor node u is
    μv->u(xv) = Πu*∈N(v){u}μu*->v(xv)
   A “message” from a factor node u to a variable node v is
    μu->v(xu) = Σx’ux’v=xvΠu*∈N(u){v}μv*->u(x’u)
Learning algorithm(TriFG model)

Input : network Gt, learning rateη
Output : estimated parametersθ

Initalize θ= 0;
Repeat
     Perform LBP to calculate marginal distribution of unknown variables P(yi|xi, G);
     Perform LBP to calculate marginal distribution of triad c, i.e. P(yc|Xc, G);
     Calculate the gradient of μk according to :
          dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X, G)[hk(Yc)]
                                    k


     Update parameter θ with the learning rate η:
          θ new = θold + ηd θ
Until Convergence;
Prediction features

   Geographic distance
        Global vs Local
   Homophily
        Link homophily
        Status homophily
   Implicit structure
        Retweet or reply
                                      (A) and (B) are balanced, but (C) and (D) are not.
        Retweeting seems to be     Users who share                     Elite users have a
         more helpful                             Global
                                    common links will                      Local
                                                                        much stronger
   Structural balance              have a tendency to                  tendency to
      Two-way relationships are    follow each other.                  follow each other
       balanced (88%),
      But, one-way relationships
       are not (only 29%).
Our approach : TriFG

   TriFG model
     Features based on observations
     Partially labeled
     Conditional random field
     Triad correlation factors
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Data collection

   Huge sub-network of twitter
      13,442,659 users and 56,893,234 following links.
      Extracted 35,746,366 tweets.
   Dynamic networks
      With an average of 728,509 new links per day.
      Averagely 3,337 new follow-back links per day.
      13 time stamps by viewing every four days as a time stamp
Prediction performance

   Baseline algorithms
         SVM & LRC & CRF
   Accurately infer 90% of reciprocal relationships in twitter.
         Data     Algotithm     Precision   Recall   F1Measure   Accuracy
                    SVM          0.6908     0.6129    0.6495      0.9590
      Test          LRC          0.6957     0.2581    0.3765      0.9510
      Case
                    CRF          1.0000     0.6290    0.7723      0.9770
       1
                   TriFG         1.0000     0.8548    0.9217      0.9910
                    SVM          0.7323     0.6212    0.6722      0.9534
      Test          LRC          0.8333     0.3030    0.4444      0.9417
      Case
                    CRF          1.0000     0.6333    0.7755      0.9717
       2
                   TriFG         1.0000     0.8788    0.9355      0.9907
Convergence property

   BLP-based learning algorithm can converge in less than 10 iterations.
   After only 7 learning iterations, the prediction performance of TriFG
    becomes stable.
Effect of Time Span

   Distribution of time span in which follow back action is performed.
      60% of follow-backs are performed in the next time stamp,
      37% of follow-backs would be still performed in the following 3 time stamps.
   How different settings of the time span.
      The prediction performance of TriFG drops sharply when setting the time span
       as two or less.
      The performance is acceptable, when setting it as three time stamps.
Factor contribution analysis

   We consider five different factor functions
        Geographic distance(G)
        Link homophily(L)
        Status homophily(S)
        Implicit network correlation(I)
        Structural balance correlation(B)
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Conclusion

   Reciprocal relationship prediction in social network
   Incorporates social theories into prediction model.
   Several interesting phenomena.
      Elite users tend to follow each other.
      Two-way relationships on Twitter are balanced, but one-way relationships are
       not.
      Social networks are going global, but also stay local.
Future works

   Other social theories for reciprocal relationship prediction.
   User feedback.
   Incorporating user interactions.
   Building a theory for different kinds of networks.
   Thanks!
   Q&A
Reference
    [BL11] L.Backstrom and J.Leskovec. Supervised random walks :
     predicting and recommending links in social networks. In WSDM’11
    [C10] D.J.Crandall, L.Backstrom, D. Cosley, S.Suri, D.Huttenlocher, and J.
     Kleinberg. Inferring social ties from geographic coincidences. PNAS, Dec.
     2010
    [W10] C.Wang, J. Han, Y.Jia, J.Tang, D.Zhang, Y. Yu and J.Guo. Mining
     advisor-advisee relationships from research publication networks. In
     KDD’10.
    [N01]M.E.J. Newman. Clustering and preferential attachment in growing
     networks. Phys. Rev. E, 2001
    [L10] J.Leskovec, D.Huttenlocher, and J.Kleinberg. Predicting positive and
     negative links in online social networks. In WWW10.
    [T10] C.Tan, J. Tang, J. Sun, Q.Lin, and F.Wang. Social action tracking
     via noise tolerant time-varying factor graphs. In KDD10
    [T09] J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in
     large-scale networks. In KDD09.
Reference
    [J07]A. Java, X.Song, T.Finin, and B.L. Tseng. Why we twitter : An
     analysis of a microblogging community. In KDD2007.
    [K10]H. Kwak, C.Lee, H.Park, and S.B. Moon. What is twitter, a social
     network or a news media? In WWW2010.
    [M10]M.Mathioudakis and N.Koudas. Twittermonitor : trend detection over
     the twitter stream. In SIGMOD10.
    [S10]T. Sakaki, M. Okazaki, and Y.Matsuo. Earthquake shakes twitter
     users: real-time event detection by social sensors. In WWW10.

Weitere ähnliche Inhalte

Kürzlich hochgeladen

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Kürzlich hochgeladen (20)

A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Empfohlen

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 

Empfohlen (20)

Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 

Who will-follow-you-back-reciprocal-relationship-prediction

  • 1. Who Will Follow You Back? Reciprocal Relationship 1 1 2 Prediction* 3 John Hopcroft, Tiancheng Lou, Jie Tang Department of Computer Science, Cornell University, 2Institutefor Interdisciplinary Information Sciences, Tsinghua University 3Department of Computer Science, Tsinghua University
  • 2. reciprocal parasocial Motivation v2 v3 v1  Two kinds of relationships in social network, v4 v5  one-way(called parasocial) relationship and,  two-way(called reciprocal) relationship v6  Two-way(reciprocal) relationship  usually developed from a one-way relationship  more trustful. after 3 days prediction  Try to understand(predict) the formation of two- way relationships  micro-level dynamics of the social network. v2 v3  underlying community structure? v1  how users influence each other? v4 v5 v6
  • 3. Example : real friend relationship On Twitter : Who Will Follow You Back? 30% ? 100% ? 60% ? JimmyQiao Ladygaga 1% ? Shiteng Obama Huwei
  • 4. Several key challenges  How to model the formation of two-way relationships? y2= ? y2= ? y2 y2  SVM & LRC y4 = 0 4  How to combine many social y4 4 theories into the prediction y1= 1 y1 y1= 1 y1 model? y3 y3 y3 = ? y3 = ? v2 v3 v1 v4 v5 (v1,, v2) (v1 v2) (v3, v4) 3 4 v6 (v1,, v5) (v1 v5) (v2,, v4) (v2 v4)
  • 5. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 6. Link prediction  Unsupervised link prediction  Scores & intution, such as preferential attachment [N01].  Supervised link prediction  supervised random walks [BL11].  logistic regression model to predict positive and negative links [L10].  Main differences:  We predict a directed link instead of only handles undirected social networks.  Our model is dynamic and learned from the evolution of the Twitter network.
  • 7. Social behavior analysis  Existing works on social behavior analysis:  The difference of the social influence on difference topics and to model the topic- level social influence in social networks. [T09]  How social actions evolve in a dynamic social network? [T10]  Main differences:  The proposed methods in previous work can be used here  but the problem is fundamentally different.
  • 8. Twitter study  The twitter network.  The topological and geographical properties. [J07]  Twittersphere and some notable properties, such as a non-power-law follower distribution, and low reciprocity. [K10]  The twitter users.  Influential users.  Tweeting behaviors of users.  The tweets.  Utilize the real-time nature to detect a target event. [S10]  TwitterMonitor, to detect emerging topics. [M10]
  • 9. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 10. Factor graph model  Problem definition  Given a network at time t, i.e., Gt = (Vt, Et, Xt, Yt)  Variables y are partially labeled.  Goal : infer unknown variables.  Factor graph model  P(Y | X, G) = P(X, G|Y) P(Y) / P(X, G) = C0 P(X | Y) P(Y | G)  In P(X | Y), assuming that the generative probability is conditionally independent,  P(Y | X, G) = C0P(Y | G)ΠP(xi|yi)  How to deal with P(Y | G) ?
  • 11. Random Field Given an undirected graph G=(V, Ya Yb E) such that Y={Yv|v∈V}, if the probability of Yv given X and Yc those random variables corresponding to nodes neighboring v in G. Then (X, Y) Ye is a conditional random field. Yd undirected graphical model Yf globally conditioned on X
  • 12. Conditional random field(CRF)  Model them in a Markov random field, by the Hammersley-Clifford theorem,  P(xi|yi) = 1/Z1 * exp {Σαj fj (xij, yi)}  P(Y|G) = 1/Z2 * exp {ΣcΣkμkhk(Yc)}  Z1 and Z2 are normalization factors.  So, the likelihood probability is :  P(Y | X, G) = 1/Z0 * exp {Σαj fj (xij, yi) + ΣcΣkμkhk(Yc)}  Z0 is normalization factors.
  • 13. Maximize likelihood  Objective function  O(θ) = log Pθ(Y | X, G) = ΣiΣjαj fj (xij, yi) + ΣΣμk hk(Yc) – log Z  Learning the model to  estimate a parameter configuration θ= {α, μ} to maximize the objective function :  that is, the goal is to compute θ* = argmax O(θ)
  • 14. Learning algorithm  θ* Goal : = argmax O(θ) L (k ) (k) ( Z ( x ( k ) ))' Fj ( y ,x ) j k Z ( x(k ) )  Newton methold.  The gradient of each μk with Z ( x(k ) ) exp F ( y, x ( k ) ) y regard to the objective exp( F ( y, x ( k ) )) F j ( y, x ( k ) ) function. (Z ( x (k ) )) ' y  dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X, k Z (x (k ) ) exp F ( y, x ( k ) ) G)[hk(Yc)] y exp( F ( y, x ( k ) ) (k ) ) F j ( y, x ( k ) ) y exp F ( y, x ) y p ( y | x ( k ) ) F j ( y, x ( k ) ) y E p (Y | x( k ) ) F j (Y , x ( k ) )
  • 15. Learning algorithm  One challenge : how to calculate the marginal distribution Pμ (yc|x, G). k  Intractable to directly calculate the marginal distribution.  Use approximation algorithm : Belief propagation.
  • 16. Belief propagation  Sum-product  The algorithm works by passing real values functions called “messages” along the edges between the nodes.  Extend edges to triangles
  • 17. Belief propagation  A “message” from a variable node v to a factor node u is μv->u(xv) = Πu*∈N(v){u}μu*->v(xv)  A “message” from a factor node u to a variable node v is μu->v(xu) = Σx’ux’v=xvΠu*∈N(u){v}μv*->u(x’u)
  • 18. Learning algorithm(TriFG model) Input : network Gt, learning rateη Output : estimated parametersθ Initalize θ= 0; Repeat Perform LBP to calculate marginal distribution of unknown variables P(yi|xi, G); Perform LBP to calculate marginal distribution of triad c, i.e. P(yc|Xc, G); Calculate the gradient of μk according to : dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X, G)[hk(Yc)] k Update parameter θ with the learning rate η: θ new = θold + ηd θ Until Convergence;
  • 19. Prediction features  Geographic distance  Global vs Local  Homophily  Link homophily  Status homophily  Implicit structure  Retweet or reply (A) and (B) are balanced, but (C) and (D) are not.  Retweeting seems to be Users who share Elite users have a more helpful Global common links will Local much stronger  Structural balance have a tendency to tendency to  Two-way relationships are follow each other. follow each other balanced (88%),  But, one-way relationships are not (only 29%).
  • 20. Our approach : TriFG  TriFG model  Features based on observations  Partially labeled  Conditional random field  Triad correlation factors
  • 21. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 22. Data collection  Huge sub-network of twitter  13,442,659 users and 56,893,234 following links.  Extracted 35,746,366 tweets.  Dynamic networks  With an average of 728,509 new links per day.  Averagely 3,337 new follow-back links per day.  13 time stamps by viewing every four days as a time stamp
  • 23. Prediction performance  Baseline algorithms  SVM & LRC & CRF  Accurately infer 90% of reciprocal relationships in twitter. Data Algotithm Precision Recall F1Measure Accuracy SVM 0.6908 0.6129 0.6495 0.9590 Test LRC 0.6957 0.2581 0.3765 0.9510 Case CRF 1.0000 0.6290 0.7723 0.9770 1 TriFG 1.0000 0.8548 0.9217 0.9910 SVM 0.7323 0.6212 0.6722 0.9534 Test LRC 0.8333 0.3030 0.4444 0.9417 Case CRF 1.0000 0.6333 0.7755 0.9717 2 TriFG 1.0000 0.8788 0.9355 0.9907
  • 24. Convergence property  BLP-based learning algorithm can converge in less than 10 iterations.  After only 7 learning iterations, the prediction performance of TriFG becomes stable.
  • 25. Effect of Time Span  Distribution of time span in which follow back action is performed.  60% of follow-backs are performed in the next time stamp,  37% of follow-backs would be still performed in the following 3 time stamps.  How different settings of the time span.  The prediction performance of TriFG drops sharply when setting the time span as two or less.  The performance is acceptable, when setting it as three time stamps.
  • 26. Factor contribution analysis  We consider five different factor functions  Geographic distance(G)  Link homophily(L)  Status homophily(S)  Implicit network correlation(I)  Structural balance correlation(B)
  • 27. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 28. Conclusion  Reciprocal relationship prediction in social network  Incorporates social theories into prediction model.  Several interesting phenomena.  Elite users tend to follow each other.  Two-way relationships on Twitter are balanced, but one-way relationships are not.  Social networks are going global, but also stay local.
  • 29. Future works  Other social theories for reciprocal relationship prediction.  User feedback.  Incorporating user interactions.  Building a theory for different kinds of networks.
  • 30. Thanks!  Q&A
  • 31. Reference  [BL11] L.Backstrom and J.Leskovec. Supervised random walks : predicting and recommending links in social networks. In WSDM’11  [C10] D.J.Crandall, L.Backstrom, D. Cosley, S.Suri, D.Huttenlocher, and J. Kleinberg. Inferring social ties from geographic coincidences. PNAS, Dec. 2010  [W10] C.Wang, J. Han, Y.Jia, J.Tang, D.Zhang, Y. Yu and J.Guo. Mining advisor-advisee relationships from research publication networks. In KDD’10.  [N01]M.E.J. Newman. Clustering and preferential attachment in growing networks. Phys. Rev. E, 2001  [L10] J.Leskovec, D.Huttenlocher, and J.Kleinberg. Predicting positive and negative links in online social networks. In WWW10.  [T10] C.Tan, J. Tang, J. Sun, Q.Lin, and F.Wang. Social action tracking via noise tolerant time-varying factor graphs. In KDD10  [T09] J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in large-scale networks. In KDD09.
  • 32. Reference  [J07]A. Java, X.Song, T.Finin, and B.L. Tseng. Why we twitter : An analysis of a microblogging community. In KDD2007.  [K10]H. Kwak, C.Lee, H.Park, and S.B. Moon. What is twitter, a social network or a news media? In WWW2010.  [M10]M.Mathioudakis and N.Koudas. Twittermonitor : trend detection over the twitter stream. In SIGMOD10.  [S10]T. Sakaki, M. Okazaki, and Y.Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In WWW10.