SlideShare ist ein Scribd-Unternehmen logo
1 von 30
PERSONALIZED RETWEET PREDICTION IN TWITTER


        Liangjie Hong*, Lehigh University
        Aziz Doumith, Lehigh University
  1
        Brian D. Davison, Lehigh University
OVERVIEW
 Motivation
 Related Work

 Our Method

 Experimental Results




                         2
MOTIVATIONS
Social information platforms




                               3
MOTIVATIONS
Information overload




                       4
MOTIVATIONS
Information shortage




                       5
MOTIVATIONS




                                                                                          6


              Photo from: http://www.jenful.com/2011/06/google-1-and-the-filter-bubble/
MOTIVATIONS




                                                 7


              Photo from: http://graphlab.org/
MOTIVATIONS




                                               8


              Photo from: http://kexino.com/
MOTIVATIONS




                                                                        9


              Photo from: http://performancemarketingassociation.com/
TASKS
Given a target user and his/her friends, provide a ranked
list of tweets from these friends such that the tweets that
are potentially retweeted will be ranked higher.




                                                                              10


                    Photo from: http://performancemarketingassociation.com/
RELATED WORK
Generic Popular Tweets Analysis/Prediction
 [Suh et al., SocialCom 2010]

 [Y. Kim and K. Shim, ICDM, 2011]

 [Uysal and W. B. Croft, CIKM 2011]

 [Hong et al., WWW 2011]

Personalized Tweets Prediction
 [Chen et al., SIGIR 2012]

 [Peng et al., ICDM Workshop 2011]



                                             11
RELATED WORK
Generic Popular Tweets Analysis/Prediction
 [Suh et al., SocialCom 2010]

 [Y. Kim and K. Shim, ICDM, 2011]

 [Uysal and W. B. Croft, CIKM 2011]

 [Hong et al., WWW 2011]

Personalized Tweets Analysis/Prediction
 [Chen et al., SIGIR 2012]

 [Peng et al., ICDM Workshop 2011]


       Understanding users’ behaviors & content modeling
                                                           12
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability




                                          13
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering




   Latent factor models



                                        14
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features




   Latent factor models
       Factorization Machines [Rendle, ACM TIST 2012]
                                                         15
OUR METHOD
Factorization Machines
 Generic enough
       matrix factorization
       pairwise interaction tensor factorization
       SVD++
       neighborhood models
       …
   Technically mature
       [Rendle, ICDM 2010]
       [Rendle et al., SIGIR 2011]
       [Freudenthaler et al., NIPS Workshop 2011]
       [Rendle et al., WSDM 2012]
                                                     16
       [Rendle, ACM TIST 2012]
OUR METHOD
Extending Factorization Machines
 Non-negative decomposition of term-tweet matrix
       Compatible to standard topic models
   Co-Factorization Machines
       Multiple aspects of the dataset
         Shared feature paradigm
         Shared latent space paradigm

         Regularized latent space paradigm




                                                    17
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability




                                          18
OUR METHOD
Design requirements
 Learning objective functions for different aspects
     User decisions
         Ranking-based loss
          Weighted Approximately Rank Pairwise loss (WARP)
     Content modeling
       Log-Poisson loss
       Logistic loss




                                                             19
OUR METHOD
WARP loss
 Proposed by [Usunier et al., ICML 2009]

 Image retrieval tasks and IR tasks
    [Weston et al., Machine Learning 2010]
    [Weston et al., ICML 2012]
    [Weston et al., UAI 2012]
    [Bordes et al, AISTATS 2012]
   Can mimic many ranking measures
       NDCG, MAP, Precision@k


   Applied to collaborative filtering
                                             20
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability (Stochastic Gradient Descent)




                                              21
EXPERIMENTS
Twitter data
 0.7M target users with 11M tweets

 4.3M neighbor users with 27M tweets

 “Complete” sample for each target user

 Mean Average Precision (MAP) as measure

 Train/test on consecutive time periods




                                            22
EXPERIMENTS
Comparisons
 Matrix factorization (MF)

 Matrix factorization with attributes (MFA)

 CPTR [Chen et al, SIGIR 2012]

 Factorization machines with attributes (FMA)

 CoFM with shared features (CoFM-SF)

 CoFM with shared latent spaces (CoFM-SL)

 CoFM with latent space regularization (CoFM-REG)



                                                     23
EXPERIMENTS
Comparisons
 Matrix factorization (MF)

 Matrix factorization with attributes (MFA)

 CPTR [Chen et al, SIGIR 2012]

 Factorization machines with attributes (FMA)

 CoFM with shared features (CoFM-SF)

 CoFM with shared latent spaces (CoFM-SL)

 CoFM with latent space regularization (CoFM-REG)



                                                     24
EXPERIMENTS




              25
EXPERIMENTS




              26
EXPERIMENTS




              27
EXPERIMENTS
Examples of topics are shown. The terms are top ranked terms in
each topic. The topic names in bold are given by the authors.
Entertainment


album music lady artist video listen itunes apple produced movies #bieber bieber new songs

Finance

percent billion bank financial debt banks euro crisis rates greece bailout spain economy


Politics


party election budget tax president million obama money pay bill federal increase cuts
                                                                                             28
CONCLUSIONS
   Main contributions
     Propose Co-Factorization Machines (CoFM) to handle
      two (multiple) aspects of the dataset.
     Apply FM to text data with constraints to mimic topic
      models
     Introduce WARP loss into collaborative filtering/recsys
      models
     Explore a wide range of features and demonstrate the
      effectiveness of feature sets with significant
      improvement over several non-trival baselines.


                                                                29
THANK YOU.




   Liangjie Hong
   PhD candidate
   WUME Lab
   Lehigh University
   lih307@cse.lehigh.edu   30

Weitere ähnliche Inhalte

Ähnlich wie Personalized Retweet Prediction in Twitter

Towards Method Engineering of Model-Driven User Interface Development
Towards Method Engineering ofModel-Driven User Interface Development Towards Method Engineering ofModel-Driven User Interface Development
Towards Method Engineering of Model-Driven User Interface Development
Jean Vanderdonckt
 
PresentationTest
PresentationTestPresentationTest
PresentationTest
bolu804
 
Systems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsSystems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature concepts
ijcsit
 

Ähnlich wie Personalized Retweet Prediction in Twitter (20)

Towards Method Engineering of Model-Driven User Interface Development
Towards Method Engineering ofModel-Driven User Interface Development Towards Method Engineering ofModel-Driven User Interface Development
Towards Method Engineering of Model-Driven User Interface Development
 
Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)
 
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
 
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
 
Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software
 
Discreate eventsimulation idef
Discreate eventsimulation idefDiscreate eventsimulation idef
Discreate eventsimulation idef
 
Productivity mdd mdb_code_centric
Productivity mdd mdb_code_centricProductivity mdd mdb_code_centric
Productivity mdd mdb_code_centric
 
OO Development 1 - Introduction to Object-Oriented Development
OO Development 1 - Introduction to Object-Oriented DevelopmentOO Development 1 - Introduction to Object-Oriented Development
OO Development 1 - Introduction to Object-Oriented Development
 
Sub1583
Sub1583Sub1583
Sub1583
 
PresentationTest
PresentationTestPresentationTest
PresentationTest
 
Large Graph Mining
Large Graph MiningLarge Graph Mining
Large Graph Mining
 
2008.11560v2.pdf
2008.11560v2.pdf2008.11560v2.pdf
2008.11560v2.pdf
 
Fundamentals of Deep Recommender Systems
 Fundamentals of Deep Recommender Systems Fundamentals of Deep Recommender Systems
Fundamentals of Deep Recommender Systems
 
Introduction to MDE
Introduction to MDEIntroduction to MDE
Introduction to MDE
 
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
 
Systems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsSystems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature concepts
 
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
 
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
 
Self-adaptation Driven by goals in SysML Models
Self-adaptation Driven by goals in SysML ModelsSelf-adaptation Driven by goals in SysML Models
Self-adaptation Driven by goals in SysML Models
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Personalized Retweet Prediction in Twitter

  • 1. PERSONALIZED RETWEET PREDICTION IN TWITTER Liangjie Hong*, Lehigh University Aziz Doumith, Lehigh University 1 Brian D. Davison, Lehigh University
  • 2. OVERVIEW  Motivation  Related Work  Our Method  Experimental Results 2
  • 6. MOTIVATIONS 6 Photo from: http://www.jenful.com/2011/06/google-1-and-the-filter-bubble/
  • 7. MOTIVATIONS 7 Photo from: http://graphlab.org/
  • 8. MOTIVATIONS 8 Photo from: http://kexino.com/
  • 9. MOTIVATIONS 9 Photo from: http://performancemarketingassociation.com/
  • 10. TASKS Given a target user and his/her friends, provide a ranked list of tweets from these friends such that the tweets that are potentially retweeted will be ranked higher. 10 Photo from: http://performancemarketingassociation.com/
  • 11. RELATED WORK Generic Popular Tweets Analysis/Prediction  [Suh et al., SocialCom 2010]  [Y. Kim and K. Shim, ICDM, 2011]  [Uysal and W. B. Croft, CIKM 2011]  [Hong et al., WWW 2011] Personalized Tweets Prediction  [Chen et al., SIGIR 2012]  [Peng et al., ICDM Workshop 2011] 11
  • 12. RELATED WORK Generic Popular Tweets Analysis/Prediction  [Suh et al., SocialCom 2010]  [Y. Kim and K. Shim, ICDM, 2011]  [Uysal and W. B. Croft, CIKM 2011]  [Hong et al., WWW 2011] Personalized Tweets Analysis/Prediction  [Chen et al., SIGIR 2012]  [Peng et al., ICDM Workshop 2011] Understanding users’ behaviors & content modeling 12
  • 13. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Coupled modeling with content  Learning a correct objective function  Scalability 13
  • 14. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Latent factor models 14
  • 15. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Latent factor models  Factorization Machines [Rendle, ACM TIST 2012] 15
  • 16. OUR METHOD Factorization Machines  Generic enough  matrix factorization  pairwise interaction tensor factorization  SVD++  neighborhood models  …  Technically mature  [Rendle, ICDM 2010]  [Rendle et al., SIGIR 2011]  [Freudenthaler et al., NIPS Workshop 2011]  [Rendle et al., WSDM 2012] 16  [Rendle, ACM TIST 2012]
  • 17. OUR METHOD Extending Factorization Machines  Non-negative decomposition of term-tweet matrix  Compatible to standard topic models  Co-Factorization Machines  Multiple aspects of the dataset  Shared feature paradigm  Shared latent space paradigm  Regularized latent space paradigm 17
  • 18. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Coupled modeling with content  Learning a correct objective function  Scalability 18
  • 19. OUR METHOD Design requirements  Learning objective functions for different aspects  User decisions  Ranking-based loss Weighted Approximately Rank Pairwise loss (WARP)  Content modeling  Log-Poisson loss  Logistic loss 19
  • 20. OUR METHOD WARP loss  Proposed by [Usunier et al., ICML 2009]  Image retrieval tasks and IR tasks [Weston et al., Machine Learning 2010] [Weston et al., ICML 2012] [Weston et al., UAI 2012] [Bordes et al, AISTATS 2012]  Can mimic many ranking measures  NDCG, MAP, Precision@k  Applied to collaborative filtering 20
  • 21. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Coupled modeling with content  Learning a correct objective function  Scalability (Stochastic Gradient Descent) 21
  • 22. EXPERIMENTS Twitter data  0.7M target users with 11M tweets  4.3M neighbor users with 27M tweets  “Complete” sample for each target user  Mean Average Precision (MAP) as measure  Train/test on consecutive time periods 22
  • 23. EXPERIMENTS Comparisons  Matrix factorization (MF)  Matrix factorization with attributes (MFA)  CPTR [Chen et al, SIGIR 2012]  Factorization machines with attributes (FMA)  CoFM with shared features (CoFM-SF)  CoFM with shared latent spaces (CoFM-SL)  CoFM with latent space regularization (CoFM-REG) 23
  • 24. EXPERIMENTS Comparisons  Matrix factorization (MF)  Matrix factorization with attributes (MFA)  CPTR [Chen et al, SIGIR 2012]  Factorization machines with attributes (FMA)  CoFM with shared features (CoFM-SF)  CoFM with shared latent spaces (CoFM-SL)  CoFM with latent space regularization (CoFM-REG) 24
  • 28. EXPERIMENTS Examples of topics are shown. The terms are top ranked terms in each topic. The topic names in bold are given by the authors. Entertainment album music lady artist video listen itunes apple produced movies #bieber bieber new songs Finance percent billion bank financial debt banks euro crisis rates greece bailout spain economy Politics party election budget tax president million obama money pay bill federal increase cuts 28
  • 29. CONCLUSIONS  Main contributions  Propose Co-Factorization Machines (CoFM) to handle two (multiple) aspects of the dataset.  Apply FM to text data with constraints to mimic topic models  Introduce WARP loss into collaborative filtering/recsys models  Explore a wide range of features and demonstrate the effectiveness of feature sets with significant improvement over several non-trival baselines. 29
  • 30. THANK YOU. Liangjie Hong PhD candidate WUME Lab Lehigh University lih307@cse.lehigh.edu 30