ICTAI 2011, Boca Raton
                                              November 7, 2011


Collaborative Filtering Based on
          Star Users
                   Qiang Liu
      with Bingfei Cheng and Congfu Xu

       College of Computer Science and Technology
                    Zhejiang University
            Hangzhou, Zhejiang 310027, China
                   2012dtd@gmail.com
Outline
 Introduction
 Star-user-based Collaborative Filtering
 Experimental Results
 Conclusion
INTRODUCTION
Collaborative Filtering
 Collaborative Filtering (CF)
     Neighborhood-based
         ◦ User-based
         ◦ Item-based
     Model-based
         ◦ Bayesian model
         ◦ Factorization model
         ◦ Maximum entropy
         ◦ Classification or clustering
         ◦ ……
Motivation
   To improve the most widely used
    technology in real-life recommender
    systems.
Neighborhood Model
 Similarity between users:
    ◦ Pearson: \frac{\mathrm{cov}(a, b)}{\sigma_a \sigma_b}
    ◦ Cosine: \frac{a \cdot b}{\|a\| \, \|b\|}
    ◦ Other similarity measures
 Weighted sum of neighbors' ratings:
    ◦ \hat{r}_{u,i} = \bar{r}_u + \frac{\sum_{v \in N} (r_{v,i} - \bar{r}_v) \cdot w_{u,v}}{\sum_{v \in N} w_{u,v}}

Common items: 1, 4, 6
Rating vectors of common items:
          a=[1,4,5]
          b=[2,2,5]
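
As a small worked example, the two similarity measures named on this slide can be evaluated on the rating vectors a and b above. This is a minimal sketch in plain Python; the function names `pearson` and `cosine` are illustrative and not taken from the slides, and the code assumes neither vector is constant (otherwise the Pearson denominator is zero).

```python
import math

def pearson(a, b):
    """Pearson correlation: cov(a, b) / (sigma_a * sigma_b)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    # The 1/n factors of the covariance and the standard deviations cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def cosine(a, b):
    """Cosine similarity: (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a = [1, 4, 5]  # ratings on the common items 1, 4, 6
b = [2, 2, 5]

print(round(pearson(a, b), 4))  # 0.6934
print(round(cosine(a, b), 4))   # 0.9401
```

Cosine similarity ignores each user's mean rating, which is why it comes out much higher than the Pearson value on the same two vectors.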
Challenges faced by traditional methods

 Matching similar users (computing similarities):
     Sparsity and noise
     Scalability
     ……
STAR-USER-BASED CF
The MPN users
 Let A, B, C, D be the neighbor sets of users A, B, C, D, respectively.
 Then area E is the set of the most popular neighbors (MPN).
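
One way to make the MPN idea concrete is sketched below. It assumes that "area E" is the region where the four neighbor sets overlap, which the slide's figure suggests but the text does not state. The user ids and set contents are hypothetical.

```python
from collections import Counter

# Hypothetical neighbor sets of users A-D (ids are made up for illustration).
neighbor_sets = {
    "A": {"u1", "u2", "u5"},
    "B": {"u2", "u3", "u5"},
    "C": {"u2", "u4", "u5"},
    "D": {"u2", "u5", "u6"},
}

# Assumption: area E is the overlap of all four neighbor sets.
mpn = set.intersection(*neighbor_sets.values())

# A softer reading: count in how many neighbor sets each user appears.
popularity = Counter(u for members in neighbor_sets.values() for u in members)

print(sorted(mpn))       # ['u2', 'u5']
print(popularity["u2"])  # 4
```

The count-based variant still gives a ranking when no user appears in every neighbor set.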
What is a star user
 Star users are special users who have rated all items with a relatively stable standard.
 We maintain a small set of star users and treat them as fixed neighbors of every general user.
Problem Formulation
 Filling the following matrix ℛ \in \mathbb{R}^{H \times N}:

                                Items (N)

                       i_1    …    i          …    i_N

                 s_1    ?     .    .          .     ?
                 …      .     .    .          .     .
 Star users (H)  s      .     .    r_{s,i}    .     .
                 …      .     .    .          .     .
                 s_H    ?     .    .          .     ?
Prediction Model
 Selecting star neighbors:
    [Figure: Relationship Matrix W between general users (M) and star users (H), with entries w_{u,s}]
 Generate predictions based on star users' ratings:
     \hat{r}_{u,i} = \bar{r}_u + \frac{\sum_{s \in S} (r_{s,i} - \bar{r}_s) \cdot w_{u,s}}{\sum_{s \in S} w_{u,s}}
 The parameters are r_{s,i} and w_{u,s}.
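
The prediction rule on this slide can be sketched in a few lines. The sketch reads the formula as: user mean plus the weighted average of the star users' deviations from their own means. The function name `predict`, the argument layout, the toy numbers, and the fallback to the user mean when the weights sum to zero are all assumptions for illustration.

```python
def predict(user_mean, star_ratings, star_means, weights, item):
    """r_hat[u,i] = mean_u + sum_s (r[s,i] - mean_s) * w[u,s] / sum_s w[u,s]."""
    den = sum(weights)
    if den == 0:
        # Assumption: with no usable star neighbor, fall back to the user mean.
        return user_mean
    num = sum((star_ratings[s][item] - star_means[s]) * weights[s]
              for s in range(len(weights)))
    return user_mean + num / den

# Hypothetical toy data: H = 2 star users, N = 3 items.
star_ratings = [[4.0, 2.0, 3.0],
                [5.0, 1.0, 3.0]]
star_means = [3.0, 3.0]
weights = [0.5, 0.5]  # w[u,s] for one general user u

print(predict(3.0, star_ratings, star_means, weights, item=0))  # 4.5
print(predict(3.0, star_ratings, star_means, weights, item=1))  # 1.5
```

Because every general user shares the same H star users, a prediction needs only one row of W and one column of the star user matrix.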
How we get star users (1)
 Training stage:
    1. Initialize the star user matrix ℛ.
    2. Predict each rating \hat{r}_{u,i} in the training set:
         \hat{r}_{u,i} = \bar{r}_u + \frac{\sum_{s \in S} (r_{s,i} - \bar{r}_s) \times w_{u,s}}{\sum_{s \in S} w_{u,s}}
    3. The residual is e_{u,i} = r_{u,i} - \hat{r}_{u,i}, and the gradient of e_{u,i}^2 is:
         \frac{\partial e_{u,i}^2}{\partial r_{s,i}} = -2 e_{u,i} \cdot \frac{N-1}{N} \cdot \frac{w_{u,s}}{\sum_{t \in S} w_{u,t}}
       The factor (N-1)/N appears because \bar{r}_s, the mean of star user s over all N items, itself depends on r_{s,i}.
How we get star users (2)
 Training stage:
    4. Update each element of matrix ℛ:
         r_{s,i} \leftarrow r_{s,i} + \eta \cdot e_{u,i} \cdot \frac{w_{u,s}}{\sum_{t \in S} w_{u,t}}
       where \eta is the learning rate.
    5. Repeat steps 2 to 4 until convergence.
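
Training steps 1 to 5 can be sketched as a small stochastic update loop. This is an illustration under stated assumptions, not the authors' implementation: the star matrix is initialized to the global mean rating (the slides do not say how it is initialized), the weights W are held fixed and positive during training, a fixed number of epochs stands in for a convergence test, and the toy data and names (`train`, `rmse`, `lr`) are hypothetical.

```python
import math

def predict(R, w_u, mean_u, i):
    """Step 2: mean_u + sum_s (R[s][i] - mean_s) * w[u][s] / sum_s w[u][s]."""
    n_items = len(R[0])
    num = sum((row[i] - sum(row) / n_items) * w for row, w in zip(R, w_u))
    return mean_u + num / sum(w_u)

def rmse(R, ratings, W, user_means):
    sq = [(r - predict(R, W[u], user_means[u], i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(sq) / len(sq))

def train(ratings, W, user_means, n_star, n_items, lr=0.2, epochs=200):
    # Step 1 (assumption): start every star rating at the global mean.
    g = sum(r for _, _, r in ratings) / len(ratings)
    R = [[g] * n_items for _ in range(n_star)]
    for _ in range(epochs):  # Step 5: fixed budget instead of a convergence test
        for u, i, r in ratings:
            e = r - predict(R, W[u], user_means[u], i)  # Step 3: residual
            den = sum(W[u])
            for s in range(n_star):                     # Step 4: update r[s][i]
                R[s][i] += lr * e * W[u][s] / den
    return R

# Hypothetical toy data: (user, item, rating) triples, 2 users, 2 items.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 1, 2.0)]
user_means = [4.0, 3.0]
W = [[0.75, 0.25], [0.25, 0.75]]  # fixed positive weights w[u][s]

R_init = [[3.5, 3.5], [3.5, 3.5]]
R = train(ratings, W, user_means, n_star=2, n_items=2)
print(rmse(R_init, ratings, W, user_means))  # 1.0 before training
print(rmse(R, ratings, W, user_means))       # close to 0 after training
```

On this toy data both users rate item 0 one point above their mean and item 1 one point below, so the star users can fit the training ratings almost exactly.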
How we get star users (3)
 Parameters:
    ◦ A frequency parameter (users): the update frequency of \bar{r}_u.
    ◦ A frequency parameter for the weights: the update frequency of w_{u,s} \in W for each u and s.
 Maintain the relationship matrix W:
    ◦ W \in \mathbb{R}^{M \times H}
    ◦ w_{u,s} is computed using Pearson correlation.
    ◦ W is maintained until the recommending stage.
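
Since star users have rated every item, the Pearson weight between a general user and a star user can be computed over all items the general user has rated. The sketch below builds one row of W under that reading. The names `pearson` and `relationship_row`, the dictionary layout, and the toy ratings are assumptions for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length rating vectors (0.0 if undefined)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # Assumption: treat a constant vector as uncorrelated.
    return cov / math.sqrt(var_a * var_b)

def relationship_row(user_ratings, R):
    """Weights w[u][s] of one general user against every star user in R."""
    items = sorted(user_ratings)
    own = [user_ratings[i] for i in items]
    return [pearson(own, [star[i] for i in items]) for star in R]

# Hypothetical data: one general user who rated items 0, 1, 2; H = 2 star users.
user_ratings = {0: 5.0, 1: 3.0, 2: 4.0}
R = [[4.0, 2.0, 3.0, 5.0],
     [1.0, 5.0, 3.0, 2.0]]

print(relationship_row(user_ratings, R))  # [1.0, -1.0]
```

Pearson weights can be negative, so the sum of weights in the prediction formula may be small or zero; how the authors handle that case is not stated on the slides.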
EXPERIMENTAL RESULTS
Results on MovieLens Dataset
[Figure: RMSE of our approach against various H and comparison with kNN]
[Figure: Time requirement comparison]
Item-based Model
 We first train a small set of star items instead of star users.
 Predictions are computed as:
     \hat{r}_{u,i} = \bar{r}_i + \frac{\sum_{j \in S'} (r_{u,j} - \bar{r}_j) \times w_{i,j}}{\sum_{j \in S'} w_{i,j}}
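
The item-based rule mirrors the user-based one with the roles swapped. The sketch reads it as: the prediction starts from the mean of item i and adds the weighted average of the user's deviations on the star items in S'. The function name, the argument layout, and the numbers are hypothetical, and the weights are assumed to have a nonzero sum.

```python
def predict_item_based(item_mean, user_star_ratings, star_item_means, weights):
    """r_hat[u,i] = mean_i + sum_j (r[u,j] - mean_j) * w[i,j] / sum_j w[i,j]."""
    num = sum((r - m) * w
              for r, m, w in zip(user_star_ratings, star_item_means, weights))
    return item_mean + num / sum(weights)

# Hypothetical data: two star items, one target item i with mean rating 3.0.
user_star_ratings = [4.0, 2.0]  # the user's ratings r[u,j] on the star items
star_item_means = [3.0, 3.0]    # mean rating of each star item
weights = [0.75, 0.25]          # w[i,j] between item i and each star item

print(predict_item_based(3.0, user_star_ratings, star_item_means, weights))  # 3.5
```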
Results on Netflix Dataset
[Figure: Our approach with different values of learning rate]
[Figure: Our approach with different values of H]
Discussion
 Comparison with kNN
    ◦ Accuracy
    ◦ Data sparsity
    ◦ Scalability: O(M^2 \times N') \rightarrow O(M \times H \times N'), where H \ll M.
 Comparison with SVD
    ◦ Scientific explanation
    ◦ Parameters
    ◦ Updating
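
The size of the scalability gain can be checked with a little arithmetic: dividing the kNN cost by the star-user cost leaves M / H. The values of M, H and N' below are hypothetical and chosen only to make the ratio concrete. N' is used as written on the slide.

```python
# Hypothetical sizes, for illustration only.
M = 10_000     # general users
H = 50         # star users, H << M
N_prime = 100  # N' as written on the slide

knn_cost = M ** 2 * N_prime    # O(M^2 x N'): compare every pair of users
star_cost = M * H * N_prime    # O(M x H x N'): compare each user with H star users

print(knn_cost)               # 10000000000
print(star_cost)              # 50000000
print(knn_cost // star_cost)  # 200, i.e. M / H
```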
CONCLUSION
Summary
 We proposed a novel CF model based on star users.
 The original intention is to improve the traditional neighborhood-based CF model.
 Experimental results on two datasets verified the effectiveness of our approach.
Future work
 Incorporating contextual information into
  our model.
 Validating our approach in practical
  applications.
THANK YOU
