Project (CS-893)
SPATIALLY AWARE RECOMMENDATION ALGORITHM

Under the supervision of:
Prof. (Dr.) Prosenjit Gupta (Professor)
&
Prof. (Dr.) Subhashis Majumdar (Professor & HOD)
Want to Buy Something Online?
The problem is:
 How to get enough INFORMATION to make a decision?
   -> Product recommendations
BUT...
 How to make the RIGHT DECISION out of an enormous amount of information?
Introduction
 Recommender System
– Applies statistical and knowledge-discovery
  techniques to the problem of making product
  recommendations.
– Receives information from a customer about which
  products he/she is interested in, and recommends
  products that are likely to fit his/her needs.
– Today, recommender systems are deployed on
  hundreds of different e-commerce websites,
  serving millions of customers.
• Collaborative Filtering
- Basic principle:
  Find a subset of users whose tastes and preferences are
  similar to those of the active user, and offer
  recommendations based on that subset of users.
- Assumptions:
  Users with similar interests have common preferences,
  and vice versa.
  A sufficiently large number of user preferences is available.
Past Works
 Amazon.com
     - Uses item-to-item collaborative filtering.
     - Focuses on finding similar items, not similar customers.
 Google Hotpot
     - A recommendation engine for places.
     - Makes local recommendations more personal by
       recommending places based on ratings.
 Netflix.com
     - Recommendation engine for movies.
     - Uses matrix factorization and so-called "temporal dynamics"
       to perform collaborative filtering.
IMPORTANT CHALLENGES
1. Scalability Issue
   Performance of the recommendation algorithm in searching for
   neighbors whose preferences are similar to the active user's,
   as the user base grows from tens of thousands of users
   to tens of millions of users.
2. To Improve the Quality of Recommendation
   - Consumers need recommendations they can trust to help them
     find products they will like.
   - Time to take a closer look at different contextual information.
   - To add new methods of recommendation!

CONFLICT IN THE TWO CHALLENGES
   To search a larger number of related customers (neighbors):
   The less time the algorithm spends searching for neighbors,
   the more scalable it is,
   but
   the lower the quality of recommendation is.
   BALANCE REQUIRED!!
Why Spatially Aware?
 The recommendation system considers:
   - the location of the active user
   - the preferences of other users who share the same location
 and from these produces the recommendation for the active user.
Project Objective
1. To decompose the users' space based on their location (Voronoi diagram).
2. To find the correlation among users within the same location (Pearson's correlation coefficient).
3. To recommend relevant items of interest to the active user (collaborative filtering).
Voronoi Diagram

pi : site points
q : free point
e : Voronoi edge
v : Voronoi vertex


[Figure: Voronoi diagram showing a site point pi, a free point q, a Voronoi edge e and a Voronoi vertex v]
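In symbols, and assuming Euclidean distance in the plane (the slides do not name the metric), the Voronoi cell of a site $p_i$ is the set of free points $q$ that are at least as close to $p_i$ as to any other site $p_j$:

```latex
V(p_i) = \{\, q \in \mathbb{R}^2 : \lVert q - p_i \rVert \le \lVert q - p_j \rVert \ \text{ for all } j \ne i \,\}
```

A point on a Voronoi edge e is equidistant from its two nearest sites, and a Voronoi vertex v is equidistant from three or more nearest sites.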
Everyday Example of Voronoi Diagram
The post office problem:
  Suppose that, in a city with several post offices, we would
  like to mark the service region of each post office by
  proximity. What are those regions?

  Let us solve this problem for a section of Kolkata.
Post offices in a section of Kolkata
Post offices as points in the plane
Proximity regions of post offices
Post-office services in Kolkata
GUI application developed in Java
Raster-scan concept
Project Implementation Part
DATA SET PROVIDED

   Users.dat file
   UserID | Gender | Age | Occupation | Zip-code
   *Contains information on around 6000 users

   Zips_sm.txt file
   Zip-code | City-name | longitude | latitude
   *Contains information on around 30000 cities

DATA SET PROVIDED (contd.)

   Movies.dat file
   MovieID::Title::Genres
   *Contains information on around 4000 movies

   Ratings.dat file
   UserID::MovieID::Rating::Timestamp
   *UserIDs range between 1 and 6040
   *MovieIDs range between 0 and 3592
   **Ratings are made on a 5-star scale
   **Each user has at least 20 ratings
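As an illustration of the Ratings.dat layout, the following minimal Java sketch splits one `UserID::MovieID::Rating::Timestamp` line into its four fields. The class name `RatingsParser`, the nested `Rating` type and its field names are hypothetical and are not taken from the project's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses lines of the form UserID::MovieID::Rating::Timestamp. */
final class RatingsParser {

    /** One rating record; field names are illustrative only. */
    static final class Rating {
        final int userId;
        final int movieId;
        final int stars;
        final long timestamp;

        Rating(int userId, int movieId, int stars, long timestamp) {
            this.userId = userId;
            this.movieId = movieId;
            this.stars = stars;
            this.timestamp = timestamp;
        }
    }

    /** Parses a single line; throws if the line does not have exactly four fields. */
    static Rating parseLine(String line) {
        String[] parts = line.trim().split("::");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 '::'-separated fields: " + line);
        }
        return new Rating(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]),
                Long.parseLong(parts[3]));
    }

    /** Parses every non-blank line of an already loaded file. */
    static List<Rating> parseAll(List<String> lines) {
        List<Rating> ratings = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                ratings.add(parseLine(line));
            }
        }
        return ratings;
    }
}
```

The lines could be loaded with `java.nio.file.Files.readAllLines` before being passed to `parseAll`.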
Decomposing the Users' Space Based on Their Location:
‘Voronoi Diagram’ Concept
Input:
   Users.dat file
   UserID | Gender | Age | Occupation | Zip-code

Program:
   Find_sites.java
   Threshold value = 15 users (say)

Output:
   Zip_cen.dat file
   Zip-code | user's count
   *Contains all Voronoi sites (i.e. zip-codes whose number of users >= the threshold value)

   all_Zips.dat file
   Zip-code
   *Contains all zip-codes
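The site-selection step can be sketched as below: count the users in each zip-code, then keep the zip-codes whose count reaches the threshold. This is a minimal sketch under assumptions, not the project's Find_sites.java; the class and method names (`SiteFinder`, `countUsersPerZip`, `selectSites`) are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of choosing Voronoi sites from per-zip user counts. */
final class SiteFinder {

    /** Counts users per zip-code; userZips holds the zip-code of each user record. */
    static Map<String, Integer> countUsersPerZip(List<String> userZips) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String zip : userZips) {
            counts.merge(zip, 1, Integer::sum);
        }
        return counts;
    }

    /** Keeps only the zip-codes whose user count is >= threshold (the Voronoi sites). */
    static Map<String, Integer> selectSites(Map<String, Integer> counts, int threshold) {
        Map<String, Integer> sites = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= threshold) {
                sites.put(e.getKey(), e.getValue());
            }
        }
        return sites;
    }
}
```

With the threshold of 15 mentioned on the slide, `selectSites(countUsersPerZip(zips), 15)` would give the contents of Zip_cen.dat, while the key set of the full count map corresponds to all_Zips.dat.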
Input:
   Zip_cen.dat file
   Zips_sm.txt file
   All_zips.dat file

Program:
   Find_zipcen_coords.java

Output:
   zip_cen_coordinates.dat
   Zip-code | longitude | latitude
   *Contains all Voronoi sites along with their longitude and latitude

   zip_coordinates.dat
   Zip-code | longitude | latitude
   *Contains all zip-codes along with their longitude and latitude
Input:
   zip_cen_coordinates.dat
   *Contains all Voronoi sites

   zip_coordinates.dat
   *Contains all zip-codes

Program:
   Find_zip_voronoi.java

Output:
   voronoi_zip_coordinates.dat
   Zip-code | Corresponding_zip_centre
   *Contains all zip-codes with their corresponding Voronoi centers
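One simple way to obtain the zip-code to Voronoi-center mapping is a brute-force nearest-site search, sketched below. It is an assumption that this matches what Find_zip_voronoi.java does; the sketch also assumes that longitude and latitude are treated as planar coordinates with Euclidean distance. The names `VoronoiAssigner`, `nearestSite` and `assignAll` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: assigns each zip-code to its nearest Voronoi site. */
final class VoronoiAssigner {

    /** Squared planar distance between two {longitude, latitude} pairs (assumed metric). */
    static double squaredDistance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }

    /** Returns the zip-code of the closest site, or null if there are no sites. */
    static String nearestSite(double[] point, Map<String, double[]> sites) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : sites.entrySet()) {
            double d = squaredDistance(point, e.getValue());
            if (d < bestDist) { // on a tie, the first site encountered is kept
                bestDist = d;
                best = e.getKey();
            }
        }
        return best;
    }

    /** Maps every zip-code to the site whose Voronoi cell contains it. */
    static Map<String, String> assignAll(Map<String, double[]> zips,
                                         Map<String, double[]> sites) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : zips.entrySet()) {
            result.put(e.getKey(), nearestSite(e.getValue(), sites));
        }
        return result;
    }
}
```

Finding the nearest site for a point is equivalent to locating the Voronoi cell that contains it, so the diagram itself never has to be constructed. The cost is one distance computation per (zip-code, site) pair.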
Finding the Correlation among Users within the Same Location:
‘Pearson’s correlation coefficient’
Input:
   A given Voronoi site (the Nth)

   voronoi_zip_coordinates.dat
   Zip-code | Corresponding_zip_centre

   Users.dat file
   UserID | Gender | Age | Occupation | Zip-code

Program:
   Find_Zipsite_users.java

Output:
   ZipsiteN.dat file
   Zip-code | Userid
   *Contains all the users lying inside the Nth Voronoi cell, along with their corresponding zip-codes
Input:
   ZipsiteN.dat file
   Zip-code | Userid
   *Contains all the users lying inside the Nth Voronoi cell, along with their corresponding zip-codes

   Ratings.dat file
   UserID::MovieID::Rating::Timestamp
   *UserIDs range between 1 and 6040
   *MovieIDs range between 0 and 3592
   **Ratings are made on a 5-star scale
   **Each user has at least 20 ratings

Program:
   Find_zipcen_ratings.java

Output:
   Zipsite_ratingsN.dat
   Userid | movieid | ratings
   *Contains the ratings of all the users within one Voronoi cell, on different movies
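Pearson’s correlation coefficient

$$C_{a,b} = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_{i=1}^{m} (r_{b,i} - \bar{r}_b)^2}}$$

$C_{a,b}$ = Pearson correlation between user a and user b
$r_{a,i}$ = rating of user ‘a’ on item ‘i’
$r_{b,i}$ = rating of user ‘b’ on item ‘i’
$\bar{r}_a$ = average rating of user ‘a’ on all the ‘m’ items
$\bar{r}_b$ = average rating of user ‘b’ on all the ‘m’ items

 The value of $C_{a,b}$ lies between -1 and 1.
 +1 / -1 = the two users' preferences are perfectly positively / negatively correlated.
 0 = no linear relationship between the two users' preferences.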
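The coefficient described above can be sketched in Java as follows. The sketch assumes both users have rated the same m items, passed as two arrays of equal length, and it returns 0 when either user's ratings have zero variance (a convention chosen here, since the coefficient is undefined in that case). The class name `Pearson` is hypothetical.

```java
/** Hypothetical sketch of Pearson's correlation between two users' ratings. */
final class Pearson {

    /**
     * ra[i] and rb[i] are the ratings of users a and b on the same item i.
     * Returns a value in [-1, 1], or 0 if either user's ratings have zero variance.
     */
    static double correlation(double[] ra, double[] rb) {
        if (ra.length != rb.length || ra.length == 0) {
            throw new IllegalArgumentException("Need two non-empty arrays of equal length");
        }
        int m = ra.length;

        // Average rating of each user over the m items.
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < m; i++) {
            meanA += ra[i];
            meanB += rb[i];
        }
        meanA /= m;
        meanB /= m;

        // Numerator (co-deviation) and the two sums of squared deviations.
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (int i = 0; i < m; i++) {
            double da = ra[i] - meanA;
            double db = rb[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0.0 || varB == 0.0) {
            return 0.0; // assumption: treat an undefined coefficient as "no correlation"
        }
        return cov / Math.sqrt(varA * varB);
    }
}
```

For example, ratings (1, 2, 3, 4, 5) against (2, 4, 6, 8, 10) give a coefficient of 1, while (1, 2, 3) against (3, 2, 1) give -1.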
Input:
   Zipsite_ratingsN.dat
   Userid | movieid | ratings
   *Contains the ratings of all the users within each of the Voronoi cells, on different movies

Program:
   Find_correlation.java

Output:
   CorrelationsN.dat
   Userid_a | userid_b | c(a,b)
   *Contains the correlation coefficient between all pairs of different users lying within each Voronoi cell
Recommending Relevant Items of Interest to the Active User:
‘Collaborative Filtering’
RECOMMENDATION ALGORITHM (Find_recommendation.java)

1. Start from the active user, u(i).
2. Search the ZipsiteN.dat files to find which zip cell the user belongs to.
3. From CorrelationsN.dat, filter out an array of highly correlated users
   (correlation > threshold value).
4. Combine:
   - the set of the user's highly rated movies (ratings of 4 or 5 out of 5)
   - the top two categories of the user's choice
   - the set of movies highly rated by the correlated users
5. Output: RECOMMENDED MOVIES
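The slide does not spell out how the three sets are combined, so the following Java sketch is one possible reading rather than the project's Find_recommendation.java. It assumes that a movie is recommended when a highly correlated user rated it 4 or 5 and its genre is one of the active user's top two genres, where those genres are the most frequent ones among the active user's own highly rated movies. It further assumes one genre per movie. All class, method and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the recommendation step (one reading of the slide). */
final class RecommenderSketch {
    static final int HIGH_RATING = 4; // "highly rated" means 4 or 5 out of 5

    /** Movies that were rated 4 or 5 in the given movieId -> rating map. */
    static Set<Integer> highlyRated(Map<Integer, Integer> ratings) {
        Set<Integer> out = new TreeSet<>();
        for (Map.Entry<Integer, Integer> e : ratings.entrySet()) {
            if (e.getValue() >= HIGH_RATING) out.add(e.getKey());
        }
        return out;
    }

    /** Up to two genres occurring most often among the movies (ties broken alphabetically). */
    static Set<String> topTwoGenres(Set<Integer> movies, Map<Integer, String> genreOf) {
        Map<String, Integer> counts = new HashMap<>();
        for (Integer movie : movies) {
            String genre = genreOf.get(movie);
            if (genre != null) counts.merge(genre, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        Set<String> top = new LinkedHashSet<>();
        for (int i = 0; i < Math.min(2, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    /**
     * correlations maps each other user in the same cell to c(active, other);
     * ratingsByUser maps userId -> (movieId -> rating).
     */
    static Set<Integer> recommend(int activeUser,
                                  Map<Integer, Map<Integer, Integer>> ratingsByUser,
                                  Map<Integer, Double> correlations,
                                  double threshold,
                                  Map<Integer, String> genreOf) {
        Map<Integer, Integer> none = Collections.emptyMap();
        Set<Integer> liked = highlyRated(ratingsByUser.getOrDefault(activeUser, none));
        Set<String> genres = topTwoGenres(liked, genreOf);
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, Double> e : correlations.entrySet()) {
            if (e.getValue() <= threshold) continue; // keep only highly correlated users
            for (Integer movie : highlyRated(ratingsByUser.getOrDefault(e.getKey(), none))) {
                if (genres.contains(genreOf.get(movie))) result.add(movie);
            }
        }
        return result;
    }
}
```

Movies the active user has already rated are deliberately not excluded here, because the testing procedure relies on the overlap between the recommended set and the movies the user has already seen.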
Testing
‘Experiments & Results’
Testing Algorithm

For an active user [u(i)]:
1. Take the set of all the movies seen and rated so far by the active user.
   Calculate the average of all the ratings on these movies (Avg2).
2. Take the set of movies generated by collaborative filtering and
   recommended to the active user.
3. Take the set of movies common to both of the above sets.
   Calculate the average of all the ratings on these common movies (Avg1).
4. Calculate the difference, diff(i) = Avg1(i) - Avg2(i).

Repeat this process for N users.
Store the results in a table.
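The per-user difference can be sketched as below. The sketch assumes that both averages are taken over the active user's own ratings, and it returns NaN when the recommended set shares no movie with the user's rated set. The names `DiffCalculator`, `averageOver` and `difference` are hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of diff(i) = Avg1(i) - Avg2(i) for one active user. */
final class DiffCalculator {

    /** Mean of the user's ratings over those of the given movies the user has rated. */
    static double averageOver(Map<Integer, Integer> userRatings, Set<Integer> movies) {
        double sum = 0.0;
        int n = 0;
        for (Integer movie : movies) {
            Integer rating = userRatings.get(movie);
            if (rating != null) { // only movies the user has actually rated contribute
                sum += rating;
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    /** userRatings maps movieId -> rating; recommended is the collaborative-filtering output. */
    static double difference(Map<Integer, Integer> userRatings, Set<Integer> recommended) {
        double avg2 = averageOver(userRatings, userRatings.keySet()); // all rated movies
        double avg1 = averageOver(userRatings, recommended);          // common movies only
        return avg1 - avg2;
    }
}
```

A positive difference means the user rated the recommended movies he or she had already seen higher than his or her overall average.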
Testing Continues..
From this table of differences, calculate:
 - the number of users with positive difference values (Pos_countu)
 - the number of users with negative difference values (Neg_countu)
 - the average of the absolute values of all these differences (Avgu)
 - the standard deviation (SDu)
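These four summary values can be computed in one pass over the table of differences, as in the sketch below. Two points are assumptions, because the slides do not specify them: the standard deviation is taken over the signed differences, and it is the population form (dividing by N). The class name `DiffSummary` and its field names are hypothetical.

```java
/** Hypothetical sketch of the summary statistics over the table of differences. */
final class DiffSummary {
    final int posCount;   // Pos_countu
    final int negCount;   // Neg_countu
    final double avgAbs;  // Avgu: mean of |diff|
    final double sd;      // SDu: population SD of the signed differences (assumed)

    DiffSummary(int posCount, int negCount, double avgAbs, double sd) {
        this.posCount = posCount;
        this.negCount = negCount;
        this.avgAbs = avgAbs;
        this.sd = sd;
    }

    static DiffSummary summarize(double[] diffs) {
        if (diffs.length == 0) {
            throw new IllegalArgumentException("No differences to summarize");
        }
        int pos = 0;
        int neg = 0;
        double sumAbs = 0.0;
        double sum = 0.0;
        for (double d : diffs) {
            if (d > 0) pos++;
            else if (d < 0) neg++;
            sumAbs += Math.abs(d);
            sum += d;
        }
        double mean = sum / diffs.length;
        double squares = 0.0;
        for (double d : diffs) {
            squares += (d - mean) * (d - mean);
        }
        double sd = Math.sqrt(squares / diffs.length);
        return new DiffSummary(pos, neg, sumAbs / diffs.length, sd);
    }
}
```

Users whose difference is exactly zero are counted in neither Pos_countu nor Neg_countu.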
Results !!
TEST CASE 1.
N = 242 (around 250) users
[Plot: rating difference (y-axis, from -3 to 3) against user-id (x-axis, from 1 to 241)]

RESULTS:
[A] Pos_countu = 181, Neg_countu = 61
[B] Avgu = 0.35702798
[C] SDu = 0.63110024
TEST CASE 2.
N = 488 (around 500) users
[Plot: rating difference (y-axis, from -3 to 3) against user-id (x-axis, from 1 to 481)]

RESULTS:
[A] Pos_countu = 373, Neg_countu = 115
[B] Avgu = 0.3604241
[C] SDu = 0.60164124
Conclusion
1. Pos_countu / Neg_countu ≈ 3 : 1, so out of every four
   users, three are being recommended relatively better
   movies by our algorithm than the ones they have already
   seen and rated.

2. Since Avgu ≈ 0.36 and SDu ≈ 0.6, even for the one user
   out of four who is not being recommended better movies,
   the average rating of the recommended set of movies
   differs from the average rating of all the movies he/she
   has seen so far by only about 0.36 ± 0.6.
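As a quick check of the 3 : 1 ratio, using the counts reported in the two test cases:

```latex
\frac{\text{Pos\_count}_u}{\text{Neg\_count}_u}: \qquad
\frac{181}{61} \approx 2.97 \quad (N = 242), \qquad
\frac{373}{115} \approx 3.24 \quad (N = 488)

\frac{\text{Pos\_count}_u}{N}: \qquad
\frac{181}{242} \approx 0.748, \qquad
\frac{373}{488} \approx 0.764
```

In both runs roughly three users in four have a positive difference.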
Thank you..

    Veer Chandra (085118)
    Ashis Senapati (085123)
    Suvodeep Majumder (085128)
    -All B-tech in Computer Sc. & Engg.
    Heritage Institute of Technology (Kolkata)

Spatially Aware Recommendation System

  • 1. Project (CS-893): Spatially Aware Recommendation Algorithm. Under the supervision of Prof. (Dr.) Prosenjit Gupta (Professor) and Prof. (Dr.) Subhashis Majumdar (Professor & HOD).
  • 2. Want to buy something online? The problem is: how to get enough information to make a decision? Product recommendations provide that information. But how to make the right decision out of an enormous amount of information?
  • 3. Introduction: Recommender System. - Applies statistical and knowledge-discovery techniques to the problem of making product recommendations. - Receives information from a customer about which products he/she is interested in, and recommends products that are likely to fit his/her needs. - Today, recommender systems are deployed on hundreds of different e-commerce websites, serving millions of customers.
  • 4. Collaborative Filtering. - Basic principle: find a subset of users whose tastes and preferences are similar to those of the active user, and offer recommendations based on that subset of users. - Assumptions: users with similar interests have common preferences, and vice versa; a sufficiently large number of user preferences is available.
  • 5. Past Works. Amazon.com: - Uses item-to-item collaborative filtering. - Focuses on finding similar items, not similar customers. Google Hotpot: - A recommendation engine for places. - Makes local recommendations more personal by recommending places based on ratings. Netflix.com: - A recommendation engine for movies. - Uses matrix factorization and so-called "temporal dynamics" to perform collaborative filtering.
  • 6. Important Challenges. 1. Scalability: the performance of the recommendation algorithm when searching for neighbors whose preferences are similar to the active user's, as the user base grows from tens of thousands of users to tens of millions of users.
  • 7. 2. Improving the quality of recommendation: consumers need recommendations they can trust to help them find products they will like. This takes time: to search a larger number of related customers (neighbors), to take a closer look at different contextual information, and to add new methods of recommendation. The two challenges are in conflict, so a balance is required: the less time the algorithm spends searching for neighbors, the more scalable it is, but the lower the quality of recommendation.
  • 8. Why Spatially Aware? The recommendation system considers the location of the active user and the preferences of other users who share the same location, and from these produces the recommendation for the active user.
  • 9. Project Objective: (1) to decompose the users' space based on their location (Voronoi diagram); (2) to find the correlation among users within the same location (Pearson's correlation coefficient); (3) to recommend relevant items of interest to the active user (collaborative filtering).
  • 10. Voronoi Diagram (figure). Legend: p_i: site points; q: free point; e: Voronoi edge; v: Voronoi vertex.
  • 11. Everyday example of a Voronoi diagram: the post office problem. Suppose that, in a city with several post offices, we would like to mark the service region of each post office by proximity. What are those regions? Let us solve this problem for a section of Kolkata.
  • 12. Post offices in a section of Kolkata
  • 14. Proximity Regions of Post offices
  • 19. Data set provided. Users.dat file: UserID | Gender | Age | Occupation | Zip-code (contains information on around 6000 users). Zips_sm.txt file: Zip-code | City-name | longitude | latitude (contains information on around 30000 cities).
  • 20. Data set provided (continued). Movies.dat file: MovieID::Title::Genres (contains information on around 4000 movies). Ratings.dat file: UserID::MovieID::Rating::Timestamp. UserIDs range between 1 and 6040; MovieIDs range between 0 and 3592; ratings are made on a 5-star scale; each user has at least 20 ratings.
  • 21. Decomposing the users' space based on their location: the "Voronoi Diagram" concept
  • 22. Input: Users.dat file (UserID | Gender | Age | Occupation | Zip-code). Program: Find_sites.java, with threshold value = 15 users (say). Outputs: Zip_cen.dat file (Zip-code | user's count), which contains all Voronoi sites, i.e. zip-codes whose number of users is >= the threshold value; and all_Zips.dat file (Zip-code), which contains all zip-codes.
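The site-selection step on this slide can be illustrated with a short sketch. It is not the authors' Find_sites.java, whose source is not shown; the class name `FindSitesSketch`, the method names, the `|` field delimiter, and the sample rows in `main` are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the site-selection step; not the authors' Find_sites.java. */
class FindSitesSketch {

    /** Counts users per zip-code from lines shaped like "UserID|Gender|Age|Occupation|Zip-code" (assumed delimiter). */
    static Map<String, Integer> countUsersPerZip(List<String> userLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : userLines) {
            String[] fields = line.split("\\|");
            if (fields.length < 5) {
                continue; // skip malformed lines
            }
            counts.merge(fields[4].trim(), 1, Integer::sum);
        }
        return counts;
    }

    /** Keeps only zip-codes whose user count is >= threshold; these act as Voronoi sites. */
    static Map<String, Integer> selectSites(Map<String, Integer> counts, int threshold) {
        Map<String, Integer> sites = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= threshold) {
                sites.put(e.getKey(), e.getValue());
            }
        }
        return sites;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> users = Arrays.asList(
                "1|F|1|10|48067", "2|M|56|16|70072", "3|M|25|15|48067");
        System.out.println(selectSites(countUsersPerZip(users), 2));
    }
}
```

With the slide's threshold, the call would be `selectSites(counts, 15)`; the keys of `counts` correspond to the contents of all_Zips.dat and the filtered map to Zip_cen.dat.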
  • 23. Inputs: Zip_cen.dat file, Zips_sm.txt file, All_zips.dat file. Program: Find_zipcen_coords.java. Outputs: zip_cen_coordinates.dat (Zip-code | longitude | latitude), which contains all Voronoi sites along with their longitude and latitude; and zip_coordinates.dat (Zip-code | longitude | latitude), which contains all zip-codes along with their longitude and latitude.
  • 24. Inputs: zip_cen_coordinates.dat (contains all Voronoi sites) and zip_coordinates.dat (contains all zip-codes). Program: Find_zip_voronoi.java. Output: voronoi_zip_coordinates.dat (Zip-code | Corresponding_zip_centre), which contains all zip-codes with their corresponding Voronoi centers.
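A point lies in the Voronoi cell of the site it is closest to, so the zip-code to center mapping can be obtained by a nearest-site search without constructing the diagram explicitly. The sketch below illustrates that idea under stated assumptions: it is not the authors' Find_zip_voronoi.java, the names `NearestSiteSketch`, `nearestSite`, and `assignAll` are hypothetical, and plain squared Euclidean distance on (longitude, latitude) is an assumed metric; the slides do not say which distance was used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of nearest-site assignment; not the authors' Find_zip_voronoi.java. */
class NearestSiteSketch {

    /** Returns the key of the site nearest to (lon, lat), or null if there are no sites. */
    static String nearestSite(double lon, double lat, Map<String, double[]> sites) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> site : sites.entrySet()) {
            double dx = site.getValue()[0] - lon;
            double dy = site.getValue()[1] - lat;
            double dist = dx * dx + dy * dy; // squared distance is enough for comparison
            if (dist < bestDist) {
                bestDist = dist;
                best = site.getKey();
            }
        }
        return best;
    }

    /** Maps every zip-code to its nearest site; coordinates are {longitude, latitude}. */
    static Map<String, String> assignAll(Map<String, double[]> zips, Map<String, double[]> sites) {
        Map<String, String> zipToSite = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> zip : zips.entrySet()) {
            zipToSite.put(zip.getKey(), nearestSite(zip.getValue()[0], zip.getValue()[1], sites));
        }
        return zipToSite;
    }

    public static void main(String[] args) {
        // Made-up coordinates, for illustration only.
        Map<String, double[]> sites = new LinkedHashMap<>();
        sites.put("A", new double[] {0.0, 0.0});
        sites.put("B", new double[] {10.0, 0.0});
        Map<String, double[]> zips = new LinkedHashMap<>();
        zips.put("z1", new double[] {2.0, 1.0});
        zips.put("z2", new double[] {9.0, -3.0});
        System.out.println(assignAll(zips, sites));
    }
}
```

This linear scan costs one distance computation per (zip-code, site) pair. For real geographic data a great-circle distance may be more appropriate than the Euclidean shortcut assumed here.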
  • 25. Finding the correlation among users within the same location: "Pearson's correlation coefficient"
  • 26. Inputs: a given Voronoi site; Users.dat file (UserID | Gender | Age | Occupation | Zip-code); voronoi_zip_coordinates.dat (Zip-code | Corresponding_zip_centre). Program: Find_Zipsite_users.java. Output: ZipsiteN.dat file (Zip-code | Userid), which contains all the users lying inside the Nth Voronoi cell, along with their corresponding zip-codes.
  • 27. Inputs: Ratings.dat file (UserID::MovieID::Rating::Timestamp; UserIDs range between 1 and 6040; MovieIDs range between 0 and 3592; ratings are made on a 5-star scale; each user has at least 20 ratings) and ZipsiteN.dat file (Zip-code | Userid), which contains all the users lying inside the Nth Voronoi cell, along with their corresponding zip-codes. Program: Find_zipcen_ratings.java. Output: Zipsite_ratingsN.dat (Userid | movieid | ratings), which contains the ratings of all the users within one Voronoi cell on different movies.
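  • 28. Pearson's correlation coefficient: $C_{a,b} = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_{i=1}^{m} (r_{b,i} - \bar{r}_b)^2}}$, where $C_{a,b}$ is the Pearson correlation between user a and user b; $r_{a,i}$ is the rating of user a on item i; $r_{b,i}$ is the rating of user b on item i; $\bar{r}_a$ is the average rating of user a on all the m items; and $\bar{r}_b$ is the average rating of user b on all the m items. The value of $C_{a,b}$ lies between -1 and 1. A value of 1 or -1 indicates positively or negatively correlated preferences between the users; 0 indicates that the users have no common set of preferences.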
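As a minimal sketch of how this coefficient might be computed, the method below takes the ratings of users a and b on the same m items as two parallel arrays and applies the standard Pearson formula. The class and method names are hypothetical, and returning 0 when either user's ratings have zero variance is an assumption of this sketch, since the formula is undefined in that case.

```java
/** Hypothetical sketch of the Pearson correlation between two users' ratings. */
class PearsonSketch {

    /**
     * ra[i] and rb[i] are the ratings of users a and b on the same item i.
     * Returns a value in [-1, 1], or 0.0 if either user has zero rating variance (assumed convention).
     */
    static double correlation(double[] ra, double[] rb) {
        if (ra.length != rb.length || ra.length == 0) {
            throw new IllegalArgumentException("rating arrays must be non-empty and of equal length");
        }
        int m = ra.length;
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < m; i++) {
            meanA += ra[i];
            meanB += rb[i];
        }
        meanA /= m;
        meanB /= m;

        double num = 0.0;   // sum of products of deviations
        double sqA = 0.0;   // sum of squared deviations for user a
        double sqB = 0.0;   // sum of squared deviations for user b
        for (int i = 0; i < m; i++) {
            double da = ra[i] - meanA;
            double db = rb[i] - meanB;
            num += da * db;
            sqA += da * da;
            sqB += db * db;
        }
        double denom = Math.sqrt(sqA) * Math.sqrt(sqB);
        return denom == 0.0 ? 0.0 : num / denom;
    }

    public static void main(String[] args) {
        // Made-up ratings, for illustration only.
        System.out.println(correlation(new double[] {1, 2, 3}, new double[] {2, 4, 6}));
        System.out.println(correlation(new double[] {1, 2, 3}, new double[] {3, 2, 1}));
    }
}
```

Two users whose ratings rise and fall together score close to 1, and users with opposite tastes score close to -1.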
  • 29. Input: Zipsite_ratingsN.dat (Userid | movieid | ratings), which contains the ratings of all the users within each of the Voronoi cells on different movies. Program: Find_correlation.java. Output: CorrelationsN.dat (Userid_a | userid_b | c(a,b)), which contains the correlation coefficient between all pairs of different users lying within each Voronoi cell.
  • 30. Recommending relevant items of interest to the active user: "Collaborative Filtering"
  • 31. Input: active user u(i). Step 1: search for the zip cell to which the user belongs (ZipsiteN.dat file). Step 2: filter out an array of highly correlated users (correlation > threshold value) from CorrelationsN.dat. Step 3: the recommendation algorithm (Find_recommendation.java) works from the set of the user's highly rated movies (having ratings 4 or 5 out of 5), the top two categories of the user's choice, and the set of movies highly rated by correlated users. Output: recommended movies.
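The neighbor-filtering part of this flow can be sketched as follows. This is a simplified illustration, not the authors' Find_recommendation.java: it keeps neighbors whose correlation with the active user exceeds a threshold and collects the movies those neighbors rated highly. The slide's "top two categories" step is deliberately left out, and the class name, method name, parameters, and in-memory map layout are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified sketch of the recommendation step (no genre filtering). */
class RecommendSketch {

    /**
     * correlationWithActive: neighbor user id -> correlation with the active user.
     * ratingsByUser: user id -> (movie id -> rating).
     * Returns movies rated >= minRating by neighbors whose correlation is > corrThreshold.
     */
    static Set<Integer> recommend(Map<Integer, Double> correlationWithActive,
                                  Map<Integer, Map<Integer, Integer>> ratingsByUser,
                                  double corrThreshold,
                                  int minRating) {
        Set<Integer> recommended = new TreeSet<>();
        for (Map.Entry<Integer, Double> neighbor : correlationWithActive.entrySet()) {
            if (neighbor.getValue() <= corrThreshold) {
                continue; // not correlated strongly enough
            }
            Map<Integer, Integer> ratings = ratingsByUser.get(neighbor.getKey());
            if (ratings == null) {
                continue; // no ratings known for this neighbor
            }
            for (Map.Entry<Integer, Integer> rating : ratings.entrySet()) {
                if (rating.getValue() >= minRating) {
                    recommended.add(rating.getKey());
                }
            }
        }
        return recommended;
    }

    public static void main(String[] args) {
        // Made-up data, for illustration only.
        Map<Integer, Double> corr = new HashMap<>();
        corr.put(2, 0.9);
        corr.put(3, 0.1);
        Map<Integer, Map<Integer, Integer>> ratings = new HashMap<>();
        Map<Integer, Integer> user2 = new HashMap<>();
        user2.put(10, 5);
        user2.put(11, 2);
        ratings.put(2, user2);
        System.out.println(recommend(corr, ratings, 0.5, 4));
    }
}
```

The sketch does not exclude movies the active user has already rated, which keeps it compatible with a test that looks at the overlap between recommended and already-rated movies.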
  • 33. Testing Algorithm. For an active user u(i): (1) take the set of all the movies seen and rated so far by the active user, and calculate the average of all the ratings on these movies (Avg2). (2) Take the set of movies generated by collaborative filtering and recommended to the active user. (3) Take the set of movies common to both of the above sets, and calculate the average of all the ratings on these common movies (Avg1). (4) Calculate the difference, diff(i) = Avg1(i) - Avg2(i). Repeat this process for N users and store the results in a table.
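A sketch of the per-user difference computation, under the reading that Avg2 is the user's average over all movies he or she has rated and Avg1 is the user's average over the rated movies that were also recommended. The class and method names are hypothetical, and returning NaN when there are no common movies is an assumption; the slides do not say how that case was handled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of diff(i) = Avg1(i) - Avg2(i) for one active user. */
class RatingDiffSketch {

    /** Mean of the user's ratings, restricted to movies in onlyThese (null means no restriction). */
    static double mean(Map<Integer, Integer> userRatings, Set<Integer> onlyThese) {
        double sum = 0.0;
        int n = 0;
        for (Map.Entry<Integer, Integer> e : userRatings.entrySet()) {
            if (onlyThese == null || onlyThese.contains(e.getKey())) {
                sum += e.getValue();
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n; // NaN when nothing qualifies (assumed convention)
    }

    /** userRatings: movie id -> rating by the active user; recommended: recommended movie ids. */
    static double diff(Map<Integer, Integer> userRatings, Set<Integer> recommended) {
        double avg1 = mean(userRatings, recommended); // ratings on movies both rated and recommended
        double avg2 = mean(userRatings, null);        // ratings on all movies rated so far
        return avg1 - avg2;
    }

    public static void main(String[] args) {
        // Made-up data, for illustration only.
        Map<Integer, Integer> ratings = new HashMap<>();
        ratings.put(1, 5);
        ratings.put(2, 3);
        ratings.put(3, 1);
        Set<Integer> recommended = new HashSet<>();
        recommended.add(1);
        recommended.add(99);
        System.out.println(diff(ratings, recommended));
    }
}
```

A positive result means the recommended movies the user had already seen were, on average, ones the user rated above his or her own overall average.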
  • 34. Testing (continued). From this table of differences, calculate: the number of users with positive difference values (Pos_countu); the number of users with negative difference values (Neg_countu); the average of the absolute values of all these differences (Avgu); and the standard deviation (SDu).
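The four summary figures can be computed from the table of differences roughly as sketched below. The class, field, and method names are hypothetical. The slides do not state whether the standard deviation is taken over the signed or the absolute differences, or whether it is the population or sample form; this sketch assumes the population standard deviation of the signed differences.

```java
/** Hypothetical sketch of the summary statistics over the table of differences. */
class DiffSummarySketch {
    final int posCount;   // number of users with diff > 0
    final int negCount;   // number of users with diff < 0
    final double avgAbs;  // mean of |diff|
    final double sd;      // population SD of the signed diffs (assumption)

    DiffSummarySketch(int posCount, int negCount, double avgAbs, double sd) {
        this.posCount = posCount;
        this.negCount = negCount;
        this.avgAbs = avgAbs;
        this.sd = sd;
    }

    static DiffSummarySketch summarize(double[] diffs) {
        if (diffs.length == 0) {
            throw new IllegalArgumentException("at least one difference is required");
        }
        int pos = 0;
        int neg = 0;
        double sumAbs = 0.0;
        double sum = 0.0;
        for (double d : diffs) {
            if (d > 0) {
                pos++;
            } else if (d < 0) {
                neg++;
            }
            sumAbs += Math.abs(d);
            sum += d;
        }
        double mean = sum / diffs.length;
        double sq = 0.0;
        for (double d : diffs) {
            sq += (d - mean) * (d - mean);
        }
        return new DiffSummarySketch(pos, neg, sumAbs / diffs.length, Math.sqrt(sq / diffs.length));
    }

    public static void main(String[] args) {
        // Made-up differences, for illustration only.
        DiffSummarySketch s = summarize(new double[] {1.0, -1.0, 2.0, 0.0});
        System.out.println(s.posCount + " " + s.negCount + " " + s.avgAbs + " " + s.sd);
    }
}
```

Differences that are exactly zero are counted as neither positive nor negative in this sketch.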
  • 35. Results. Test case 1: N = 242 (around 250) users. [Figure: rating difference (-3 to 3) plotted against user-id.] Results: [A] Pos_countu = 181, Neg_countu = 61. [B] Avgu = 0.35702798. [C] SDu = 0.63110024.
  • 36. Test case 2: N = 488 (around 500) users. [Figure: rating difference (-3 to 3) plotted against user-id.] Results: [A] Pos_countu = 373, Neg_countu = 115. [B] Avgu = 0.3604241. [C] SDu = 0.60164124.
  • 37. Conclusion. 1. (Pos_countu) / (Neg_countu) ≈ 3 : 1, so out of every four users, three are being recommended relatively better movies by our algorithm than those they have already seen and rated. 2. Since Avgu ≈ 0.3 and SDu ≈ 0.6, even for the one user out of four who is not being recommended better movies, the average rating of the recommended set of movies differs from the average rating of all the movies he has seen so far by only about 0.3 ± 0.6.
  • 38. Thank you. Veer Chandra (085118), Ashis Senapati (085123), Suvodeep Majumder (085128). All B.Tech in Computer Sc. & Engg., Heritage Institute of Technology (Kolkata).