Semantically-Enhanced Recommendation Algorithms

CCIA 2012

Victor Codina (vcodina@lsi.upc.edu)
Departament de Llenguatges i Sistemes Informàtics
Knowledge Engineering and Machine Learning Group

Luigi Ceccaroni (lceccaroni@BDigital.org)
Health Informatics
Personalized Computational Medicine
Semantically-Enhanced Recommendation Algorithms - Victor Codina & Luigi Ceccaroni   2
The value of recommendations
• Netflix: 2/3 of the movies rented are recommended
• Google News: 38% more clickthrough
• Amazon: 35% of sales come from recommendations

All these systems employ a Collaborative Filtering (CF) approach as a main component




But in most online services the CF approach does not work so well

Why?

Usually: lack of data
Other reasons: lack of context-awareness, domain-specific particularities


Outline



Cold-start problem and existing solutions



Proposed solution to overcome cold start


Evaluation and results




Outline

Cold-start problem and existing solutions
   o Cold-start problem
   o Hybrid recommenders

Proposed solution to overcome cold start

Evaluation and results




What is the cold-start problem?

• Narrow view
   o No ratings at all associated with items or users
• Wider view
   o Few ratings associated with items or users

• Cold-start scenarios:

                         Users: many ratings    Users: few ratings
   Items: many ratings   Normal                 New user
   Items: few ratings    New item               New user & item
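
The scenario grid above can be read as a lookup on rating counts. A minimal sketch, assuming a hypothetical cutoff `min_ratings` that separates "few" from "many" (the slides do not give one):

```python
def cold_start_scenario(user_ratings: int, item_ratings: int, min_ratings: int = 5) -> str:
    """Classify a (user, item) pair by how many ratings each side has.

    min_ratings is a hypothetical cutoff chosen for illustration,
    not a value taken from the slides.
    """
    user_cold = user_ratings < min_ratings
    item_cold = item_ratings < min_ratings
    if user_cold and item_cold:
        return "new user & item"
    if user_cold:
        return "new user"
    if item_cold:
        return "new item"
    return "normal"


# Example: a well-rated movie shown to a user with only two ratings.
scenario = cold_start_scenario(user_ratings=2, item_ratings=300)
```

In practice the cutoff would probably be tuned per data set, and separate cutoffs for users and items may be more appropriate.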


Typical solution: hybrid recommender combining CF with content-based filtering

              PAST SOLUTION                          MORE RECENT SOLUTION
Approach      Collaborative Filtering +              Collaborative Filtering +
              traditional content-based filtering    semantically-enhanced content-based filtering
New item
New user
Limitation    Lack of understanding and              Need for domain ontologies describing
              exploitation of domain semantics       explicit metadata relations
Outline

Cold-start problem and existing solutions

Proposed solution to overcome cold start
   o Acquisition of implicit semantics
   o Methods for semantics exploitation

Evaluation and results




Acquisition of implicit domain semantics

• Implicit semantics = semantic similarities among item attributes extracted from Vector Space Models (VSMs)
• Distributional hypothesis: "words that share similar contexts share similar meaning"

[Figure: attribute-by-context matrix with weights w_a,c (contexts are items or users) -> transformation (SVD, conditional probabilities) -> similarity measure (cosine, Jaccard) -> attribute semantic similarities]
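
The last stage of this pipeline can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it skips the transformation step (no SVD or conditional probabilities) and applies cosine similarity directly to the rows of a made-up attribute-by-context matrix. The names `cosine`, `attribute_similarities` and `toy` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length weight vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def attribute_similarities(matrix):
    """Pairwise similarities between attributes.

    matrix maps attribute -> list of weights w[a][c], one per context
    (a context is an item or a user, depending on the chosen VSM).
    """
    attrs = list(matrix)
    return {a: {b: cosine(matrix[a], matrix[b]) for b in attrs} for a in attrs}


# Made-up weights over four contexts, for illustration only.
toy = {
    "sci-fi": [1.0, 1.0, 0.0, 0.0],
    "space":  [1.0, 1.0, 0.0, 0.0],
    "comedy": [0.0, 0.0, 1.0, 1.0],
}
sims = attribute_similarities(toy)
```

Attributes that occur in the same contexts ("sci-fi" and "space" here) come out as highly similar, while attributes with disjoint contexts get a similarity of zero, which is the distributional hypothesis applied to item attributes.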


Semantic similarities are context-dependent

• Item-based
   o Similarity is measured in terms of how many items are similarly described by both attributes
• User-based
   o Similarity is measured in terms of how many users are similarly interested in both attributes

Example: top-5 tags similar to "Sci-Fi", calculated using cosine similarity without matrix transformation

   User-based               Item-based
   Scifi     0.79598457     Scifi      0.48631117
   future    0.6889696      aliens     0.42508063
   space     0.65459067     dystopia   0.34769687
   aliens    0.6110453      space      0.32580933
   robots    0.59465224     future     0.27470198
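
The effect of the context choice can be reproduced on toy data. The sketch below swaps in Jaccard similarity over sets instead of the cosine measure used for the numbers above, purely to keep the example short. The tag data and the helper names `jaccard` and `top_k` are made up for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k(target, contexts, k=5):
    """Rank the other attributes by Jaccard similarity to `target`.

    contexts maps attribute -> set of context ids (item ids or user ids).
    """
    scores = [(other, jaccard(contexts[target], ctx))
              for other, ctx in contexts.items() if other != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up data: which items carry each tag, and which users are interested in it.
items_with_tag = {"sci-fi": {1, 2, 3}, "space": {3}, "aliens": {1, 2}}
users_liking_tag = {"sci-fi": {"u1", "u2"}, "space": {"u1", "u2"}, "aliens": {"u2"}}

item_based = top_k("sci-fi", items_with_tag)    # ranks "aliens" first
user_based = top_k("sci-fi", users_liking_tag)  # ranks "space" first
```

The same target tag gets a different neighbour ranking in each view, mirroring how the two columns of the example differ.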

Exploitation of implicit semantics in content-based filtering

[Figure: two-stage pipeline.
 USER MODELING: user ratings r_u,i and item attributes w_i,a (attribute relevance in [0,1]) -> user modeling technique -> user interests w_u,a (degree of interest in [-1,1]).
 PREDICTION GENERATION: user interests -> 1. profile expansion -> expanded user interests; (expanded) user interests and item attributes w_i,a -> vector-based matching or 2. semantic matching -> score.]

Method 1: User profile expansion by constrained spreading activation

Attribute semantic similarities:

        a1    a2    a3    a4    a5
   a1   1     0.5   0.2   0     0.3
   a2   0.5   1     0.3   0     0.1
   a3   0.2   0.3   1     0.7   0.8
   a4   0     0     0.7   1     0
   a5   0.3   0.1   0.8   0     1

Similarities can be symmetric or not, depending on the similarity measure used.

                                    a1     a2    a3     a4   a5
   User interests [-1,1]            0      0.5   -0.1   0    0
   Expanded user interests [-1,1]   0.25   0.5   0.05   0    0

a2 is the activated node. Activation spreads from a2 to a1 (similarity 0.5), which becomes a new interest, and to a3 (similarity 0.3), whose weight is updated.

Method hyper-parameters:
   - activation threshold = 0.25
   - fan-out threshold = 0.25
   - max. expansion levels = 1
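
The worked example can be reproduced with a short sketch. This is one plausible reading of the slide, not the authors' code. It assumes a node is activated when the magnitude of its interest reaches the activation threshold, that activation only spreads along links whose similarity reaches the fan-out threshold, and that a neighbour's weight is updated additively by interest times similarity. The function name `expand_profile` is hypothetical.

```python
def expand_profile(interests, sim, activation_threshold=0.25, fanout_threshold=0.25):
    """One level of constrained spreading activation over attribute similarities.

    interests: interest weights in [-1, 1], one per attribute.
    sim: square matrix, sim[a][b] = similarity from attribute a to attribute b.
    """
    expanded = list(interests)
    for a, weight in enumerate(interests):
        # Assumption: a node fires when |interest| reaches the activation threshold.
        if abs(weight) < activation_threshold:
            continue
        for b, s in enumerate(sim[a]):
            # Assumption: activation only travels along sufficiently strong links.
            if b == a or s < fanout_threshold:
                continue
            # Assumption: additive update, clipped to the interest range [-1, 1].
            expanded[b] = max(-1.0, min(1.0, expanded[b] + weight * s))
    return expanded


# Similarity matrix and user interests from the slide (attributes a1..a5).
SIM = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
interests = [0.0, 0.5, -0.1, 0.0, 0.0]
expanded = expand_profile(interests, SIM)
```

Under these assumptions only a2 fires, a1 gains 0.5 × 0.5 = 0.25, a3 moves from -0.1 to -0.1 + 0.5 × 0.3 = 0.05, and a5 is left untouched because its link to a2 (0.1) is below the fan-out threshold. A single pass over the original interests corresponds to max. expansion levels = 1.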

Method 2: Prediction generation by pair-wise semantic matching strategies

Attribute semantic similarities:

        a1    a2    a3    a4    a5
   a1   1     0.5   0.2   0     0.3
   a2   0.5   1     0.3   0     0.1
   a3   0.2   0.3   1     0.7   0.8
   a4   0     0     0.7   1     0
   a5   0.3   0.1   0.8   0     1

Similarities can be symmetric or not, depending on the similarity measure used.

                            a1   a2    a3     a4   a5
   Item attributes [0,1]    0    0.3   0      0    0.7
   User interests [-1,1]    0    0.5   -0.1   0    0

Pairs involved: a2-a2 is a direct match (similarity 1); the semantic matches are a3-a2 (0.3), a2-a5 (0.1) and a3-a5 (0.8).

Result by approach (using the product as aggregation function):
   - Vector-based matching: 0.15
   - Best-pairs matching: 0.15 - 0.056 = 0.094
   - All-pairs matching: 0.15 - 0.056 - 0.009 + 0.035 = 0.12

Method hyper-parameter:
   - similarity threshold = 0.05
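
The three scores in this example (0.15, 0.094 and 0.12) can be checked with a small sketch. It assumes each pair contributes user interest × item weight × similarity, and that best-pairs keeps, for each user-interest attribute, only its single most similar item attribute. The slide does not spell out the pairing direction, so that choice and all function names are assumptions.

```python
def vector_match(user, item):
    """Plain dot product: only identical attributes contribute."""
    return sum(u * i for u, i in zip(user, item))


def candidate_pairs(user, item, sim, threshold):
    """(user attr, item attr, similarity) triples with non-zero weights and similarity >= threshold."""
    return [(a, b, sim[a][b])
            for a, u in enumerate(user) if u != 0
            for b, i in enumerate(item) if i != 0 and sim[a][b] >= threshold]


def all_pairs_match(user, item, sim, threshold=0.05):
    """Sum the contribution of every admissible pair."""
    return sum(user[a] * item[b] * s
               for a, b, s in candidate_pairs(user, item, sim, threshold))


def best_pairs_match(user, item, sim, threshold=0.05):
    """Assumption: each user attribute is paired with its most similar item attribute."""
    best = {}
    for a, b, s in candidate_pairs(user, item, sim, threshold):
        if a not in best or s > best[a][1]:
            best[a] = (b, s)
    return sum(user[a] * item[b] * s for a, (b, s) in best.items())


# Similarity matrix, user interests and item attributes from the slide (a1..a5).
SIM = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
user = [0.0, 0.5, -0.1, 0.0, 0.0]
item = [0.0, 0.3, 0.0, 0.0, 0.7]

scores = {
    "vector": vector_match(user, item),
    "best_pairs": best_pairs_match(user, item, SIM),
    "all_pairs": all_pairs_match(user, item, SIM),
}
```

The negative interest in a3 lowers the score through its strong similarity to a5 (0.8), an effect the plain vector match cannot capture because a3 and a5 are different attributes.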

Outline

Cold-start problem and existing solutions

Proposed solution to overcome cold start

Evaluation and results
   o MovieLens data set
   o Experimental results



Offline experimentation with a MovieLens data set extended with movie metadata

Data set statistics after pruning unusual attribute values and movies with few attributes:

      Users                       2113
      Movies                      1646
      Attributes                  4 (genres, directors, actors and tags)
      Attribute values            2886
      Ratings per user (avg.)     239
      Rating density              14%
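
As a quick consistency check on these statistics, rating density is usually defined as the number of observed ratings divided by the number of possible (user, movie) pairs. The sketch below assumes that definition and uses the rounded per-user average from the table, so the result is only approximate.

```python
users, movies = 2113, 1646
avg_ratings_per_user = 239  # rounded average taken from the table

# Density = observed ratings / possible (user, movie) pairs.
approx_ratings = users * avg_ratings_per_user
density = approx_ratings / (users * movies)

print(f"approximate rating density: {density:.1%}")
```

The value lands between 14% and 15%, in line with the density reported in the table.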




Evaluation of methods for semantics exploitation

Baseline = Traditional CB using hybrid user modeling technique
Expansion-CB = CSA-same + User-based + raw frequencies
Matching-CB = Best-pairs-same + User-based + Forbes-Zhu method
BPR-MF = CF based on matrix factorization optimized for ranking




Conclusions

• The cold-start problem can be critical
   o Above all in systems with small databases
• Existing solutions have some limitations
   o Traditional CB cannot solve the new-user scenario
   o Semantically-enhanced CB requires domain ontologies to work
• Exploitation of implicit semantics can be a good alternative to overcome the cold-start problem
   o User-based semantics is more effective than item-based
   o The best-pairs semantic matching method is more effective than profile expansion based on spreading activation

Future work

• Experimenting with data sets of different domains
   o Million Song data set
• Extending the study of Vector Space Models
   o Probabilistic similarity measures (e.g. Kullback-Leibler)
• Applying the same approach to enhance cold-start performance of context-aware recommenders
   o Implicit semantics of contextual conditions can also be acquired from user data
   o Similarly, pair-wise semantic strategies can be employed to enhance contextual user modeling



 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 

Kürzlich hochgeladen (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

Semantically-Enhanced Recommendation Algorithms

  • 1. Semantically-Enhanced Recommendation Algorithms CCIA 2012 Victor Codina & Luigi Ceccaroni vcodina@lsi.upc.edu lceccaroni@BDigital.org Departament de Llenguatges i Sistemes Informàtics Health Informatics Knowledge Engineering and Machine Learning Group Personalized Computational Medicine
  • 2. Semantically-Enhanced Recommendation Algorithms - Victor Codina & Luigi Ceccaroni 2
  • 3. The value of recommendations
      - Netflix: 2/3 of the movies rented are recommended
      - Google News: 38% more clickthrough
      - Amazon: 35% of sales come from recommendations
    All these systems employ a Collaborative Filtering (CF) approach as a main component.
  • 4. But in most online services the CF approach does not work so well. Why?
      - Usually: lack of data
      - Other reasons: lack of context-awareness, domain-specific particularities
  • 5. Outline
      - Cold-start problem and existing solutions
      - Proposed solution to overcome cold start
      - Evaluation and results
  • 6. Outline
      - Cold-start problem and existing solutions
          o Cold-start problem
          o Hybrid recommenders
      - Proposed solution to overcome cold start
      - Evaluation and results
  • 7. What is the cold-start problem?
      - Narrow view
          o No ratings at all associated to items or users
      - Wider view
          o Few ratings associated
    Cold-start scenarios:
                              Users: many ratings   Users: few ratings
      Items: many ratings     Normal                New user
      Items: few ratings      New item              New user & item
  • 8. Typical solution: hybrid recommender combining CF with content-based filtering
      - Past solution: Collaborative Filtering + traditional content-based filtering
          o Scenario addressed: new item
          o Limitation: lack of understanding and exploitation of domain semantics
      - More recent solution: Collaborative Filtering + semantically-enhanced content-based filtering
          o Scenario addressed: new user
          o Limitation: the need for domain ontologies describing explicit metadata relations
  • 9. Outline
      - Cold-start problem and existing solutions
      - Proposed solution to overcome cold start
          o Acquisition of implicit semantics
          o Methods for semantics exploitation
      - Evaluation and results
  • 10. Acquisition of implicit domain semantics
      - Implicit semantics = semantic similarities among item attributes extracted from Vector Space Models (VSMs)
      - Distributional hypothesis: “words that share similar contexts share similar meaning”
    Pipeline: attribute-by-context matrix (context = items or users, weights w_{a,c}) -> transformation (SVD, conditional probabilities) -> similarity measure (cosine, Jaccard) -> attribute semantic similarities
  • 11. Semantic similarities are context-dependent
      - Item-based
          o Similarity is measured in terms of how many items are similarly described by both attributes
      - User-based
          o Similarity is measured in terms of how many users are similarly interested in both attributes
    Example: top-5 tags similar to “Sci-Fi”, calculated using cosine similarity without matrix transformation
      User-based             Item-based
      Scifi   0.79598457     Scifi     0.48631117
      future  0.6889696      aliens    0.42508063
      space   0.65459067     dystopia  0.34769687
      aliens  0.6110453      space     0.32580933
      robots  0.59465224     future    0.27470198
  • 12. Exploitation of implicit semantics in content-based filtering
      - User modeling: item attributes (items × attributes, attribute relevance w_{i,a} in [0,1]) and user ratings r_{u,i} feed a user modeling technique that outputs user interests w_{u,a} (degree of interest in [-1,1])
      - Prediction generation: the item score is computed by matching user interests against item attributes
      - Semantics exploitation: (1) profile expansion, which produces expanded user interests; (2) semantic matching, as an alternative to vector-based matching
  • 13. Method 1: User profile expansion by constrained spreading activation
    Attribute semantic similarities:
            a1     a2     a3     a4     a5
      a1    1      0.5    0.2    0      0.3
      a2    0.5    1      0.3    0      0.1
      a3    0.2    0.3    1      0.7    0.8
      a4    0      0      0.7    1      0
      a5    0.3    0.1    0.8    0      1
    User interests [-1,1]:
            a1     a2     a3     a4     a5
            0      0.5    -0.1   0      0
    Expanded user interests [-1,1]:
            a1     a2     a3     a4     a5
            0.25   0.5    0.05   0      0
    a2 is the activated node; it spreads to a1 (similarity 0.5, new interest) and to a3 (similarity 0.3, weight updated).
    Similarities can be symmetric or not depending on the similarity measure used.
    Method hyper-parameters:
      - activation threshold = 0.25
      - fan-out threshold = 0.25
      - max. expansion levels = 1
  • 14. Method 2: Prediction generation by pair-wise semantic matching strategies
    Attribute semantic similarities:
            a1     a2     a3     a4     a5
      a1    1      0.5    0.2    0      0.3
      a2    0.5    1      0.3    0      0.1
      a3    0.2    0.3    1      0.7    0.8
      a4    0      0      0.7    1      0
      a5    0.3    0.1    0.8    0      1
    Item attributes [0,1]:
            a1     a2     a3     a4     a5
            0      0.3    0      0      0.7
    User interests [-1,1]:
            a1     a2     a3     a4     a5
            0      0.5    -0.1   0      0
    Pair similarities between item and user attributes: a2-a2 (1, direct matching), a5-a3 (0.8), a2-a3 (0.3), a5-a2 (0.1)
    Results (using the product as aggregation function):
      Vector-based matching:   0.15
      Best-pairs matching:     0.15 - 0.056 = 0.094
      All-pairs matching:      0.15 - 0.056 - 0.009 + 0.035 = 0.12
    Similarities can be symmetric or not depending on the similarity measure used.
    Method hyper-parameter:
      - similarity threshold = 0.05
  • 15. Outline
      - Cold-start problem and existing solutions
      - Proposed solution to overcome cold start
      - Evaluation and results
          o MovieLens data set
          o Experimental results
  • 16. Offline experimentation with a MovieLens data set extended with movie metadata
    Data set statistics after pruning unusual attribute values and movies with few attributes:
      Users                      2113
      Movies                     1646
      Attributes                 4 (genres, directors, actors and tags)
      Attribute values           2886
      Ratings per user (avg.)    239
      Rating density             14%
  • 17. Evaluation of methods for semantics exploitation
      - Baseline = Traditional CB using hybrid user modeling technique
      - Expansion-CB = CSA-same + User-based + raw frequencies
      - Matching-CB = Best-pairs-same + User-based + Forbes-Zhu method
      - BPR-MF = CF based on matrix factorization optimized for ranking
  • 18. Conclusions
      - The cold-start problem can be very critical
          o Above all in systems with small databases
      - Existing solutions have some limitations
          o Traditional CB cannot solve the new-user scenario
          o Semantically-enhanced CB requires domain ontologies to work
      - Exploitation of implicit semantics can be a good alternative to overcome the cold-start problem
          o User-based semantics is more effective than item-based
          o The best-pairs semantic matching method is more effective than profile expansion based on spreading activation
  • 19. Future work
      - Experimenting with data sets of different domains
          o Million Song data set
      - Extending the study of Vector Space Models
          o Probabilistic similarity measures (e.g. Kullback-Leibler)
      - Applying the same approach to enhance the cold-start performance of context-aware recommenders
          o Implicit semantics of contextual conditions can also be acquired from user data
          o Similarly, pair-wise semantic strategies can be employed to enhance contextual user modeling
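
The acquisition step on slides 10 and 11 (attribute vectors over a context of items or users, compared with cosine similarity and no matrix transformation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `attribute_similarities`, the dictionary layout and the toy profiles are all assumptions made for the example.

```python
import math

def attribute_similarities(context_profiles):
    """Cosine similarity between attributes.

    context_profiles maps a context id (an item or a user) to a dict
    {attribute: weight}. This layout is an assumption for the sketch.
    """
    # One sparse vector per attribute, indexed by context id.
    vectors = {}
    for ctx, profile in context_profiles.items():
        for attr, weight in profile.items():
            vectors.setdefault(attr, {})[ctx] = weight

    def cosine(u, v):
        dot = sum(w * v[c] for c, w in u.items() if c in v)
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    return {a: {b: cosine(vectors[a], vectors[b]) for b in vectors}
            for a in vectors}

# Hypothetical toy data: item profiles hold attribute relevance,
# user profiles hold degrees of interest.
item_profiles = {
    "m1": {"sci-fi": 1.0, "space": 1.0},
    "m2": {"sci-fi": 1.0, "aliens": 1.0},
    "m3": {"sci-fi": 1.0, "space": 1.0, "aliens": 1.0},
    "m4": {"romance": 1.0},
}
user_profiles = {
    "u1": {"sci-fi": 0.9, "space": 0.8, "romance": -0.5},
    "u2": {"sci-fi": 0.7, "space": 0.6, "aliens": 0.2},
    "u3": {"romance": 0.9, "aliens": -0.3},
}

item_based = attribute_similarities(item_profiles)
user_based = attribute_similarities(user_profiles)
print(item_based["sci-fi"]["space"], user_based["sci-fi"]["space"])
```

Running the same function on item profiles and on user profiles gives different similarity values for the same attribute pair, which is the context dependence that slide 11 illustrates.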

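The worked example on slide 13 (profile expansion by constrained spreading activation) can be reproduced with a short sketch. The slide gives only the matrices, the thresholds and the result, so several details here are assumptions: the function name `expand_profile`, comparing the absolute interest against the activation threshold, adding `interest * similarity` to the target node, and clamping to [-1, 1].

```python
# Similarity matrix and user interests copied from slide 13 (a1..a5).
SIM = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
USER_INTERESTS = [0, 0.5, -0.1, 0, 0]

def expand_profile(interests, sim, activation_threshold=0.25,
                   fanout_threshold=0.25, max_levels=1):
    expanded = list(interests)
    # Nodes whose interest magnitude reaches the activation threshold
    # start spreading (use of the absolute value is an assumption).
    frontier = [a for a, w in enumerate(interests)
                if abs(w) >= activation_threshold]
    visited = set(frontier)
    for _ in range(max_levels):
        level_values = list(expanded)  # snapshot so update order is irrelevant
        next_frontier = []
        for src in frontier:
            for dst, s in enumerate(sim[src]):
                if dst == src or s < fanout_threshold:
                    continue
                # Spread the source interest weighted by the similarity,
                # clamped to the [-1, 1] interest range (assumption).
                updated = expanded[dst] + level_values[src] * s
                expanded[dst] = max(-1.0, min(1.0, updated))
                if dst not in visited and abs(expanded[dst]) >= activation_threshold:
                    visited.add(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return expanded

print([round(w, 3) for w in expand_profile(USER_INTERESTS, SIM)])
```

With the slide's hyper-parameters, only a2 is activated and it spreads to a1 and a3, which matches the expanded profile shown on the slide (0.25, 0.5, 0.05, 0, 0).
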
Editor's notes

  1. I am a PhD student in the KEMLG group at UPC, and my thesis advisor is Luigi Ceccaroni. Broadly speaking, my research consists of studying new methods to improve the performance of existing recommendation techniques by exploiting the implicit semantics of the domain.
  2. Since the arrival of the internet we have had an excess of information within reach, which often makes it hard to find the products and services that best fit our preferences. Information filtering systems, or personalized recommender systems, appeared to cover this need, and they have increasingly become an essential component of many online services, mainly in the entertainment industry.
  3. Offering good recommendations to users normally improves their satisfaction and increases sales or usage of the system. Clear success stories are companies with large databases such as Netflix, Google and Amazon. The recommendation technique that currently predominates is collaborative filtering (CF), since under optimal conditions it produces the most accurate recommendations. The main idea of this technique is to recommend items that were liked by other users with interests similar to ours.
  4. The problem is that this good performance is usually not reproduced in most online services. Why? The main reason is the lack of user data. One of the main limitations of CF-based methods is that their performance is tightly tied to the amount of data available for generating predictions, that is, to the number of users and ratings available. Lack of context-awareness and particularities of the domain where the recommender is applied can also cause poor performance.
  5. Our work focuses on the lack-of-information problem, usually known as the cold-start problem. I will begin by describing this problem and the solutions that currently exist in more detail. Then I will present the solution we propose. Finally, I will show the main results of our evaluation.
  6. Next I will explain the cold-start problem and the main solutions currently applied.
  7. In the literature, the cold-start problem is defined from two different points of view. Some consider it cold start only when users or items are completely new, that is, when there is not yet any implicit or explicit rating associated with them. Others also count as cold start the users or items that have few associated ratings. We adopt this wider view of the problem. When predicting how useful an item is for a particular user, we can find ourselves in three cold-start scenarios: the new-item scenario, when we have only a few ratings of the item; the new-user scenario, when we have only a few ratings from the user; and the most extreme scenario, when there are few ratings for both the item and the user.
  8. The most common way to avoid poor performance in cold-start scenarios is to use a hybrid system that combines collaborative filtering with content-based recommendation. This second family of techniques uses the textual descriptions or metadata of the items to generate recommendations. This solves the new-item scenario, since it does not depend on other users having rated the item before. The new-user scenario, in contrast, remains a problem, because building an accurate user profile requires the user to provide a certain number of ratings. In addition, the traditional method has the limitation that domain semantics are not taken into account during prediction. To address this limitation, the family of semantic recommenders appeared more recently; they are characterized by exploiting the explicit semantics of the domain, usually represented in the form of ontologies. Several studies have shown that semantics can also improve performance in the new-user scenario, since it allows user profiles to be completed. Even so, current semantic recommenders depend entirely on the existence of domain ontologies, and these are not always available.
  9. To address this limitation of semantic recommenders, in this work we have developed and evaluated methods for acquiring and exploiting the implicit semantics of the domain.
  10. By implicit domain semantics we mean the semantic similarities between the attributes that describe the items, computed from distributional models, also known as vector space models. These models are based on the distributional hypothesis, which assumes that terms or words that frequently appear in similar contexts are semantically related. We have generalized this hypothesis so that it can be used to compute semantic relations between attributes, whether they are tags or movie actors. In particular, we use as corpus the normalized profiles of either the items or the users, which, as you will see next, lead to quite different results. Once the corpus is selected, a transformation can be applied to the corresponding matrix (such as a dimensionality reduction), and finally the similarity between attributes is computed by comparing the co-occurrence vectors of each attribute. In the experiments we used two dimensionality reduction techniques and the cosine measure.
  11. As I said before, the resulting semantic similarities differ depending on the context used as corpus. When items are used as the co-occurrence context, the similarity between two attributes is measured in terms of how many items contain both attributes. When users are used, the similarity is measured in terms of how many users are interested in both attributes. As you can see in the example, the computed similarities vary with the context, both in value and in order.
  12. This diagram shows the main components of content-based recommendation. On one side is the user modeling component, which builds the user profile over the domain attributes from the user's ratings of the domain items and from the item descriptions. On the other side is the prediction component, which generates the score for a particular item by computing the match between the user profile and the item profile. In this work we implemented two methods to exploit implicit semantics: the user profile expansion method, which modifies the original interest vector with new information that is then used to compute the match; and the semantic matching method, which incorporates the semantic relations between attributes during that computation.
  13. This slide shows a simple example of how the user profile expansion algorithm we developed works; it is based on a constrained spreading activation (CSA) technique. On the left you can see the matrix of semantic similarities between the domain attributes. In this example there are 5 attributes. On the right is a user profile over the 5 attributes. A positive value means that the user is interested in the attribute, and a negative value the opposite. The expansion method has 3 hyper-parameters that regulate the degree of propagation: the activation threshold, which sets the degree of interest required for propagation to start from a node; the fan-out threshold, which sets the minimum similarity between attributes for propagating to a node; and the maximum number of expansion levels from the initial node. Given the hyper-parameter values shown, in this example propagation would only be activated from attribute 2, since it is the only one that exceeds the activation threshold. From this node the value would be propagated to attributes 1 and 3, since their similarity values exceed the fan-out threshold. Because the maximum number of expansion levels is 1, the profile expansion ends here. As a result, the user profile is completed with 1 new positive interest and a recomputed degree of interest in attribute 3.
  14. Now I will explain how the semantic matching method works, using the same example, so the similarity matrix and the user profile are the same. Here the goal is to incorporate the semantic relations between attributes while computing the prediction. I start by showing how the traditional method based on the vector product works. In this case, the only attribute present in both profiles is attribute 2, so the prediction is computed as the product of the corresponding weights. If instead of the traditional method we use the best-pairs semantic matching strategy, then in addition to attribute 2 the match between attribute 5 of the item and attribute 3 of the user is also considered, since for each non-zero attribute of the item profile this strategy takes the most similar attribute of the user profile. The other semantic strategy we studied is all-pairs, in which all semantic matches are considered. In both strategies the contribution of each match is weighted by the similarity between the attributes. To avoid matches that are too weak, the strategies use a similarity threshold that sets the minimum similarity for a pair to be considered in the computation.
  15. Next I will show the main results of the evaluation of the proposed methods.
  16. For the evaluation we used one of the available data sets of the MovieLens system, which includes metadata about the movies. These are the main statistics of the data set after filtering out movies with little metadata. In particular, we used 4 different attributes in the experiment: … with a total of 2886 distinct attribute values.
  17. This bar chart shows the main results of the proposed semantics exploitation methods. It shows the percentage improvement over the baseline in terms of ranking precision. Here the baseline is a traditional content-based method, that is, one that does not use domain semantics. The black bars correspond to the global results, taking all users and items into account. The red bars correspond to the results for new users only, and the green bars to new items. To simulate the cold-start scenarios we selected the 10% of users and items with the fewest ratings. As for the evaluated algorithms, Expansion-CB corresponds to the user profile expansion method, Matching-CB to the best-pairs semantic matching method, and BPR-MF to a current CF algorithm optimized for generating rankings. For each algorithm we selected the configuration with the best global performance. The results show that the semantic matching method is more effective than the profile expansion method. Compared with the collaborative filtering algorithm, Matching-CB performs better for both new users and new items. In fact, the collaborative recommender performs worse than the baseline in the new-item scenario, which is to be expected given that the baseline is a content-based algorithm. Finally, in terms of global performance the two methods are quite close, with collaborative filtering slightly ahead.
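
The three matching strategies compared on slide 14 and in note 14 can be sketched as follows, using the slide's own matrices. The function names are hypothetical, and the handling of ties and of the threshold comparison (`>=`) are assumptions; the slide only states that the product is used as aggregation function and that the similarity threshold is 0.05.

```python
# Similarity matrix, item attributes and user interests from slide 14 (a1..a5).
SIM = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
ITEM = [0, 0.3, 0, 0, 0.7]
USER = [0, 0.5, -0.1, 0, 0]

def vector_match(item, user):
    # Traditional matching: only attributes present in both profiles count.
    return sum(wi * wu for wi, wu in zip(item, user))

def all_pairs_match(item, user, sim, threshold=0.05):
    # Every (item attribute, user attribute) pair above the similarity
    # threshold contributes, weighted by its similarity.
    score = 0.0
    for a, wi in enumerate(item):
        for b, wu in enumerate(user):
            if wi and wu and sim[a][b] >= threshold:
                score += wi * wu * sim[a][b]
    return score

def best_pairs_match(item, user, sim, threshold=0.05):
    # Each non-zero item attribute is paired only with its most similar
    # non-zero user attribute.
    score = 0.0
    for a, wi in enumerate(item):
        if not wi:
            continue
        candidates = [b for b, wu in enumerate(user)
                      if wu and sim[a][b] >= threshold]
        if candidates:
            best = max(candidates, key=lambda b: sim[a][b])
            score += wi * user[best] * sim[a][best]
    return score

print(round(vector_match(ITEM, USER), 3))
print(round(best_pairs_match(ITEM, USER, SIM), 3))
print(round(all_pairs_match(ITEM, USER, SIM), 3))
```

Under these assumptions the sketch yields 0.15 for vector-based matching, 0.094 for best-pairs and 0.12 for all-pairs, which agrees with the figures on slide 14.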