Content-Based approaches for Cold-Start Job Recommendations

ACM RecSys Challenge 2017
Lunatic Goats @PoliMi
M. Bianchi, F. Cesaro, F. Ciceri, M. Dagrada, A. Gasparin, D. Grattarola, I. Inajjar, A. M. Metelli, L. Cella

Published in: Data & Analytics

  1. Title slide: Content-Based approaches for Cold-Start Job Recommendations, ACM RecSys Challenge 2017, Lunatic Goats @PoliMi
  2. Task Outline
     ● Cold-start recommendation scenario:
       ○ job posting recommendations;
       ○ focus on getting positive interactions;
       ○ penalized for negative interactions;
       ○ rewarded for recruiter interest.
     ● Two phases:
       ○ Offline - predictions for fixed sets of items and users.
       ○ Online - daily recommendations to variable sets of users.
  3. Data Analysis - Impressions vs Interactions
     ● Impressions: ~97% of the data; they carry little to no information and were discarded.
     ● Interactions: ~3% of the data.
     ● Interactions are divided into:
       ○ positive interactions (types 1, 2 and 3);
       ○ negative interactions (type 4);
       ○ recruiter interest (type 5).
     ● Interactions are treated with an implicit-feedback approach.
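
The grouping of interaction types on this slide can be sketched as follows. This is a minimal illustration, not the team's code: the function name `split_interactions` and the `(user, item, type)` event layout are assumptions, and "implicit" is read here as keeping only whether a user-item pair occurred, not how often.

```python
from collections import defaultdict

# Type groups as listed on the slide.
POSITIVE_TYPES = {1, 2, 3}
NEGATIVE_TYPES = {4}
RECRUITER_TYPES = {5}


def split_interactions(events):
    """Group (user, item, type) events into binary user -> items sets per signal.

    Hypothetical helper: any type outside the three groups (for example
    impressions) is dropped, and repeated events collapse into one entry.
    """
    signals = {
        "positive": defaultdict(set),
        "negative": defaultdict(set),
        "recruiter": defaultdict(set),
    }
    for user, item, itype in events:
        if itype in POSITIVE_TYPES:
            signals["positive"][user].add(item)
        elif itype in NEGATIVE_TYPES:
            signals["negative"][user].add(item)
        elif itype in RECRUITER_TYPES:
            signals["recruiter"][user].add(item)
    return signals


# Toy events, invented for illustration only.
events = [("u1", "i1", 1), ("u1", "i1", 3), ("u1", "i2", 4),
          ("u2", "i3", 5), ("u2", "i4", 0)]
signals = split_interactions(events)
```
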
  4. Local Validation
     ● Split the dataset into training and validation sets.
     ● Random sampling procedure:
       ○ randomly select target items from the dataset;
       ○ remove all interactions with these items;
       ○ pick target users as a subset of those who have interactions with these items.
     ● Preserve the user-item ratio.
     ● No cross-validation: too much data.
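
The sampling procedure above might look like the sketch below. The function name `cold_start_split` and the `user_fraction` parameter are assumptions introduced for illustration; the slide does not say how the user subset is sized beyond preserving the user-item ratio.

```python
import random


def cold_start_split(interactions, n_target_items, user_fraction, seed=0):
    """Split (user, item) pairs into a training set and a cold-start validation set.

    Hypothetical helper: target items lose all their interactions in training,
    and target users are sampled from the users who interacted with them.
    """
    rng = random.Random(seed)
    items = sorted({item for _, item in interactions})
    target_items = set(rng.sample(items, n_target_items))

    # Removing every interaction with a target item makes it cold-start in training.
    train = [(u, i) for u, i in interactions if i not in target_items]
    held_out = [(u, i) for u, i in interactions if i in target_items]

    # Target users are a subset of the users seen on the held-out items.
    candidates = sorted({u for u, _ in held_out})
    n_users = max(1, round(user_fraction * len(candidates)))
    target_users = set(rng.sample(candidates, n_users))
    return train, target_items, target_users, held_out


# Toy data, invented for illustration only.
interactions = [("u1", "a"), ("u2", "a"), ("u1", "b"), ("u3", "c"), ("u4", "c")]
train, target_items, target_users, held_out = cold_start_split(
    interactions, n_target_items=1, user_fraction=0.5, seed=1)
```
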
  5. Solution - Preprocessing
     ● One-hot encoding of both user and item features.
     ● Feature aggregation.
     ● TF-IDF application.
     ● Negative user filtering: removing heavy deleters.
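
A minimal sketch of one-hot encoding followed by TF-IDF weighting, using only the standard library. The names `one_hot` and `tfidf_weights` are hypothetical, and the TF-IDF variant (plain `log(N / df)`) is an assumption, since the slides do not state which formula was used.

```python
import math
from collections import Counter


def one_hot(records):
    """Turn {id: {feature: value or list of values}} into {id: set of 'feature=value' tokens}."""
    encoded = {}
    for rid, feats in records.items():
        tokens = set()
        for name, value in feats.items():
            values = value if isinstance(value, (list, tuple, set)) else [value]
            tokens.update(f"{name}={v}" for v in values)
        encoded[rid] = tokens
    return encoded


def tfidf_weights(encoded):
    """Weight each token by IDF = log(N / df).

    With binary one-hot tokens the term frequency is 1 for every present
    token, so the TF-IDF weight reduces to the IDF term.
    """
    n = len(encoded)
    df = Counter(tok for tokens in encoded.values() for tok in tokens)
    return {rid: {tok: math.log(n / df[tok]) for tok in tokens}
            for rid, tokens in encoded.items()}


# Toy item features, invented for illustration only.
items = {"i1": {"country": "de", "tags": [1, 2]},
         "i2": {"country": "de", "tags": [2, 3]}}
encoded = one_hot(items)
weights = tfidf_weights(encoded)
```

A token shared by every record (here `country=de`) gets weight 0, which is the usual effect of IDF: ubiquitous features stop contributing to similarity.
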
  6. Solution Overview
  7. Solution - Negative Recommendation
     ● The scoring function heavily penalizes negative (type 4) interactions.
     ● Predict type 4 interactions using a CBF approach.
     ● Ensemble these predictions with a negative weight.
  8. Solution - Content-Based Filtering algorithms (CBF)
     Recommend to a user items similar to the ones he/she likes.
     ● Run separately on positive (CBF+) and negative (CBF-) interactions.
     ● Tanimoto similarity between items.
     ● Recommendation performed for filtered users only.
     ● Penalize heavy clickers.
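
The item-item step can be illustrated with the set form of the Tanimoto coefficient, |A ∩ B| / (|A| + |B| - |A ∩ B|). This is a sketch under assumptions: the slides do not show the exact (possibly weighted) formula used, and `cbf_scores` with its summed-similarity scoring is a hypothetical helper.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two feature sets."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 0.0


def cbf_scores(user_items, item_features, target_items):
    """Score each target item by its summed similarity to the user's known items.

    Hypothetical scoring rule: run once on a user's positive items for CBF+
    and once on the negative items for CBF-.
    """
    return {
        target: sum(tanimoto(item_features[target], item_features[known])
                    for known in user_items)
        for target in target_items
    }


# Toy feature sets, invented for illustration only.
features = {"old": {1, 2, 3}, "new_a": {2, 3, 4}, "new_b": {9}}
scores = cbf_scores(["old"], features, ["new_a", "new_b"])
```
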
  9. Solution - Profile Matching (PM)
     Recommend to a user items matching his/her profile.
     ● Cosine similarity between user and item.
     ● Items' tags and titles are compared with users' jobroles.
     ● Recommendation performed for filtered users only.
     ● Unlike CBF, PM can also recommend to cold-start users.
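
Profile matching needs no interaction history, which is why it also covers cold-start users. A minimal sketch with sparse dictionaries follows; `profile_match` and the token-to-weight layout are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {token: weight}."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def profile_match(user_jobroles, item_tokens, top_n=3):
    """Rank items by cosine similarity between the user's jobroles and item tokens.

    Hypothetical helper: item_tokens maps each item to the tokens taken from
    its tags and title. Ties are broken by item id for determinism.
    """
    scored = [(cosine(user_jobroles, toks), item) for item, toks in item_tokens.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(item, score) for score, item in scored[:top_n]]


# Toy profile and items, invented for illustration only.
user = {"101": 1.0, "202": 1.0}
items = {"a": {"101": 1.0, "202": 1.0},
         "b": {"101": 1.0, "303": 1.0},
         "c": {"404": 1.0}}
ranking = profile_match(user, items)
```
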
  10. Solution - Collaborative Filtering algorithms
      ● CF cannot be run directly in a cold-start scenario.
      ● Content-based microclustering approach:
        ○ for each cold-start item, associate the interactions of the top 5 CBF-similar non-cold-start items;
        ○ run standard CF algorithms.
      ● CF algorithms:
        ○ CF with item cosine similarity;
        ○ iALS (Implicit Alternating Least Squares).
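
The microclustering step can be read as borrowing interactions from the most similar warm items, so that a standard CF algorithm has something to work with. The sketch below assumes binary feature sets and the set form of Tanimoto similarity; `microcluster` is a hypothetical name, and how the borrowed interactions were merged in the original system is not stated on the slide.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two feature sets."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 0.0


def microcluster(cold_features, warm_features, warm_interactions, k=5):
    """Give each cold-start item the users of its k most similar warm items.

    Returns {cold_item: set of users}; these pseudo-interactions can then be
    fed to an ordinary CF algorithm. Ties are broken by item id.
    """
    borrowed = {}
    for cold, feats in cold_features.items():
        ranked = sorted(warm_features,
                        key=lambda w: (-tanimoto(feats, warm_features[w]), w))
        users = set()
        for warm in ranked[:k]:
            users |= warm_interactions.get(warm, set())
        borrowed[cold] = users
    return borrowed


# Toy data, invented for illustration only.
cold = {"c1": {1, 2}}
warm = {"w1": {1, 2}, "w2": {1, 3}, "w3": {7}}
interactions = {"w1": {"u1"}, "w2": {"u2"}, "w3": {"u3"}}
pseudo = microcluster(cold, warm, interactions, k=2)
```
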
  11. Solution - Ensemble Structure
      ● Divide algorithms by nature.
      ● Normalize and weight each layer.
      ● Generate upper layers by adding lower layers.
      ● Output the 100 best scores.
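
A single-level version of the normalize, weight, add and cut steps is sketched below. It is an illustration under assumptions: the slides describe a layered structure and do not name the normalization, so max-absolute scaling and the function names `normalize` and `ensemble` are choices made here. The negative weight on the CBF- layer mirrors the negative-recommendation idea from slide 7.

```python
def normalize(scores):
    """Scale a score dict by its largest absolute value (assumed normalization)."""
    peak = max((abs(s) for s in scores.values()), default=0.0)
    return {k: s / peak for k, s in scores.items()} if peak else dict(scores)


def ensemble(layers, weights, top_n=100):
    """Combine per-algorithm item scores for one user into a ranked top-N list.

    layers: {algorithm: {item: score}}, weights: {algorithm: weight}.
    Each layer is normalized, multiplied by its weight, and summed per item.
    """
    combined = {}
    for name, scores in layers.items():
        for item, score in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + weights[name] * score
    ranked = sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


# Toy scores and weights, invented for illustration only.
layers = {"cbf_pos": {"a": 4.0, "b": 2.0},
          "cbf_neg": {"b": 1.0},
          "pm": {"c": 10.0}}
weights = {"cbf_pos": 1.0, "cbf_neg": -0.5, "pm": 0.3}
top = ensemble(layers, weights, top_n=2)
```

In the toy data, item `b` scores well on CBF+ but is also predicted as a likely deletion, so the negative weight pushes it out of the top 2.
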
  12. Solution - Parameter Tuning
      ● Ensemble tuning:
        ○ 9 weights (one for each block), reduced to 6 due to normalization;
        ○ non-differentiable scoring function;
        ○ gradient-free optimization methods:
          ■ Genetic Algorithms - quick and acceptable results;
          ■ Powell's Conjugate Direction method - slower but superior results.
      ● Individual algorithm tuning:
        ○ greedy search on the local test set.
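
Because the challenge score is not differentiable, the weights have to be tuned by methods that only evaluate the score. The following is a minimal genetic-algorithm sketch, not the team's tuner: `genetic_search`, its parameters, and the toy score function are all assumptions. Powell's method is not shown here.

```python
import random


def genetic_search(score_fn, n_weights, pop_size=20, generations=30, sigma=0.2, seed=0):
    """Maximize score_fn over weight vectors with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the better half survives unchanged, so the best score never drops.
        population.sort(key=score_fn, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover plus Gaussian mutation on every gene.
            children.append([rng.choice(pair) + rng.gauss(0, sigma)
                             for pair in zip(a, b)])
        population = parents + children
    return max(population, key=score_fn)


def toy_score(weights):
    """Stand-in for the leaderboard score: piecewise constant, hence non-differentiable."""
    target = (0.5, -0.3)
    return -sum(abs(round(w, 1) - t) for w, t in zip(weights, target))


best = genetic_search(toy_score, n_weights=2)
```

In practice `score_fn` would run the ensemble with the candidate weights on the local validation split and return the resulting score, which makes each evaluation expensive and favors small populations.
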
  13. Online - Changes to ensemble
      ● Normalization type.
      ● Cutting for each user before items.
      ● Excluding slower algorithms: pushing recommendations promptly gives more exposure, hence better scores.
  14. Architecture & Runtime
      ● The recommender runs on VMs with 8 cores and 16 GB RAM.
      ● The only exceptions are content-based microclustering and iALS, which run on 8 cores and 64 GB RAM.
      ● Code is heavily optimized to use memory efficiently (sparse matrix representations, efficient matrix operations).
      ● This keeps runtimes short.
  15. Scores - Local vs Offline

      Algorithm       Local score   Leaderboard score   Execution time
      CBF+                  57852               60257           13 min
      CBF-                  -1330               -8529            4 min
      PM                    17260               16777            7 min
      CF                    42213               39250           12 min
      iALS                  48081               52411          150 min
      XING Baseline         14742               14395           40 min
      Ensemble              60625               71372            2 min
  16. Results and Conclusions
      ● 2nd place in the online phase;
      ● 1st place in the offline phase.
      ● Points of strength:
        ○ speed (in particular offline ~20 min);
        ○ ease of implementation.
      ● Extensions:
        ○ feature weighting (user personalized, feature interaction);
        ○ time decay models.