Challenge statement
Our Solution
What could we do better?
RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting
Andrzej Pacuk, Piotr Sankowski, Karol Węgrzycki,
Adam Witkowski, Piotr Wygocki
apacuk@mimuw.edu.pl
University of Warsaw
RecSys Challenge 2016
mim-solutions.pl RecSys Challenge 2016
Outline
1 Challenge statement
2 Our Solution
Candidate items selection
Learning probabilities
Features
3 What could we do better?
Problem
Xing.com dataset:
user profiles (experience, education, current job’s roles, etc.),
job (item) offer description (title, tags, employment type, etc.),
past recommendations (impressions),
user positive (clicking, bookmarking, replying) and negative
(deleting) interactions with items.
Task: predict user’s positive interactions.
Evaluation
Secret ground truth (GT): positive interactions from test week.
Mean average precision-like (MAP) measure.
Online evaluation.
Finished 2nd!
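The slides only describe the measure as MAP-like, so the official challenge measure is not reproduced here. As a rough illustration of the family of measures, a minimal sketch of plain mean average precision truncated at k could look as follows. The function names, the data layout and the default k = 30 (borrowed from the "take top 30" step) are assumptions.

```python
def average_precision_at_k(recommended, relevant, k=30):
    """Average precision of a ranked list, truncated at k items.

    Assumes `recommended` contains no duplicate items.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            # Precision at the position of each relevant item.
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k)


def mean_average_precision(recommendations, ground_truth, k=30):
    """Mean of average_precision_at_k over all users in ground_truth."""
    scores = [
        average_precision_at_k(recommendations.get(user, []), items, k)
        for user, items in ground_truth.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

A user without any recommendation contributes 0 to the mean, which is one possible convention among several.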
Solution’s schema
user → select candidates → predict probabilities → sort → take top 30
select candidates: job #1, job #2, job #3, ..., job #N
predict probabilities: job #1: 0.3, job #2: 0.7, job #3: 0.4, ..., job #N: 0.5
sort: job #15: 0.9, job #34: 0.89, ..., job #124: 0.75
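The schema can be read as a two-stage pipeline. The sketch below is only an illustration of that flow: `select_candidates` and `predict_probability` are hypothetical callables standing in for the candidate selection and the trained model, and the toy scores are made up.

```python
from typing import Callable, Iterable, List


def recommend(
    user: str,
    select_candidates: Callable[[str], Iterable[str]],
    predict_probability: Callable[[str, str], float],
    top_n: int = 30,
) -> List[str]:
    """Return the top_n candidate items for a user, most probable first."""
    # De-duplicate candidates while keeping their original order.
    candidates = list(dict.fromkeys(select_candidates(user)))
    scored = [(predict_probability(user, item), item) for item in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_n]]


# Toy stand-ins for the two stages (hypothetical data).
toy_scores = {"job #1": 0.3, "job #2": 0.7, "job #3": 0.4}
ranking = recommend(
    "user",
    select_candidates=lambda u: toy_scores.keys(),
    predict_probability=lambda u, i: toy_scores[i],
    top_n=2,
)
```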
Training set
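Training GT: positive interactions from the last week of the available data.
This gives a local score that can be computed offline.
Candidates and features are computed separately for the training set and for the full dataset!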
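A minimal sketch of such a split, assuming interactions are stored as `(user, item, kind, timestamp)` tuples. The tuple layout and the kind labels `click`, `bookmark`, `reply` are hypothetical names for the positive interaction types listed in the problem statement.

```python
from datetime import datetime, timedelta

# Hypothetical labels for the positive interaction types.
POSITIVE = {"click", "bookmark", "reply"}


def split_last_week(interactions):
    """Split interactions into history and a ground truth for the last week.

    interactions: list of (user, item, kind, timestamp) tuples.
    Returns (history, training_gt), where training_gt maps each user to the
    set of items with a positive interaction in the final 7 days.
    """
    cutoff = max(ts for _, _, _, ts in interactions) - timedelta(days=7)
    history = [row for row in interactions if row[3] <= cutoff]
    training_gt = {}
    for user, item, kind, ts in interactions:
        if ts > cutoff and kind in POSITIVE:
            training_gt.setdefault(user, set()).add(item)
    return history, training_gt
```

Candidates and features for training would then be derived from `history` only, so that nothing from the held-out week leaks into the model inputs.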
Candidates
Candidate: an item i with high P[i ∈ GT(u)].
20 categories of candidates.
Ranking: e.g. sort interactions by timestamp.
∼300 candidates per user (0.1% of all items).
Candidates cover 37% of the training GT.
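The slides do not define the cover precisely. One plausible reading, sketched below as an assumption, is the fraction of ground-truth (user, item) pairs that appear among the selected candidates. The function name and the dictionary layout are hypothetical.

```python
def candidate_coverage(candidates, ground_truth):
    """Fraction of ground-truth (user, item) pairs found among the candidates.

    candidates:   dict user -> iterable of candidate items
    ground_truth: dict user -> iterable of items the user interacted with
    """
    total = sum(len(set(items)) for items in ground_truth.values())
    if total == 0:
        return 0.0
    covered = sum(
        len(set(items) & set(candidates.get(user, ())))
        for user, items in ground_truth.items()
    )
    return covered / total
```

Under this reading, the coverage is an upper bound on the recall the ranking stage can reach, since an item that is never selected can never be recommended.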
Candidates categories
User’s interactions (Int(u)), sorted by week and by event count within the week,
similarly for impressions (Imp(u)),
Int(u') for users u' sorted by Jaccard(Int(u), Int(u')).
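The last category can be sketched as follows: rank other users by the Jaccard similarity of their interaction sets and collect their items in that order. Skipping items the user already interacted with, dropping users with zero similarity, and the `limit` parameter are assumptions of this sketch, not details taken from the slides.

```python
def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidates_from_similar_users(user, interactions, limit=10):
    """Items of the most similar users, most similar user first.

    interactions: dict user -> set of items, i.e. Int(u).
    """
    own = interactions.get(user, set())
    scored = [
        (jaccard(own, items), other)
        for other, items in interactions.items()
        if other != user
    ]
    # Keep only users sharing at least one item; break ties by user id.
    scored = sorted((s, o) for s, o in scored if s > 0)
    scored.sort(key=lambda pair: -pair[0])
    result = []
    for _, other in scored:
        for item in sorted(interactions[other]):
            if item not in own and item not in result:
                result.append(item)
                if len(result) == limit:
                    return result
    return result
```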
Candidates (cold start)
items i sorted by max_{i' ∈ Int(u)} |tags(i) ∩ tags(i')|,
items i sorted by |jobroles(u) ∩ tags(i)|,
globally most popular items.
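The two tag-based rankings above can be sketched as below. The data layout (a dictionary from item to its tag set) and the choice to skip items the user already interacted with are assumptions made for illustration.

```python
def rank_by_tag_overlap(user_items, item_tags, limit=10):
    """Rank items i by the max over i' in Int(u) of |tags(i) ∩ tags(i')|.

    user_items: set of items the user interacted with, i.e. Int(u)
    item_tags:  dict item -> set of tags
    """
    seen_tags = [item_tags[i] for i in user_items if i in item_tags]
    scores = {}
    for item, tags in item_tags.items():
        if item in user_items:
            continue  # assumption: already-seen items are handled elsewhere
        scores[item] = max((len(tags & t) for t in seen_tags), default=0)
    # Highest overlap first; ties broken by item id for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:limit]


def rank_by_jobroles(jobroles, item_tags, limit=10):
    """Rank items i by |jobroles(u) ∩ tags(i)|, usable without any history."""
    scores = {item: len(jobroles & tags) for item, tags in item_tags.items()}
    return sorted(scores, key=lambda i: (-scores[i], i))[:limit]
```

The second function needs only the user profile, which is what makes it suitable for users with no interactions at all.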
Candidate ranking
XGBoost (Gradient Boosting Decision Trees).
Optimizing logloss.
Training file from preselected candidates:
all positive,
sampled negative.
Reaches 77.5% of the score of a perfect ranking of the candidates.
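A minimal sketch of how such a training file could be assembled, together with the logloss objective, using only the standard library. The row layout `(user, item, label)`, the function names and the seed are assumptions; the default of 5 negatives per user follows the figure mentioned under possible improvements. The rows would then be turned into feature vectors and passed to a gradient boosting library such as XGBoost, which is not shown here.

```python
import math
import random


def build_training_rows(candidates, ground_truth, negatives_per_user=5, seed=0):
    """Return (user, item, label) rows: all positives, a sample of negatives.

    candidates:   dict user -> list of preselected candidate items
    ground_truth: dict user -> set of items with a positive interaction
    """
    rng = random.Random(seed)
    rows = []
    for user in sorted(candidates):
        positives = set(ground_truth.get(user, ()))
        pos = [i for i in candidates[user] if i in positives]
        neg = [i for i in candidates[user] if i not in positives]
        sampled = rng.sample(neg, min(negatives_per_user, len(neg)))
        rows += [(user, i, 1) for i in pos]
        rows += [(user, i, 0) for i in sampled]
    return rows


def logloss(labels, probabilities, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predictions."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)
```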
Features
A feature maps a (user, item) pair to a real number.
12 groups.
273 features in total.
Worked well with:
highly correlated features,
null values,
no scaling/normalization.
Feature definitions (sample)
Event based item: percentage of Int(u) having the same property (e.g., employment) as item i.
Most similar user who clicked item: max_{u' ∈ Users(i)} Jaccard(Int(u), Int(u')).
Most similar item clicked by user: max_{i' ∈ Int(u)} Jaccard(Users(i), Users(i')).
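The three sample features can be sketched as plain functions of a (user, item) pair. The dictionary layouts, the function names, and the decision to exclude the user and the item themselves from the maxima are assumptions of this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_property_share(user_items, item, item_property):
    """Share of Int(u) whose property value equals that of the given item."""
    target = item_property.get(item)
    if not user_items or target is None:
        return 0.0
    same = sum(1 for i in user_items if item_property.get(i) == target)
    return same / len(user_items)


def most_similar_user_who_clicked(user, item, interactions, users_of):
    """Max over u' in Users(i) of Jaccard(Int(u), Int(u'))."""
    own = interactions.get(user, set())
    return max(
        (jaccard(own, interactions.get(other, set()))
         for other in users_of.get(item, set()) if other != user),
        default=0.0,
    )


def most_similar_item_clicked(user, item, interactions, users_of):
    """Max over i' in Int(u) of Jaccard(Users(i), Users(i'))."""
    target = users_of.get(item, set())
    return max(
        (jaccard(target, users_of.get(other, set()))
         for other in interactions.get(user, set()) if other != item),
        default=0.0,
    )
```

Here `interactions` maps a user to Int(u) and `users_of` maps an item to Users(i).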
Top feature groups
feature group                            fscore
event based user (item) profile          41%
    tags + title                          7%
item global popularity                   22%
    trend                                10%
    weekday                               4%
most similar                             10%
    item clicked by user                  6%
    user who clicked item                 4%
user total events                         8%
    in last week                          4%
seconds from last user activity           7%
max common tags with clicked item         4%
Possible improvements
Training file:
8x bigger,
sample 1/4 of the negative candidates (instead of 5 random ones) per user,
score: +6.5k.
Ensembling models.
Layer scores:
Candidate selection: 37%.
Candidate ranking: 77.5%.
Thank you
apacuk@mimuw.edu.pl
mim-solutions.pl