Company Recommendation for New Graduates via Implicit Feedback Multiple Matrix Factorization with Bayesian Optimization


Slides from Kazama's talk at IEEE BigData 2016, 2016/12/07.


  1. Company Recommendation for New Graduates via Implicit Feedback Multiple Matrix Factorization with Bayesian Optimization • IEEE Big Data 2016, Washington, D.C. • Masahiro Kazama (1), Issei Sato (2), Haruaki Yatabe (3), Tairiku Ogihara (3), Tetsuro Onishi (3), Hiroshi Nakagawa (2) • (1) Recruit Technologies Co., Ltd. (2) University of Tokyo (3) Recruit Career Co., Ltd.
  2. Outline • Problem Settings • Data Description • Proposed Method • Experiments • Results • Conclusion
  3. Problem Setting • Job hunting by Japanese students follows an unusual pattern • The start date for job hunting is fixed • All students apply at the same time • Example: schedule for students graduating in 2015: start of job hunting activities on Dec 1, 2013; start of interviews on April 1, 2014; graduation and joining the company on April 1, 2015
  4. Problem Setting • Students have to send application sheets to many companies to get a job offer • Many students spend a great deal of time on job hunting, which is a major social problem in Japan • Many students apply to the popular companies first, but competition for those companies is high, so they often fail to get an offer
  5. Popularity bias • Browsing concentrates on a few companies • [Figure: number of students per company, with companies ordered by popularity; high-browsed companies (top 20%) versus low-browsed companies (bottom 80%)]
  6. Problem Setting • It is important to find a suitable company for each student at an early stage of job hunting • It is important to consider low-browsed companies as well as high-browsed ones
  7. Solutions • We recommend suitable companies to students at an early stage • We focus on low-browsed companies
  8. Data • Our company (Recruit Co., Ltd.) provides a job recruiting service • Almost all students use our service • We have three types of data: (1) browsing data, (2) entry data, (3) student/company information
  9. Browsing data • Browsing data of students on our recruiting service • Used for training our model • Period: 2013/12/1 to 2014/3/31
  10. Entry data • Entry data of students on our recruiting service • Used for evaluating our model • Period: 2013/12/1 to 2014/3/31
  11. Browsing (click) data: number of clicks by each student (rows j) on each company (columns i)

           i1  i2  i3  i4
       j1   0   4   0  21
       j2  71  31   0  18
       j3   3   1   2   0
  12. Entry data: 1 if the student (rows j) entered the company (columns i), 0 otherwise

           i1  i2  i3  i4
       j1   0   1   0   0
       j2   0   1   0   1
       j3   1   0   1   0
  13. Student/Company info • Student: faculty, department, etc. • Company: industry type, location, number of employees
  14. Overview • Purpose: using browsing data and student/company information, we recommend suitable companies to students, with a focus on low-browsed companies • Solution: using browsing data → implicit feedback recommendation • Low-browsed item recommendation → dealing with popularity bias • Hyperparameter search → Bayesian optimization
  15. Explicit vs. implicit feedback • Explicit feedback: data the user gives explicitly, e.g. Amazon 5-star ratings; pro: good quality; con: difficult to get • Implicit feedback: user action data from which preference is inferred, e.g. click logs; pros: easy to get, plentiful; cons: noisy, subject to popularity bias
  16. Popularity bias • Browsing concentrates on a few companies → high-browsed companies are more likely to be recommended • [Figure: number of students per company, with companies ordered by popularity; high-browsed companies (top 20%) versus low-browsed companies (bottom 80%), which are the ones we want to recommend]
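  17. Implicit feedback matrix factorization • Reference: Yifan Hu, Yehuda Koren, Chris Volinsky, "Collaborative Filtering for Implicit Feedback Datasets" (2008) • $r_{ui}$: number of clicks • Preference: $p_{ui} = 1$ if $r_{ui} > 0$, and $p_{ui} = 0$ if $r_{ui} = 0$ • Confidence (as defined by Hu et al.): $c_{ui} = 1 + \alpha r_{ui}$ • Browsing data example: a sparse click matrix over companies i1 to i3 and students j1 to j3 with non-zero entries 41, 2, 24, 3, 51 (cell positions were lost in extraction)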
  18. Problem • High-browsed companies are more likely to be recommended • [Figure: number of clicks for companies i1 to i12; the most-clicked companies are likely to be recommended, while the low-browsed companies are the ones we want to recommend]
  19. Proposed method • [Formula lost in extraction] • The popularity of company i is the number of users who browsed it • The confidence c is larger when the company has fewer clicks → low-browsed companies become more likely to be recommended
  20. Proposed method with side information • Student information • Company information
  21. Hyperparameter search • Hyperparameters: browsing weights α, β and regularization terms λ1, λ2, λ3 • When the number of hyperparameters is large, grid search does not work well • We use Bayesian optimization for the hyperparameter search
  22. Bayesian optimization (Mockus, 1978) • Optimization of a black-box function y = f(x) • A Gaussian process is assumed as the distribution over f(x) • It suggests the next hyperparameter setting to evaluate • x: hyperparameters α, β, λ1, λ2, λ3 • f(x): recall • We want to find the hyperparameters that maximize recall
  23. Data and evaluation • Metric: Recall@100 (low-browsed companies) • Example (the positions of the four ◯ entry marks were lost in extraction):

       Company   c01 c02 c03 c04 c05 c06 c07 c08 c09 c10
       Browsing   10  20   1   8   5  10   3   7  23  13

     • Data split: 60% training set for matrix factorization, 20% validation set for Bayesian optimization (BO), 20% evaluation set
  24. Results • [Figure: bar chart of recall, axis from 0 to 0.5, for BO + Hu et al., BO + Fang et al., Proposed, and Proposed with side information] • The proposed models achieve better recall
  25. Trials of Bayesian optimization • Recall improves as the number of trials increases → we can find better hyperparameters
  26. Conclusions • We built a recommendation system that mitigates popularity bias • Using side information improved recommendation performance for low-browsed companies • Hyperparameters were optimized using Bayesian optimization
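
The implicit feedback slide maps raw click counts to a binary preference and a confidence weight, following Hu et al. (2008): the preference is 1 when a student clicked a company at least once, and the confidence grows linearly with the click count. The sketch below applies that mapping to the click matrix from the browsing data slide. The function name and the value `alpha=1.0` are illustrative assumptions, not values from the talk.

```python
def preference_and_confidence(clicks, alpha=1.0):
    """Map click counts r_ui to (p_ui, c_ui) in the style of Hu et al. (2008).

    clicks: list of rows, one row per student, one column per company.
    alpha:  weight on the click count (illustrative value, an assumption).
    """
    # p_ui = 1 if the student clicked the company at least once, else 0
    preference = [[1 if r > 0 else 0 for r in row] for row in clicks]
    # c_ui = 1 + alpha * r_ui
    confidence = [[1.0 + alpha * r for r in row] for row in clicks]
    return preference, confidence


# Click matrix from the browsing data slide: rows j1..j3, columns i1..i4
clicks = [
    [0, 4, 0, 21],
    [71, 31, 0, 18],
    [3, 1, 2, 0],
]
preference, confidence = preference_and_confidence(clicks, alpha=1.0)
```

Unclicked cells keep a confidence of 1 rather than 0, so they still contribute weakly to the factorization as negative evidence.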
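
The proposed-method slide states only that the confidence should be larger for companies with fewer clicks, where a company's popularity is the number of users who browsed it; the formula itself did not survive in this transcript. The sketch below is therefore one hypothetical way to get that behaviour, dividing the click term by popularity raised to a power `beta`. This functional form, the function names, and the default values are assumptions for illustration and should not be read as the authors' formula.

```python
def company_popularity(clicks):
    """Number of students with at least one click on each company."""
    n_companies = len(clicks[0])
    return [sum(1 for row in clicks if row[i] > 0) for i in range(n_companies)]


def popularity_aware_confidence(clicks, alpha=1.0, beta=1.0):
    """Hypothetical confidence: c_ui = 1 + alpha * r_ui / n_i ** beta.

    n_i is the popularity of company i. This form is an assumption chosen
    so that c grows when a company is browsed by fewer students.
    """
    popularity = company_popularity(clicks)
    return [
        # max(n, 1) avoids dividing by zero for companies nobody browsed
        [1.0 + alpha * r / (max(n, 1) ** beta) for r, n in zip(row, popularity)]
        for row in clicks
    ]


# Click matrix from the browsing data slide: rows j1..j3, columns i1..i4
clicks = [
    [0, 4, 0, 21],
    [71, 31, 0, 18],
    [3, 1, 2, 0],
]
popularity = company_popularity(clicks)
weights = popularity_aware_confidence(clicks, alpha=1.0, beta=1.0)
```

Under this assumed form, setting `beta=0` recovers the plain confidence of Hu et al., so `beta` controls how strongly popular companies are down-weighted.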
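
The evaluation slide names Recall@100 on low-browsed companies as the metric but does not spell out how it is computed. A minimal sketch of one plausible reading follows: take the bottom 80% of companies by browsing count as the eligible set, rank only those, and measure what share of the student's entries among them appear in the top k. The browsing counts are the ones shown on the slide; the entered set and the ranking are hypothetical, because the entry marks were lost in extraction.

```python
def low_browsed_companies(browse_counts, bottom_fraction=0.8):
    """Ids of the least-browsed companies (bottom 80% by default)."""
    # Sort ascending by count, breaking ties by id for determinism
    ranked = sorted(browse_counts, key=lambda c: (browse_counts[c], c))
    cutoff = int(len(ranked) * bottom_fraction)
    return set(ranked[:cutoff])


def recall_at_k(ranked_companies, entered, eligible, k=100):
    """Recall@k restricted to eligible (low-browsed) companies."""
    relevant = entered & eligible
    if not relevant:
        return None  # undefined when the student entered no eligible company
    top_k = [c for c in ranked_companies if c in eligible][:k]
    return len(relevant.intersection(top_k)) / len(relevant)


# Browsing counts from the evaluation slide
browse_counts = {
    "c01": 10, "c02": 20, "c03": 1, "c04": 8, "c05": 5,
    "c06": 10, "c07": 3, "c08": 7, "c09": 23, "c10": 13,
}
low = low_browsed_companies(browse_counts)

# Hypothetical entries and model ranking, for illustration only
entered = {"c02", "c03", "c05", "c08"}
ranking = ["c09", "c03", "c02", "c08", "c01", "c05"]
recall = recall_at_k(ranking, entered, low, k=2)
```

In this toy case the two most-browsed companies, c02 and c09, are excluded, so the entry to c02 does not count toward recall.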
