Recommendation as Classification

  1. Recommendation as Classification • Max Lin • @m4xl1n • NYC Predictive Analytics Meetup • March 2011
  2. R Recommendation Engine Competition • Kaggle.com • http://www.kaggle.com/R • “Record Me Men” placed 2nd with AUC 0.9832 • < 0.9881, > 0.9812
  3. Recommendation as Classification • Input: (User, Package) • Output: Recommend the package or not • Recommend ≈ Package is installed by User
  4. Classification • Features • Classifier training algorithms • Training: Minimize loss + regularizers: $J(\theta) = \sum_i L(y_i, f(u_i, p_i; \theta)) + \lambda R(\theta)$ • Stochastic gradient descent • Choose parameters by cross validation
  5. Classification Models • Model 1: Baseline • Model 2: Latent factor models • Model 3: Package LDA topic • Model 4: Package task view • Ensemble Learning
  6. M1: Baseline • Provided by the contest organizer • Strong baseline: AUC of ~0.94 • 7 package features + user factors • Logistic Regression
  7. M2: Factor Models • Features: user factors, package factors, latent user and package factors • Classifier: $f(u, p) = \mu + \mu_u + \mu_p + \beta_u^T \beta_p$ • Minimize exponential loss + L2 regularizers
  8. Model Expressiveness
  9. M3: Package LDA topic • Features: user factors, package factors, package LDA topics • Classifier: Similar to M2: $f(u, p) = \mu + \mu_u + \mu_p + t_u$
  10. M4: Package task view • Features: user factors, package factors, package task views • e.g., high-performance computing • Classifier: Similar to M3: $f(u, p) = \mu + \mu_u + \mu_p + t_u$
  11. Ensemble Learning • Combine predictions from individual models • Logistic Regression
  12. Code & More • GitHub: https://github.com/m4xl1n • Python + R • Blog post: http://bit.ly/hWmQyM
  13. Lessons • Features, Features, Features • User factors, package factors • Data cleaning • Domain knowledge
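
The training recipe on slide 4 and the factor model on slide 7 (M2) can be illustrated with a minimal sketch. It is not the author's competition code: the function names, the latent dimension `k`, the learning rate, the regularization weight `lam`, the exponent clipping (added here only for numerical stability), and the synthetic data are all assumptions made for illustration. Labels are assumed to be +1 (installed) or -1 (not installed), and the loss is the exponential loss exp(-y f) with L2 regularizers on the user, package, and latent factors.

```python
import numpy as np

def train_factor_model(users, packages, labels, n_users, n_packages,
                       k=5, lr=0.02, lam=0.01, epochs=30, seed=0):
    """SGD on exponential loss exp(-y * f) plus L2 penalties (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mu = 0.0                                   # global offset
    mu_u = np.zeros(n_users)                   # per-user factor
    mu_p = np.zeros(n_packages)                # per-package factor
    beta_u = 0.1 * rng.standard_normal((n_users, k))     # latent user factors
    beta_p = 0.1 * rng.standard_normal((n_packages, k))  # latent package factors
    for _ in range(epochs):
        for i in rng.permutation(len(labels)):
            u, p, y = users[i], packages[i], labels[i]
            f = mu + mu_u[u] + mu_p[p] + beta_u[u] @ beta_p[p]
            # dL/df for L = exp(-y f); the exponent is clipped (an assumption)
            # so a badly misclassified example cannot produce a huge step.
            g = -y * np.exp(np.clip(-y * f, -3.0, 3.0))
            mu -= lr * g
            mu_u[u] -= lr * (g + lam * mu_u[u])
            mu_p[p] -= lr * (g + lam * mu_p[p])
            bu_old = beta_u[u].copy()          # use pre-update value for beta_p's gradient
            beta_u[u] -= lr * (g * beta_p[p] + lam * beta_u[u])
            beta_p[p] -= lr * (g * bu_old + lam * beta_p[p])
    return mu, mu_u, mu_p, beta_u, beta_p

def predict(params, users, packages):
    """Score f(u, p) = mu + mu_u + mu_p + beta_u^T beta_p for arrays of pairs."""
    mu, mu_u, mu_p, beta_u, beta_p = params
    interaction = np.sum(beta_u[users] * beta_p[packages], axis=1)
    return mu + mu_u[users] + mu_p[packages] + interaction

def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Synthetic data (an assumption, not the contest data): a user "installs"
# a package when the sum of a hidden user bias and package bias is positive.
rng = np.random.default_rng(42)
n_users, n_packages = 20, 15
user_bias = rng.standard_normal(n_users)
package_bias = rng.standard_normal(n_packages)
uu, pp = np.meshgrid(np.arange(n_users), np.arange(n_packages), indexing="ij")
users, packages = uu.ravel(), pp.ravel()
labels = np.where(user_bias[users] + package_bias[packages] > 0, 1, -1)

params = train_factor_model(users, packages, labels, n_users, n_packages)
train_auc = auc(predict(params, users, packages), labels)
print(f"training AUC: {train_auc:.3f}")
```

The slides choose parameters by cross validation; in this sketch that would mean looping over candidate values of `k`, `lr`, and `lam` and scoring each on held-out (user, package) pairs with `auc`, rather than on the training pairs as done above.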
