Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.
From Idea to
Execution: Spotify’s
Discover Weekly
Chris Johnson :: @MrChrisJohnson
Edward Newett :: @scaladaze
DataEngConf...
Who are We??
Chris Johnson Edward Newett
Spotify in Numbers
• Started in 2006, now available in 58 markets
• 75+ Million active users, 20 Million paying subscriber...
Challenge: 30M songs… how do we recommend
music to users?
Discover
Radio
Related Artists
Discover Weekly
• Started in 2006, now available in 58 markets
• 75+ Million active users, 20 Million paying subscribers
•...
The Road to
Discover Weekly
2013 :: Discover Page v1.0
• Personalized News Feed of
recommendations
• Artists, Album Reviews, News
Articles, New Releas...
2014 :: Discover Page v2.0
• Recommendations grouped into
strips (a la Netflix)
• Limited to Albums and New
Releases
• Mor...
Insight: users spending more time on
editorial Browse playlists than Discover.
Idea: combine the
personalized experience
of Discover with the lean-
back ease of Browse
Meanwhile… 2014 Year In Music
Play it forward: Same content as the
Discover Page but.. a playlist
Lesson 1:
Be data driven from
start to finish
Slide from Dan McKinley - Etsy
2008 2012 2015
• Reach: How many users are you reaching
• Depth: For the users you reach, what is the
depth of reach.
• Retention: For th...
• Reach: DW WAU / Spotify WAU
• Depth: DW Time Spent / Spotify WAU
• Retention: DW week-over-week retention
Discover Weekl...
2008 2012 2015
Slide from Dan McKinley - Etsy
Step 1: Prototype (employee test)
Step 1: Prototype (employee test)
Results of Employee Test were very positive!
2008 2012 2015
Slide from Dan McKinley - Etsy
Step 2: Release AB Test to 1% of Users
Google Form 1% Results
Personalized image resulted in 10% lift in WAU
• Initial 0.5% user test
• 1% Spaceman image
• 1% Personalized
image
Lesson 2:
Reuse existing
infrastructure in creative
ways
Discover Weekly Data Flow
Recommendation
Models
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
•Aggregate all (user, trac...
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
•Aggregate all (user, trac...
NLP Models on News and Blogs
Playlist itself is a
document
Songs in
playlist are
words
NLP Models work great on Playlists!
[3] http://benanne.github.io/2014/08/05/spotify-cnns.html
Deep Learning on Audio
•normalized item-vectors
Songs in a Latent Space representation
•user-vector in same space
Songs in a Latent Space representation
Lesson 3:
Don’t scale until
you need to
Scaling to 100%: Rollout Challenges
‣Create and publish 75M playlists every week
‣Downloading and processing Facebook imag...
Scaling to 100%: Weekly refresh
‣Time sensitive updates
‣Refresh 75M playlists every Sunday night
‣Take timezones into acc...
Discover Weekly publishing flow
What’s next?
Iterating on content
quality and interface
enhancements
Iterating on quality and adding a feedback loop.
DW feedback comes at the expense of presentation bias.
Lesson 4:
Users know best. In the
end, AB Test everything!
Lesson 5 (final lesson!):
Empower bottom-up
innovation in your org and
amazing things will happen.
Thank You!
(btw, we’re hiring Machine Learning and
Data Engineers, come chat with us!)
From Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover Weekly
Nächste SlideShare
Wird geladen in …5
×

From Idea to Execution: Spotify's Discover Weekly

234.664 Aufrufe

Veröffentlicht am

Discover Weekly is a personalized mixtape of 30 highly personalized songs that's curated and delivered to Spotify's 75M active users every Monday. It's received high acclaim in the press and reached 1B streams within its first 10 weeks. In this slide deck we dive into the narrative of how Discover Weekly came to be, highlighting technical challenges, data driven development, and the Machine Learning models used to power our recommendations engine.

Veröffentlicht in: Software, Technologie
  • Are you literally FEEDING your diabetes putting this one "health" food on your dinner plate? This is important. You must stop eating this food today or you could be doubling the speed at which your diabetes progresses... ★★★ http://ishbv.com/bloodsug/pdf
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier
  • Girls for sex in your area are there: tinyurl.com/areahotsex
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier
  • If there's a nicer guy in the tipping industry I'm yet to meet them. Glad you let me buy you a pint the other week too. Absolutely the least I could have done. Looking forward to turning this up a notch once I hit £20k banked.  https://bit.ly/30jWepO
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier
  • What if you had a printing press that could spit out hundred dollar bills on demand? Do you think that would change your life? ★★★ http://scamcb.com/ezpayjobs/pdf
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier
  • I am making a good MONEY (500$ to 700$ / hr )online on my Ipad .Last month my pay check of nearly 30 k$.This online work is like draw straight-arrow and earn money.Do not go to office.I do not claim to be others,I just work.You will call yourself after doing this JOB,It's a REAL job.Will be very lucky to refer to this WEBSITE.I hope,you can find something,Simply go to the below SITE. ➤➤ http://t.cn/AisJWUCf
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier

From Idea to Execution: Spotify's Discover Weekly

  1. From Idea to Execution: Spotify’s Discover Weekly Chris Johnson :: @MrChrisJohnson Edward Newett :: @scaladaze DataEngConf • NYC • Nov 2015 Or: 5 lessons in building recommendation products at scale
  2. Who are We?? Chris Johnson Edward Newett
  3. Spotify in Numbers • Started in 2006, now available in 58 markets • 75+ Million active users, 20 Million paying subscribers • 30+ Million songs, 20,000 new songs added per day • 1.5 Billion user generated playlists • 1 TB user data logged per day • 1,700 node Hadoop cluster • 10,000+ Hadoop jobs run daily
  4. Challenge: 30M songs… how do we recommend music to users?
  5. Discover
  6. Radio
  7. Related Artists
  8. Discover Weekly • Started in 2006, now available in 58 markets • 75+ Million active users, 20 Million paying subscribers • 30+ Million songs, 20,000 new songs added per day • 1.5 Billion user generated playlists • 1 TB user data logged per day • 1,700 node Hadoop cluster • 10,000+ Hadoop jobs run daily
  9. The Road to Discover Weekly
  10. 2013 :: Discover Page v1.0 • Personalized News Feed of recommendations • Artists, Album Reviews, News Articles, New Releases, Upcoming Concerts, Social Recommendations, Playlists… • Required a lot of attention and digging to engage with recommendations • No organization of content
  11. 2014 :: Discover Page v2.0 • Recommendations grouped into strips (a la Netflix) • Limited to Albums and New Releases • More organized than News-Feed but still requires active interaction
  12. Insight: users spending more time on editorial Browse playlists than Discover.
  13. Idea: combine the personalized experience of Discover with the lean- back ease of Browse
  14. Meanwhile… 2014 Year In Music
  15. Play it forward: Same content as the Discover Page but.. a playlist
  16. Lesson 1: Be data driven from start to finish
  17. Slide from Dan McKinley - Etsy 2008 2012 2015
  18. • Reach: How many users are you reaching • Depth: For the users you reach, what is the depth of reach. • Retention: For the users you reach, how many do you retain? Define success metrics BEFORE you release your test
  19. • Reach: DW WAU / Spotify WAU • Depth: DW Time Spent / Spotify WAU • Retention: DW week-over-week retention Discover Weekly Key Success Metrics
  20. 2008 2012 2015 Slide from Dan McKinley - Etsy
  21. Step 1: Prototype (employee test)
  22. Step 1: Prototype (employee test)
  23. Results of Employee Test were very positive!
  24. 2008 2012 2015 Slide from Dan McKinley - Etsy
  25. Step 2: Release AB Test to 1% of Users
  26. Google Form 1% Results
  27. Personalized image resulted in 10% lift in WAU • Initial 0.5% user test • 1% Spaceman image • 1% Personalized image
  28. Lesson 2: Reuse existing infrastructure in creative ways
  29. Discover Weekly Data Flow
  30. Recommendation Models
  31. 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of plays, context, and recency as weight X YUsers Songs • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector [1] Hu Y. & Koren Y. & Volinsky C. (2008) Collaborative Filtering for Implicit Feedback Datasets 8th IEEE International Conference on Data Mining Implicit Matrix Factorization
  32. 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 •Aggregate all (user, track) streams into a large matrix •Goal: Model probability of user playing a song as logistic, then maximize log likelihood of binary preference matrix, weighting positive observations by a function of plays, context, and recency X YUsers Songs • = bias for user • = bias for item • = regularization parameter • = user latent factor vector • = item latent factor vector [2] Johnson C. (2014) Logistic Matrix Factorization for Implicit Feedback Data NIPS Workshop on Distributed Matrix Computations Can also use Logistic Loss!
  33. NLP Models on News and Blogs
  34. Playlist itself is a document Songs in playlist are words NLP Models work great on Playlists!
  35. [3] http://benanne.github.io/2014/08/05/spotify-cnns.html Deep Learning on Audio
  36. •normalized item-vectors Songs in a Latent Space representation
  37. •user-vector in same space Songs in a Latent Space representation
  38. Lesson 3: Don’t scale until you need to
  39. Scaling to 100%: Rollout Challenges ‣Create and publish 75M playlists every week ‣Downloading and processing Facebook images ‣Language translations
  40. Scaling to 100%: Weekly refresh ‣Time sensitive updates ‣Refresh 75M playlists every Sunday night ‣Take timezones into account
  41. Discover Weekly publishing flow
  42. What’s next? Iterating on content quality and interface enhancements
  43. Iterating on quality and adding a feedback loop.
  44. DW feedback comes at the expense of presentation bias.
  45. Lesson 4: Users know best. In the end, AB Test everything!
  46. Lesson 5 (final lesson!): Empower bottom-up innovation in your org and amazing things will happen.
  47. Thank You! (btw, we’re hiring Machine Learning and Data Engineers, come chat with us!)

×