Exploiting Twitter’s Collective Knowledge for Music Recommendations


  1. Exploiting Twitter’s Collective Knowledge for Music Recommendations
     Eva Zangerle, Wolfgang Gassler and Günther Specht
  2. Outline
     • Motivation
     • Data set creation
     • Artist and track resolution
     • Music recommendation
     • Future directions
     • Conclusion
  3. Motivation
     • Twitter as a source for music recommendations
     • User stream: the set of #nowplaying tweets of one user (representing that user's preferences in music)
  4. Data Set Creation
     • Crawling Twitter for keywords nowplaying, listeningto, listento, etc.
     • 5 million tweets from 07/2011 to 02/2012
     • User stream analysis:

       Tweets in stream    Users
       1                   457,657
       > 3                 196,422
       > 10                63,017
       > 100               3,190
       > 1,000             253
       > 10,000            5
  5. Track Resolution
       listening to Hey Hey My My (Out Of The Blue) by Neil Young on @Grooveshark: #nowplaying #musicmonday http://t.co/7os3eeA
       #nowplaying @Lloyd_YG ft. @LilTunechi - You
     • Problem: extraction of
       – Title of the track
       – Artist performing the track
       – Metainformation (links, Twitter accounts, etc.)
       -> Reference Database (FreeDB or MusicBrainz)
  6. Track Resolution
     • Matching of tweet content to reference DB
     • Fulltext index (Lucene, tf/idf + cosine sim)
     • Custom similarity measure: $\mathrm{sim}(\mathit{tweet}, \mathit{track}) = \frac{|\mathit{tweet} \cap \mathit{track}|}{|\mathit{track}|}$
     • Query: listening to Hey Hey My My (Out Of The Blue) by Neil Young on @Grooveshark: #nowplaying #musicmonday http://t.co/7os3eeA
       MusicBrainz track = Hey Hey My My (Out of the Blue)
       MusicBrainz artist = Neil Young
  7. Evaluation of Resolution Process
     • Ground truth data set (100 tweets)
     • MusicBrainz and FreeDB tracks assigned manually
     • Automatically assigned tracks (custom similarity > 0.8)
     • Matched tracks:

       RefDB          Manually    Automatically    False Positive
       MusicBrainz    59          43 (73%)         5 (10%)
       FreeDB         57          31 (54%)         18 (36%)

     • FreeDB very noisy -> many false positives
  8. Use Case: Music Recommendation
     • Recommendation based on co-occurrence analysis on user streams
     • Evaluation through comparison with last.fm
     • 79% coverage of co-occurrence rules
     • Top 10 recs: only 1% coverage
     • Sparsity!
  9. Problems & Future Directions
     • Sparsity of Data
       – User stream crawling (more complete user streams)
       – Exploit Metadata (URLs, …)
     • Matching Process
       – Tweets <-> refDB tracks
       – refDB tracks <-> last.fm tracks
       – Last.fm similar tracks <-> refDB tracks
  10. Conclusion
     • Data Set Creation
       – Crawl Twitter
       – Reference database
       – Matching of tracks
     • Music recommendation by co-occurrence analysis
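
The custom similarity measure from the track resolution slide (the share of a reference track's terms that also appear in the tweet) could be sketched roughly as follows. This is only an illustration under assumptions the slides do not confirm: tweet and track are treated as sets of lowercased word tokens, artist and title are joined into one string, and the names `tokenize`, `similarity` and `best_match` are hypothetical. The 0.8 threshold is the one quoted on the evaluation slide.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(tweet, track):
    """Fraction of the track's tokens that also occur in the tweet."""
    track_tokens = tokenize(track)
    if not track_tokens:
        return 0.0
    return len(tokenize(tweet) & track_tokens) / len(track_tokens)


def best_match(tweet, tracks, threshold=0.8):
    """Return the (artist, title) pair with the highest similarity above the threshold, else None."""
    scored = [(similarity(tweet, f"{artist} {title}"), artist, title)
              for artist, title in tracks]
    if not scored:
        return None
    score, artist, title = max(scored, key=lambda item: item[0])
    return (artist, title) if score > threshold else None


tweet = ("listening to Hey Hey My My (Out Of The Blue) by Neil Young "
         "on @Grooveshark: #nowplaying #musicmonday http://t.co/7os3eeA")
candidates = [
    ("Neil Young", "Hey Hey My My (Out of the Blue)"),
    ("Neil Young", "Heart of Gold"),
]
print(best_match(tweet, candidates))
```

Because the measure divides by the size of the track's term set only, extra words in the tweet (hashtags, links, account names) do not lower the score. In practice the candidate list would presumably come from the fulltext index rather than from a hard-coded list.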
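
Recommendation by co-occurrence analysis on user streams, as named in the use case slide, might look like the minimal sketch below. The slides do not specify how the co-occurrence rules are built or ranked, so this assumes each user stream is a list of already resolved track identifiers, counts every pair of tracks once per stream, and ranks by raw count. The names `cooccurrence_counts` and `recommend` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(user_streams):
    """For each track, count how many user streams it shares with every other track."""
    counts = defaultdict(Counter)
    for tracks in user_streams.values():
        # Deduplicate so a track repeated within one stream is counted once.
        for a, b in combinations(sorted(set(tracks)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(track, counts, top_n=10):
    """Return up to top_n tracks that co-occur most often with the given track."""
    ranked = sorted(counts.get(track, {}).items(),
                    key=lambda item: (-item[1], item[0]))
    return [other for other, _ in ranked[:top_n]]


streams = {
    "user1": ["A", "B", "C"],
    "user2": ["A", "B"],
    "user3": ["A", "C", "C"],
    "user4": ["D"],
}
counts = cooccurrence_counts(streams)
print(recommend("B", counts))
```

A stream containing a single track, like `user4` here, contributes no pairs at all, which gives a small-scale picture of the sparsity problem the slides point out.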
