Hypergraph models of
playlist dialects


Brian McFee
Center for Jazz Studies/LabROSA
(Laboratory for the Recognition and Organization of Speech and Audio)
Columbia University

Gert Lanckriet
Electrical & Computer Engineering
University of California, San Diego
Automatic playlist generation
Evaluating playlist algorithms [M. & Lanckriet, 2011]

 1. Observe playlists from users
 2. Compute playlist likelihoods
 3. Compare algorithms by likelihood scores
Evaluating playlist algorithms [M. & Lanckriet, 2011]

 Key idea: Playlist algorithm = Probability distribution over song sequences
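
The key idea can be illustrated with a minimal sketch: if each algorithm is a function that returns the log-probability of a song sequence, two algorithms can be compared by their average log-likelihood on observed playlists. The toy playlists, the song ids, and the smoothed "popularity" model are assumptions made for this example; the slides do not specify them.

```python
import math
from collections import Counter

def uniform_logprob(playlist, n_songs):
    """Log-probability when every song is drawn uniformly at random."""
    return -len(playlist) * math.log(n_songs)

def make_popularity_logprob(train_playlists, n_songs, alpha=1.0):
    """Hypothetical comparison model: add-alpha smoothed song frequencies."""
    counts = Counter(s for p in train_playlists for s in p)
    total = sum(counts.values()) + alpha * n_songs

    def logprob(playlist):
        return sum(math.log((counts[s] + alpha) / total) for s in playlist)

    return logprob

def average_log_likelihood(logprob, playlists):
    """Average per-playlist log-likelihood of the observed playlists."""
    return sum(logprob(p) for p in playlists) / len(playlists)

# Toy data (assumption): songs are integer ids 0..4.
n_songs = 5
train = [[0, 1, 0], [0, 1], [1, 0, 2]]
test = [[0, 1], [1, 0, 0]]

popularity = make_popularity_logprob(train, n_songs)
score_uniform = average_log_likelihood(lambda p: uniform_logprob(p, n_songs), test)
score_popularity = average_log_likelihood(popularity, test)
print(score_uniform, score_popularity)
```

The algorithm with the higher score assigns more probability to the playlists users actually made.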
Modeling playlist diversity

 [Figure: the set of playlists, divided into groups such as Road trip, Mixed genre, Party mix, and Hip-hop]
Data collection
                                        http://www.artofthemix.org/




 Started in 1998, users upload and share playlists

 [Ellis, Whitman, Berenzweig, and Lawrence, ISMIR 2002]
The data: AotM-2011



• 98K songs indexed to Million Song Dataset


• 87K playlists (1998-2011), ~210K contiguous segments


• 40 playlist categories, user meta-data available
# Playlists per category

 [Figure: bar chart of the number of playlists per category, log-scale axis from 100 to 10^5. Categories in order: Mixed genre, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Depression, Punk, Break-up, Narrative, Hip-hop, Sleep, Dance-house, Electronic, Rhythm & blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Ambient, Blues]

• Majority of playlists are Mixed genre

• Remaining categories: contextual/mood, genre, other
Our goals



• Which categories can we model? Are some harder than others?

• Which features are useful for playlist generation?

• Do transitions matter? Are some categories less diverse?
A simple playlist model

  1. Start with a set of songs
  2. Select a subset (e.g., jazz songs)
  3. Select a song
  4. Find subsets containing the current song
  5. Select a new subset
  6. Select a new song
  7. Repeat...
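
The steps above can be sketched as a short sampler. This is an illustration under assumptions rather than the authors' implementation: the function name `sample_playlist`, the toy subsets, and their weights are hypothetical, subsets are chosen in proportion to their weight, and the next song is drawn uniformly from the chosen subset excluding the current song.

```python
import random

def sample_playlist(edges, weights, length, rng):
    """Sample a playlist by walking over weighted song subsets.

    edges:   dict mapping subset name -> set of song ids
    weights: dict mapping subset name -> non-negative weight
    Returns the playlist and the subset used to choose each song.
    """
    names = sorted(edges)
    # Select an initial subset by weight, then a song uniformly within it.
    edge = rng.choices(names, weights=[weights[e] for e in names])[0]
    song = rng.choice(sorted(edges[edge]))
    playlist, trace = [song], [edge]
    while len(playlist) < length:
        # Find subsets containing the current song that offer another song.
        candidates = [e for e in names
                      if song in edges[e] and len(edges[e]) > 1 and weights[e] > 0]
        if not candidates:
            break
        # Select a new subset, then a new song (assumption: no immediate repeats).
        edge = rng.choices(candidates, weights=[weights[e] for e in candidates])[0]
        song = rng.choice(sorted(edges[edge] - {song}))
        playlist.append(song)
        trace.append(edge)
    return playlist, trace

# Toy subsets and weights (assumptions for illustration only).
edges = {
    "jazz": {"a", "b", "c"},
    "1959": {"b", "d"},
    "all": {"a", "b", "c", "d", "e"},
}
weights = {"jazz": 2.0, "1959": 1.0, "all": 0.5}

playlist, trace = sample_playlist(edges, weights, 5, random.Random(0))
for edge, song in zip(trace, playlist):
    print(edge, song)
```

Printing the subset next to each song shows why the song was chosen, which is the sense in which the subset labels make the output interpretable.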
Connecting the dots...


• Random walk on a hypergraph
 - Vertices = songs
 - Edges = subsets



• Learning: optimize edge weights from example playlists



• Sampling is efficient, edge labels provide transparency
The hypergraph random walk model

 [Figure: graphical model relating the exp. prior, edge weights, transitions, and playlists]
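
One way to write the model down is given below. The notation is assumed for this sketch and does not come from the slides: w_e is the weight of edge e, x_t is the song at position t, [.] is an indicator, and λ is the rate of the exponential prior. The uniform choice within an edge, excluding the current song, is also an assumption, and it requires every edge to contain at least two songs.

```latex
% Assumed notation: w_e >= 0 is the weight of edge e, x_t the song at step t,
% [.] the indicator function, S the set of training playlists.
\begin{align}
P(x_{t+1} \mid x_t, w) &= \sum_{e} P(e \mid x_t, w)\, P(x_{t+1} \mid e, x_t) \\
P(e \mid x_t, w) &= \frac{[x_t \in e]\, w_e}{\sum_{e'} [x_t \in e']\, w_{e'}} \\
P(x_{t+1} \mid e, x_t) &= \frac{[x_{t+1} \in e]\,[x_{t+1} \neq x_t]}{|e| - 1} \\
p(w_e) &= \lambda \exp(-\lambda w_e), \qquad w_e \ge 0 \\
\hat{w} &= \operatorname*{arg\,max}_{w \ge 0} \; \sum_{s \in S} \log P(s \mid w) \;-\; \lambda \sum_{e} w_e
\end{align}
```

Under these assumptions, learning the edge weights amounts to maximizing the training log-likelihood with a linear penalty on the weights, which comes from the exponential prior.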
Edge construction: example

• Audio: cluster songs by timbre

 [Figure: songs grouped into clusters Audio-1, Audio-2, Audio-3, Audio-4]

• Multiple clusterings (k=16, 64, 256)
Edge construction: the kitchen sink

• Audio
• MSD taste profile
• Era
• Familiarity
• Lyrics
• Social tags
• Uniform shuffle
• Conjunctions: "TAG_jazz-&-YEAR_1959"
• 6390 edges, 98K vertices (songs)
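
A minimal sketch of how such edges might be assembled from song metadata follows. The song records, the `build_edges` and `add_conjunctions` helpers, and the representation of "Uniform shuffle" as a single edge containing every song are assumptions for illustration; only the conjunction label format is taken from the slide.

```python
from itertools import product

# Toy metadata (assumption): each song has a set of tags and a year.
songs = {
    "s1": {"tags": {"jazz"}, "year": 1959},
    "s2": {"tags": {"jazz", "bebop"}, "year": 1959},
    "s3": {"tags": {"jazz"}, "year": 1964},
    "s4": {"tags": {"rock"}, "year": 1959},
}

def build_edges(songs):
    """Create one edge (set of song ids) per tag and per year."""
    edges = {"UNIFORM": set(songs)}  # assumed: one edge containing every song
    for sid, meta in songs.items():
        for tag in meta["tags"]:
            edges.setdefault(f"TAG_{tag}", set()).add(sid)
        edges.setdefault(f"YEAR_{meta['year']}", set()).add(sid)
    return edges

def add_conjunctions(edges, prefix_a, prefix_b):
    """Add an edge for each non-empty intersection of an A-edge and a B-edge."""
    a_names = [e for e in edges if e.startswith(prefix_a)]
    b_names = [e for e in edges if e.startswith(prefix_b)]
    out = dict(edges)
    for a, b in product(a_names, b_names):
        members = edges[a] & edges[b]
        if members:
            out[f"{a}-&-{b}"] = members
    return out

edges = add_conjunctions(build_edges(songs), "TAG_", "YEAR_")
print(sorted(edges["TAG_jazz-&-YEAR_1959"]))
```

Dropping empty intersections keeps the number of edges far below the number of possible label pairs.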
Evaluation protocol


• Repeat x10:
 - Split playlist collection into 75% train/25% test
 - Learn edge weights on training playlists
 - Evaluate average likelihood of test playlists


• Compare gain in likelihood over uniform shuffle baseline
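
The protocol above can be sketched as follows. Several details are assumptions: the gain is computed as the relative improvement in average log-likelihood over the uniform baseline (one plausible reading of a percentage gain), and `fit_popularity` is a hypothetical stand-in for learning edge weights, used only so the example runs.

```python
import math
import random
from collections import Counter

N_SONGS = 6  # toy vocabulary size (assumption)

def fit_uniform(train):
    """Uniform shuffle baseline: every song equally likely."""
    return lambda p: -len(p) * math.log(N_SONGS)

def fit_popularity(train, alpha=1.0):
    """Hypothetical stand-in model: add-alpha smoothed song frequencies."""
    counts = Counter(s for p in train for s in p)
    total = sum(counts.values()) + alpha * N_SONGS
    return lambda p: sum(math.log((counts[s] + alpha) / total) for s in p)

def relative_gains(playlists, fit, n_repeats=10, train_frac=0.75, seed=0):
    """Repeat random train/test splits; return the gain over uniform per split."""
    rng = random.Random(seed)
    baseline = fit_uniform(None)
    gains = []
    for _ in range(n_repeats):
        shuffled = list(playlists)
        rng.shuffle(shuffled)
        cut = int(train_frac * len(shuffled))
        train, test = shuffled[:cut], shuffled[cut:]
        logprob = fit(train)  # learn on the training playlists only
        model = sum(logprob(p) for p in test) / len(test)
        base = sum(baseline(p) for p in test) / len(test)
        gains.append((model - base) / abs(base))
    return gains

# Toy playlists over song ids 0..5 (assumption).
playlists = [[0, 1, 0], [0, 1], [1, 0, 2], [3, 4],
             [0, 0, 1], [5, 1], [1, 0], [2, 0, 1]]
uniform_gains = relative_gains(playlists, fit_uniform)
popularity_gains = relative_gains(playlists, fit_popularity)
print(sum(popularity_gains) / len(popularity_gains))
```

By construction the uniform model has zero gain on every split, so any positive value reflects structure the model learned from the training playlists.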
Experiment 1: global vs. categorical


• Fit one model per category

• Fit one global model to all categories

• Test on each category and compare likelihoods


• Question:
       When does categorical training improve accuracy?
Experiment 1: global vs. categorical

 [Figure: log-likelihood gain over uniform shuffle (axis 0% to 25%) for the global model and the category-specific model. Rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues]

• Largest gains for genre playlists

• No change for "hard" categories (e.g., Mixed, Alternating DJ, Theme)
Experiment 1: learned edge weights

 [Figure: learned edge weights by feature group (columns: Audio, CF, Era, Familiarity, Lyrics, Tags, Uniform) for each category (rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic music, Dance-house, Rhythm and Blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Reggae, Blues)]
Experiment 2: continuity?

• Do we need to model playlist continuity?

• Simplified model:
  - ignore transitions
  - choose each edge IID

 [Figure: graphical model relating the exp. prior, edge weights, songs, and playlists]

• Question:
   Are some categories more diverse than others?
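
To make the comparison concrete, the sketch below scores the same playlist under a transition-aware walk and under the simplified IID model. The toy edges and weights are assumptions, as is the choice to draw the next song uniformly from the selected edge while excluding the current song; the slides do not give these details.

```python
import math

def iid_song_prob(song, edges, weights):
    """Simplified model: pick an edge IID by weight, then a song uniformly."""
    total = sum(weights.values())
    return sum(weights[e] / total / len(m) for e, m in edges.items() if song in m)

def transition_prob(prev, song, edges, weights):
    """Walk model: pick an edge containing prev, then a different song in it."""
    usable = {e: m for e, m in edges.items()
              if prev in m and len(m) > 1 and weights[e] > 0}
    total = sum(weights[e] for e in usable)
    if song == prev or total == 0:
        return 0.0
    return sum(weights[e] / total / (len(m) - 1)
               for e, m in usable.items() if song in m)

def loglik_iid(playlist, edges, weights):
    return sum(math.log(iid_song_prob(s, edges, weights)) for s in playlist)

def loglik_walk(playlist, edges, weights):
    # Assumes no immediate repeats and that every transition has non-zero
    # probability; otherwise math.log would raise a ValueError.
    ll = math.log(iid_song_prob(playlist[0], edges, weights))
    for prev, song in zip(playlist, playlist[1:]):
        ll += math.log(transition_prob(prev, song, edges, weights))
    return ll

# Toy hypergraph (assumption): two genre edges plus one edge with every song.
edges = {
    "jazz": {"a", "b", "c"},
    "rock": {"d", "e", "f"},
    "all": {"a", "b", "c", "d", "e", "f"},
}
weights = {"jazz": 1.0, "rock": 1.0, "all": 0.2}

coherent = ["a", "b", "c"]  # stays within one genre
jumpy = ["a", "d", "b"]     # switches genre at every step
print(loglik_walk(coherent, edges, weights), loglik_iid(coherent, edges, weights))
print(loglik_walk(jumpy, edges, weights), loglik_iid(jumpy, edges, weights))
```

In this toy setting the walk model scores the coherent playlist higher than the IID model does, and the jumpy playlist lower, which is the kind of difference the experiment measures per category.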
Experiment 2: continuity

 [Figure: log-likelihood gain over uniform shuffle (axis -15% to 20%) for the global model and the category-specific model. Rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues]

• Most categories exhibit both continuity AND diversity

• Transitions are important!
Example playlists
                       Rhythm & Blues
  EDGE                          SONG
  70s & soul                    Lyn Collins - Think
  Audio #14 & funk              Isaac Hayes - No Name Bar
  DECADE 1965 & soul            Michael Jackson - My Girl


                       Electronic music
  EDGE                          SONG
  Audio #11 & downtempo         Everything but the Girl - Blame
  DECADE 1990 & trip-hop        Massive Attack - Spying Glass
  Audio #11 & electronica       Björk - Hunter
Conclusions


• Category-specific models outperform global playlist models.

• Continuity matters!

• Proposed model is simple, efficient, and transparent

• AotM-2011 dataset available now!
  http://cosmal.ucsd.edu/cal/projects/aotm2011
Obrigado! (Thank you!)
