More Like This: Machine Learning Approaches to Music Similarity

Brian McFee
Computer Science & Engineering
University of California, San Diego
Music discovery in days of yore...
Music discovery 2.0: the present



• ~20 million songs available

• Discovery is still largely human-powered
A Google for music?




• Standard text search can work with meta-data
• Can we predict meta-data from audio?
 - [Turnbull, 2008], [Barrington, 2011]
Query by example

• Natural, user-friendly alternative to text search
This talk

• Learning algorithms for QBE, geared toward music discovery

• We'll look at two consumption models:
  - Active browsing (search & ranking)
  - Passive listening (playlist generation)

• Evaluation derived from user behavior
Learning similarity
Defining similarity: semantics?

• Song similarity = tag similarity?

• Drawbacks:
  - Choosing, weighting vocabulary is surprisingly difficult
  - Hard to maintain quality at scale
Defining similarity: human judgements?
                                    [M. & Lanckriet, 2009, 2011]
• Which is more similar?




• Drawbacks: ambiguity, subjectivity, scale
Collaborative filter similarity




• Collect listening histories for (lots of!) users

• Song similarity = fraction of users two songs have in common
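
One way to make "users in common" concrete is the Jaccard overlap between the listener sets of two songs. The slide does not name the exact measure, so the Jaccard choice, the function name `cf_similarity`, and the toy `histories` data below are assumptions for illustration only.

```python
def cf_similarity(listeners_a, listeners_b):
    """Jaccard overlap between the sets of users who played each song.

    Assumption: the slide only says "users in common"; Jaccard is one
    plausible normalization, not necessarily the one used in the talk.
    """
    a, b = set(listeners_a), set(listeners_b)
    union = a | b
    if not union:
        return 0.0  # neither song has any listeners
    return len(a & b) / len(union)


# Hypothetical listening histories: song -> set of user ids.
histories = {
    "song_a": {"u1", "u2", "u3"},
    "song_b": {"u2", "u3", "u4"},
    "song_c": {"u5"},
}

sim_ab = cf_similarity(histories["song_a"], histories["song_b"])  # shares u2, u3
sim_ac = cf_similarity(histories["song_a"], histories["song_c"])  # shares nobody
```

A song with few or no listeners scores near zero against everything under this measure.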
Collaborative filter similarity
• Collaborative filters perform well...
 - ... for tagging [Kim, Tomasik, & Turnbull, 2009]
 - ... and playlisting [Barrington, Oda, & Lanckriet, 2009]
 - ... and recommendation (Yahoo, Last.fm, iTunes...)



• Implicit feedback requires no additional effort from users



• ... but fails on unpopular items: the cold start problem!
Learning from a collaborative filter
                [M., Barrington, & Lanckriet, 2010, 2012]

[Figure: three-step diagram (steps 1 to 3)]
Metric learning to rank

• The goal: rankings in audio space = rankings in CF space
Metric learning to rank
                                         [M. & Lanckriet, 2010]
• The goal: ranking by (learned) distance = target rankings


• Optimize a linear transformation for ranking
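
A minimal sketch of how a linear transformation changes a distance-based ranking. The matrix `stretch_second` stands in for a learned transformation and is made up for illustration; the helper names and the toy database are likewise assumptions, not taken from the talk.

```python
def transform(L, x):
    """Apply the linear map L (a list of rows) to feature vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def learned_distance(L, q, x):
    """Squared Euclidean distance after mapping both points through L.

    This equals (q - x)^T W (q - x) with W = L^T L, which is always PSD.
    """
    lq, lx = transform(L, q), transform(L, x)
    return sum((a - b) ** 2 for a, b in zip(lq, lx))


def rank_by_distance(L, q, database):
    """Return database keys sorted from nearest to farthest from q."""
    return sorted(database, key=lambda k: learned_distance(L, q, database[k]))


database = {"x1": [1.0, 0.0], "x2": [0.0, 0.5]}
query = [0.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]        # plain Euclidean distance
stretch_second = [[1.0, 0.0], [0.0, 4.0]]  # hypothetical "learned" L

before = rank_by_distance(identity, query, database)
after = rank_by_distance(stretch_second, query, database)
```

Stretching the second feature pushes `x2` away from the query, so the two rankings differ. Learning chooses the transformation so that the resulting order matches the target rankings.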
Structure prediction: nearest neighbors

• Setup: database X, rankings y_q

• PSD matrix W transforms features

• Order x ∈ X by distance from q: d_W(q, x) = (q − x)ᵀ W (q − x)

• ψ(q, y) encodes each (query, ranking) pair
Metric learning to rank (MLR)




  Score for target ranking  >  Score for any other ranking  +  Prediction error

• Supported losses Δ:
              AUC, KNN, MAP, MRR, NDCG, Prec@k
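
Three of the listed ranking measures, sketched for a single query. Turning a score into a loss as Δ = 1 − score is an assumption here, as are the function names and the toy ranking.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant result (0 if there is none)."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def auc(ranking, relevant):
    """Fraction of (relevant, irrelevant) pairs placed in the right order."""
    n_rel = sum(1 for item in ranking if item in relevant)
    n_irr = len(ranking) - n_rel
    if n_rel == 0 or n_irr == 0:
        raise ValueError("AUC needs at least one relevant and one irrelevant item")
    correct, irrelevant_seen = 0, 0
    for item in ranking:
        if item in relevant:
            # Every irrelevant item not yet seen is ranked below this one.
            correct += n_irr - irrelevant_seen
        else:
            irrelevant_seen += 1
    return correct / (n_rel * n_irr)


ranking = ["a", "b", "c", "d"]   # hypothetical predicted order
relevant = {"a", "c"}            # hypothetical ground truth

p_at_2 = precision_at_k(ranking, relevant, 2)
mrr = reciprocal_rank(ranking, relevant)
auc_score = auc(ranking, relevant)
auc_loss = 1.0 - auc_score       # assumed form of the loss Δ
```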
MLR solver
• Cutting-plane algorithm based on 1-slack Structural SVM
 [Joachims, et al. 2009]

• Repeat until convergence:
  1. Constraint generation (DP)
  2. Semi-definite programming (sequence of QPs)


• Multiple kernel extensions:
  [Galleguillos, M., Belongie, & Lanckriet 2011]
Audio pipeline

    Audio signal
      → 1. Feature extraction          → Bag of ΔMFCCs
      → 2. Vector quantization         → Codeword hist.
      → 3. Probability product kernel  → PPK
Audio pipeline

    Audio signal  → PPK → (features)    → MLR
    CF similarity       → (supervision) → MLR
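
A rough sketch of steps 2 and 3 of the pipeline: quantize a bag of feature frames against a codebook, then compare two songs with a probability product kernel. The exponent 1/2 (the Bhattacharyya coefficient) is an assumption, since the slides do not state it. The two-codeword codebook and the 2-D frames are toy stand-ins for the ΔMFCC vectors.

```python
import math


def nearest_codeword(frame, codebook):
    """Index of the codeword closest (squared Euclidean) to one frame."""
    def sq_dist(codeword):
        return sum((f - c) ** 2 for f, c in zip(frame, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def codeword_histogram(frames, codebook):
    """Normalized histogram of codeword assignments for a bag of frames."""
    if not frames:
        raise ValueError("need at least one frame")
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    return [c / len(frames) for c in counts]


def probability_product_kernel(p, q):
    """PPK with exponent 1/2 (assumed): sum_i sqrt(p_i * q_i)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


codebook = [[0.0, 0.0], [1.0, 1.0]]                                 # toy codebook
song_a = [[0.1, 0.0], [0.9, 1.0], [0.0, 0.2], [1.0, 1.0]]           # toy frames
song_b = [[0.0, 0.0], [0.1, 0.1]]

hist_a = codeword_histogram(song_a, codebook)
hist_b = codeword_histogram(song_b, codebook)
k_aa = probability_product_kernel(hist_a, hist_a)
k_ab = probability_product_kernel(hist_a, hist_b)
```

With this exponent a histogram has kernel value 1 with itself, and the value falls toward 0 as two songs use disjoint codewords.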
Evaluation: CAL10K

• Last.fm collaborative filter                    [Celma, 2008]
 - 360K users, 186K artists

• CAL10K songs                     [Tingle, Turnbull, & Kim, 2010]
  - 5.4K songs, 2K artists (after CF matching)


• Evaluation:
  - Split artists into train/val/test
 - Target rankings: top-10 most similar train artists
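
Splitting by artist rather than by song keeps every song of an artist on one side of the split. A minimal sketch follows; the 60/20/20 fractions, the seed, and the toy catalogue are hypothetical, since the slides do not give the actual proportions.

```python
import random


def split_by_artist(song_artist, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign whole artists to train/val/test, then map songs accordingly.

    `fractions` are illustrative defaults, not the values used in the talk.
    """
    artists = sorted(set(song_artist.values()))
    random.Random(seed).shuffle(artists)
    n_train = int(fractions[0] * len(artists))
    n_val = int(fractions[1] * len(artists))
    groups = {
        "train": set(artists[:n_train]),
        "val": set(artists[n_train:n_train + n_val]),
        "test": set(artists[n_train + n_val:]),
    }
    return {
        name: sorted(s for s, a in song_artist.items() if a in group)
        for name, group in groups.items()
    }


# Hypothetical catalogue: 10 artists with 2 songs each.
song_artist = {f"song_{i}": f"artist_{i // 2}" for i in range(20)}
splits = split_by_artist(song_artist)
```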
Evaluation: comparison

• Gaussian mixture models + KL divergence
 - 8-component, diagonal-covariance GMM per song

• Auto-tags: predict 149 semantic tags from audio
  [Turnbull, 2008]


• [Our method] VQ+MLR: 1024 codewords

• Expert tags: 1053 tags from Pandora
  [Tingle, et al., 2009]
Similarity learning: results

[Figure: bar chart of AUC (x-axis 0.65 to 0.95) per method]
Methods: GMM (KL), Auto-tags, Auto-tags + MLR, Audio VQ, Audio VQ + MLR, Expert tags (cos), Expert tags + MLR
Example playlists
 The Ramones - Go Mental

 Def Leppard - Promises
 The Buzzcocks - Harmony In My Head
 Los Lonely Boys - Roses
 Wolfmother - Colossal
 Judas Priest - Diamonds and Rust (live)



 MLR:
 The Buzzcocks - Harmony In My Head
 Mötley Crüe - Same Ol' Situation
 The Offspring - Gotta Get Away
 The Misfits - Skulls
 AC/DC - Who Made Who (live)
Example playlists
 Fats Waller - Winter Weather

 Dizzy Gillespie - She's Funny That Way
 Enrique Morente - Solea
 Chet Atkins - In the Mood
 Rachmaninov - Piano Concerto #4
 Eluvium - Radio Ballet


 Chet Atkins - In the Mood
 Charlie Parker - What Is This Thing Called Love?
 Bud Powell - Oblivion
 Bob Wills & His Texas Playboys - Lyla Lou
 Bob Wills & His Texas Playboys - Sittin' On Top of the World
Scaling up: fast retrieval
                                            [M. & Lanckriet, 2011]

• Audio similarity search for a million songs?



• Idea: Index data with spatial trees



• 100-NN search over 900K songs:
  - Brute force: 2.4s
  - 50% recall: 0.14s (17x speedup)
  - 20% recall: 0.02s (120x speedup)
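
A sketch of the recall/speed trade-off with a spatial tree. The slide does not say which tree was used, so the KD-tree here is an assumption. The search descends to a single leaf and ranks only the points stored there, which trades recall for speed. The random data, the leaf size, and all names are illustrative.

```python
import random


def build_tree(points, indices=None, depth=0, leaf_size=8):
    """Recursively split point indices at the median of one coordinate."""
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= leaf_size:
        return {"leaf": indices}
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return {
        "axis": axis,
        "threshold": points[indices[mid]][axis],
        "left": build_tree(points, indices[:mid], depth + 1, leaf_size),
        "right": build_tree(points, indices[mid:], depth + 1, leaf_size),
    }


def candidates(tree, query):
    """Follow the query down to one leaf and return the indices stored there."""
    node = tree
    while "leaf" not in node:
        side = "left" if query[node["axis"]] < node["threshold"] else "right"
        node = node[side]
    return node["leaf"]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(points, query, k, indices):
    """Brute-force k nearest neighbors restricted to `indices`."""
    return sorted(indices, key=lambda i: sq_dist(points[i], query))[:k]


rng = random.Random(0)
points = [[rng.random() for _ in range(4)] for _ in range(512)]
tree = build_tree(points, leaf_size=64)
query = [0.5, 0.5, 0.5, 0.5]

exact = knn(points, query, 10, range(len(points)))     # scans all 512 points
approx = knn(points, query, 10, candidates(tree, query))  # scans one leaf only
recall = len(set(exact) & set(approx)) / len(exact)
```

Larger leaves raise recall at the cost of more distance computations per query.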
Similarity learning: summary

• Collaborative filters provide user-centric music similarity

• CF similarity can be approximated by audio features

• Audio search can be done quickly at large scale
Playlist generation

• Goal: generate a "good" song sequence
 - Music auto-pilot (given context)



• Many existing algorithms, but no standard evaluation



• What makes one algorithm better than another?
Playlist evaluation 1: Human survey

• Idea: generate playlists, ask for opinions



• Impractical at large scale:
   - Huge search space
   - User taste, expertise can be problematic
   - Slow, expensive



• Does not facilitate rapid evaluation and optimization
Playlist evaluation 2: Information retrieval


• Idea:
 - Define "good" and "bad" playlists
 - Predict the next song, measure accuracy

• But what makes a bad playlist?


• Do users agree on good/bad?
A generative approach
                                           [M. & Lanckriet, 2011b]




• Playlist algorithm = distribution over playlists

• Don't evaluate synthetic playlists

• Do evaluate the likelihood of generating real playlists
The playlist collection: AOTM-2011

• Art of the Mix
 - 13 years of playlists
 - ~210K playlist segments
 - ~100K songs from MSD



• Top 25 playlist categories:
  - Genre:        Punk, Hip-hop, Reggae...
  - Context:     Road trip, Break-up, Sleep...
  - Other:       Mixed genre, Alternating DJ...
A simple playlist model

  1. Start with a set of songs
  2. Select a subset (e.g., jazz songs)
  3. Select a song
  4. Select a new subset
  5. Select a new song
  6. Repeat...
Connecting the dots...


• Random walk on a hypergraph
 - Vertices = songs
 - Edges = subsets

• Edges derived from:
  - Audio clusters, tags, lyrics, era, popularity, CF
  - or combinations/intersections

• Goal: optimize edge weights from example playlists
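
A minimal sketch of one transition of the random walk, under an assumed form of the model: from the current song, pick an edge that contains it with probability proportional to the edge weight, then pick the next song uniformly from that edge. The exact transition rule is not spelled out on the slides, and the edge names, weights, and song ids below are hypothetical.

```python
import math


def transition_prob(edges, weights, current, nxt):
    """P(nxt | current) under the assumed two-stage rule described above."""
    containing = [e for e in edges if current in edges[e]]
    total = sum(weights[e] for e in containing)
    if total == 0:
        return 0.0
    prob = 0.0
    for e in containing:
        if nxt in edges[e]:
            prob += (weights[e] / total) * (1.0 / len(edges[e]))
    return prob


def playlist_log_likelihood(edges, weights, playlist):
    """Sum of log transition probabilities along consecutive songs.

    The term for the first song is left out to keep the sketch short.
    """
    total = 0.0
    for a, b in zip(playlist, playlist[1:]):
        p = transition_prob(edges, weights, a, b)
        if p == 0.0:
            return float("-inf")  # the model cannot generate this playlist
        total += math.log(p)
    return total


edges = {
    "jazz": {"s1", "s2"},
    "1960s": {"s2", "s3"},
    "all": {"s1", "s2", "s3"},
}
weights = {"jazz": 2.0, "1960s": 1.0, "all": 1.0}

p_s1_s2 = transition_prob(edges, weights, "s1", "s2")
ll = playlist_log_likelihood(edges, weights, ["s1", "s2", "s3"])
```

Raising the weight of an edge makes transitions inside that subset more likely, which is what fitting the weights to example playlists adjusts.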
Playlist model

[Figure: graphical model with nodes: exp. prior, edge weights, transitions, playlists]
Playlist generation: evaluation


• Setup:
 - Split playlist collection into train/test
 - Learn edge weights on training playlists
 - Evaluate average likelihood of test playlists


• Train per category, or all together

• Compare against uniform shuffle baseline
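
A sketch of the baseline comparison. It assumes that "uniform shuffle" means drawing distinct songs uniformly at random, and that the gain is reported relative to the magnitude of the baseline log-likelihood. Neither definition is given on the slides, and the numbers below are made up.

```python
import math


def uniform_shuffle_log_likelihood(playlist_length, n_songs):
    """Log-probability of one specific sequence of distinct songs drawn
    uniformly at random without replacement (assumed baseline)."""
    if playlist_length > n_songs:
        raise ValueError("playlist cannot be longer than the song collection")
    return -sum(math.log(n_songs - t) for t in range(playlist_length))


def relative_gain(model_ll, baseline_ll):
    """Improvement over the baseline as a fraction of |baseline| (assumed)."""
    return (model_ll - baseline_ll) / abs(baseline_ll)


baseline_ll = uniform_shuffle_log_likelihood(playlist_length=3, n_songs=100)
model_ll = -12.0  # hypothetical test log-likelihood of a learned model
gain = relative_gain(model_ll, baseline_ll)
```

A positive gain means the learned model assigns the held-out playlist more probability than shuffling does.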
Random walk results

[Figure: bar chart of log-likelihood gain over random shuffle (x-axis 0% to 25%), global model vs. category-specific model, per playlist category]
Categories: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues
Stationary model results

[Figure: bar chart of log-likelihood gain over random shuffle (x-axis -15% to 20%), global model vs. category-specific model, per playlist category]
Categories: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues
Example playlists

 Rhythm & Blues
  70s & soul                Lyn Collins - Think
  Audio #14 & funk          Isaac Hayes - No Name Bar
  DECADE 1965 & soul        Michael Jackson - My Girl


 Electronic music
  Audio #11 & downtempo     Everything But The Girl - Blame
  DECADE 1990 & trip-hop    Massive Attack - Spying Glass
  Audio #11 & electronica   Björk - Hunter
Playlist generation summary


• Generative approach simplifies evaluation

• AOTM-2011 collection facilitates learning and evaluation

• Robust, efficient and transparent feature integration
The future
Directions for future work



• Audio features: coding, dynamics and rhythm

• Playlist models: mixtures, long-range interactions

• UI models: interactive, context-aware, diversity
Personalized recommendation
                   [M., Bertin-Mahieux, Ellis, & Lanckriet, 2012]

• The Million Song Dataset Challenge

• Listening histories for 1.1M users, 380K songs

• Task: personalized song recommendation
Conclusion


• MLR can optimize distance metrics for ranking, QBE retrieval

• Audio similarity can approximate a collaborative filter

• Generative playlist model integrates data, models dynamics


• User-centric evaluation makes it all possible
Thanks!
Metric partial order feature




 • Score is large when distances match ranking
Playlist weights: 6390 edges

[Figure: learned edge weights per playlist category, grouped by feature type]
Rows: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single Artist, Romantic, RoadTrip, Punk, Depression, Break Up, Narrative, Hip-hop, Sleep, Electronic music, Dance-house, Rhythm and Blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Reggae, Blues
Columns: Audio, CF, Era, Familiarity, Lyrics, Tags, Uniform

 • Audio & CF: k-means (16/64/256)
 • Era: year, decade, decade+5
 • Familiarity: high/med/low
 • Lyrics: LDA (k=32, top-1/3/5)
 • Tags: Last.fm top-10
 • Conjunctions

 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

More Like This: Machine Learning Approaches to Music Similarity

  • 1. More Like This: Machine Learning Approaches to Music Similarity Brian McFee Computer Science & Engineering University of California, San Diego
  • 2. Music discovery in days of yore...
  • 3. Music discovery 2.0: the present • ~20 million songs available • Discovery is still largely human-powered
  • 4. A Google for music?
  • 5. A Google for music? • Standard text search can work with meta-data • Can we predict meta-data from audio? ⁃ [Turnbull, 2008], [Barrington, 2011]
  • 6-8. Query by example • Natural, user-friendly alternative to text search
  • 9. This talk • Learning algorithms for query by example (QBE), geared toward music discovery • We'll look at two consumption models: - Active browsing (search & ranking) - Passive listening (playlist generation) • Evaluation derived from user behavior
  • 11. Defining similarity: semantics? Song similarity = tag similarity?
  • 12. Defining similarity: semantics? • Drawbacks: - Choosing, weighting vocabulary is surprisingly difficult - Hard to maintain quality at scale
  • 13. Defining similarity: human judgements? [M. & Lanckriet, 2009, 2011] • Which is more similar?
  • 14. Defining similarity: human judgements? [M. & Lanckriet, 2009, 2011] • Which is more similar? • Drawbacks: ambiguity, subjectivity, scale
  • 15. Collaborative filter similarity • Collect listening histories for (lots of!) users • Song similarity = portion of users in common
  • 16. Collaborative filter similarity • Collaborative filters perform well... - ... for tagging [Kim, Tomasik, & Turnbull, 2009] - ... and playlisting [Barrington, Oda, & Lanckriet, 2009] - ... and recommendation (Yahoo, Last.fm, iTunes...) • Implicit feedback requires no additional effort from users • ... but fails on unpopular items: the cold start problem!
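One way to read "portion of users in common" is as the Jaccard overlap between the sets of users who listened to each song. The sketch below assumes that reading; the function name `cf_similarity` and the toy listening histories are hypothetical. It also shows the cold start problem: a song with no listeners scores zero against everything.

```python
def cf_similarity(listeners_a, listeners_b):
    """Jaccard overlap between two sets of user IDs (an assumed definition)."""
    a, b = set(listeners_a), set(listeners_b)
    union = a | b
    if not union:
        return 0.0  # neither song has listeners: nothing to compare
    return len(a & b) / len(union)


# Hypothetical listening histories: song -> set of users who played it.
histories = {
    "song_a": {"u1", "u2", "u3"},
    "song_b": {"u2", "u3", "u4"},
    "new_song": set(),  # unpopular or brand-new item
}

sim_ab = cf_similarity(histories["song_a"], histories["song_b"])
sim_new = cf_similarity(histories["song_a"], histories["new_song"])
print(sim_ab, sim_new)
```

The zero score for `new_song` is what motivates predicting CF similarity from audio in the slides that follow.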
  • 17-19. Learning from a collaborative filter [M., Barrington, & Lanckriet, 2010, 2012]
  • 20. Metric learning to rank • The goal: rankings in audio space = rankings in CF space
  • 21. Metric learning to rank [M. & Lanckriet, 2010] • The goal: ranking by (learned) distance = target rankings
  • 22. Metric learning to rank [M. & Lanckriet, 2010] • The goal: ranking by (learned) distance = target rankings • Optimize a linear transformation for ranking
  • 23. Structure prediction: nearest neighbors • Setup: a database of items and target rankings • A PSD matrix transforms the features • Order database items by (learned) distance from the query
  • 24. Structure prediction: nearest neighbors • Setup: a database of items and target rankings • A PSD matrix transforms the features • Order database items by (learned) distance from the query • A feature map encodes each (query, ranking) pair
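Ranking a database by a distance that a PSD matrix defines can be sketched in a few lines. The symbols are assumptions, since the transcript lost the slide's notation: `W` is the PSD matrix, `q` the query, and the distance is the quadratic form (q - x)ᵀ W (q - x). The two example matrices are made up to show that changing `W` changes the ranking.

```python
def mahalanobis_sq(q, x, W):
    """Squared distance (q - x)^T W (q - x) for a PSD matrix W (list of rows)."""
    d = [qi - xi for qi, xi in zip(q, x)]
    Wd = [sum(W[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(di * wi for di, wi in zip(d, Wd))


def rank_by_distance(q, database, W):
    """Return database keys ordered from nearest to farthest under W."""
    return sorted(database, key=lambda name: mahalanobis_sq(q, database[name], W))


database = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [3.0, 3.0]}
q = [0.0, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]    # plain Euclidean distance
stretched = [[10.0, 0.0], [0.0, 0.1]]  # emphasizes the first feature

print(rank_by_distance(q, database, identity))
print(rank_by_distance(q, database, stretched))
```

Metric learning to rank searches over `W` so that the induced ordering agrees with the target rankings.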
  • 25. Metric learning to rank (MLR) • Score for target ranking > Score for any other ranking + Prediction error • Supported losses Δ: AUC, KNN, MAP, MRR, NDCG, Prec@k
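Two of the listed ranking measures, AUC and Prec@k, are simple to compute for a single ranked list. The sketch below uses their usual textbook definitions; the function names and the toy ranking are hypothetical, and this is not the MLR implementation itself.

```python
def ranking_auc(ranking, relevant):
    """Fraction of (relevant, irrelevant) pairs with the relevant item ranked first."""
    relevant = set(relevant)
    n_rel = sum(1 for item in ranking if item in relevant)
    n_irr = len(ranking) - n_rel
    if n_rel == 0 or n_irr == 0:
        raise ValueError("AUC needs at least one relevant and one irrelevant item")
    correct = 0
    irrelevant_seen = 0
    for item in ranking:
        if item in relevant:
            # This item precedes every irrelevant item not yet seen.
            correct += n_irr - irrelevant_seen
        else:
            irrelevant_seen += 1
    return correct / (n_rel * n_irr)


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k items that are relevant."""
    relevant = set(relevant)
    return sum(1 for item in ranking[:k] if item in relevant) / k


ranking = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(ranking_auc(ranking, relevant), precision_at_k(ranking, relevant, 2))
```

A prediction error Δ can then be taken as one minus such a score, which is zero only when the predicted ranking is as good as the target.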
  • 26. MLR solver • Cutting-plane algorithm based on 1-slack Structural SVM [Joachims, et al. 2009] • Repeat until convergence: - Constraint generation - Semi-definite programming (SDP)
  • 27. MLR solver • Cutting-plane algorithm based on 1-slack Structural SVM [Joachims, et al. 2009] • Repeat until convergence: - Constraint generation - Semi-definite programming (SDP) - Sequence of QPs
  • 28. MLR solver • Cutting-plane algorithm based on 1-slack Structural SVM [Joachims, et al. 2009] • Repeat until convergence: - Constraint generation - Semi-definite programming (SDP) - Sequence of QPs • Multiple kernel extensions: [Galleguillos, M., Belongie, & Lanckriet 2011]
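In generic 1-slack structural SVM notation, the problem that such a cutting-plane loop solves can be written roughly as below. All symbols are assumptions made for illustration, because the transcript does not preserve the slide's notation: $W$ is the PSD metric, $\psi$ the (query, ranking) feature map, $y_i^*$ the target ranking for query $q_i$, $\Delta$ the ranking loss, $\xi$ the single slack variable, $C$ a trade-off constant, and the trace regularizer is likewise assumed.

```latex
\min_{W \succeq 0,\; \xi \ge 0} \quad \operatorname{tr}(W) + C\,\xi
\quad \text{s.t.} \quad
\forall (y_1, \dots, y_n):\;
\frac{1}{n} \sum_{i=1}^{n}
  \big\langle W,\; \psi(q_i, y_i^*) - \psi(q_i, y_i) \big\rangle_F
\;\ge\;
\frac{1}{n} \sum_{i=1}^{n} \Delta(y_i^*, y_i) \;-\; \xi
```

Under this reading, constraint generation finds the tuple of rankings that most violates the inequality for the current $W$, and the optimization step re-solves for $W$ over the constraints collected so far.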
  • 29. Audio pipeline • Audio signal
  • 30. Audio pipeline • Audio signal → 1. Feature extraction → Bag of ΔMFCCs
  • 31. Audio pipeline • Audio signal → 1. Feature extraction → Bag of ΔMFCCs → 2. Vector quantization → Codeword histogram
  • 32. Audio pipeline • Audio signal → 1. Feature extraction → Bag of ΔMFCCs → 2. Vector quantization → Codeword histogram → 3. Probability product kernel → PPK
  • 33. Audio pipeline • Audio signal → PPK (features) → MLR • CF similarity → MLR (supervision)
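Steps 2 and 3 of the pipeline can be sketched without any audio library. Each frame-level feature vector is assigned to its nearest codeword, the assignments are counted into a normalized histogram, and two histograms are compared with a probability product kernel. The exponent 1/2 (the Bhattacharyya form) is an assumption, and the tiny two-codeword codebook and frame values are made up; the evaluation in the talk uses 1024 codewords.

```python
import math


def quantize(frames, codebook):
    """Map each frame to the index of its nearest codeword (squared Euclidean)."""
    def nearest(frame):
        return min(
            range(len(codebook)),
            key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
        )
    return [nearest(frame) for frame in frames]


def codeword_histogram(frames, codebook):
    """Normalized count of how often each codeword is used by a song."""
    if not frames:
        raise ValueError("a song needs at least one frame")
    counts = [0] * len(codebook)
    for k in quantize(frames, codebook):
        counts[k] += 1
    return [c / len(frames) for c in counts]


def ppk(p, q):
    """Probability product kernel with exponent 1/2 (assumed): sum of sqrt(p_i * q_i)."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


codebook = [[0.0, 0.0], [1.0, 1.0]]  # hypothetical codewords
song_a = [[0.1, 0.0], [0.9, 1.0], [0.0, 0.2], [0.1, 0.1]]
song_b = [[1.0, 1.0], [0.8, 0.9]]

hist_a = codeword_histogram(song_a, codebook)
hist_b = codeword_histogram(song_b, codebook)
print(hist_a, hist_b, ppk(hist_a, hist_b))
```

With this exponent the kernel of a histogram with itself is 1, so the value can be read as a similarity between 0 and 1.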
  • 34. Evaluation: CAL10K • Last.fm collaborative filter [Celma, 2008] - 360K users, 186K artists • CAL10K songs [Tingle, Turnbull, & Kim, 2010] - 5.4K songs, 2K artists (after CF matching)
  • 35. Evaluation: CAL10K • Last.fm collaborative filter [Celma, 2008] - 360K users, 186K artists • CAL10K songs [Tingle, Turnbull, & Kim, 2010] - 5.4K songs, 2K artists (after CF matching) • Evaluation: - Split artists into train/val/test - Target rankings: top-10 most similar train artists
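Splitting by artist rather than by song keeps every artist's songs inside a single fold, so test queries never share an artist with the training set. A minimal sketch follows; the 60/20/20 proportions, the seed, and the toy song-to-artist map are assumptions, since the slide does not give them.

```python
import random


def split_by_artist(song_artist, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign whole artists to train/val/test, then place songs with their artist."""
    artists = sorted(set(song_artist.values()))
    random.Random(seed).shuffle(artists)
    n_train = int(fractions[0] * len(artists))
    n_val = int(fractions[1] * len(artists))
    groups = {
        "train": artists[:n_train],
        "val": artists[n_train:n_train + n_val],
        "test": artists[n_train + n_val:],
    }
    fold_of = {a: fold for fold, members in groups.items() for a in members}
    splits = {"train": [], "val": [], "test": []}
    for song, artist in sorted(song_artist.items()):
        splits[fold_of[artist]].append(song)
    return splits


# Hypothetical song -> artist map.
songs = {"s1": "A", "s2": "A", "s3": "B", "s4": "C", "s5": "D", "s6": "E"}
splits = split_by_artist(songs)
print(splits)
```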
  • 36. Evaluation: comparison • Gaussian mixture models + KL divergence - 8-component, diagonal-covariance GMM per song • Auto-tags: predict 149 semantic tags from audio [Turnbull, 2008] • [Our method] VQ+MLR: 1024 codewords • Expert tags: 1053 tags from Pandora [Tingle, et al., 2009]
  • 37. Similarity learning: results [Figure: AUC (axis 0.65 to 0.95) for GMM (KL), Auto-tags, Auto-tags + MLR, Audio VQ, Audio VQ + MLR, Expert tags (cos), and Expert tags + MLR]
  • 38. Example playlists • The Ramones - Go Mental; Def Leppard - Promises; The Buzzcocks - Harmony In My Head; Los Lonely Boys - Roses; Wolfmother - Colossal; Judas Priest - Diamonds and Rust (live)
  • 39. Example playlists • The Ramones - Go Mental; Def Leppard - Promises; The Buzzcocks - Harmony In My Head; Los Lonely Boys - Roses; Wolfmother - Colossal; Judas Priest - Diamonds and Rust (live) • MLR: The Buzzcocks - Harmony In My Head; Mötley Crüe - Same Ol' Situation; The Offspring - Gotta Get Away; The Misfits - Skulls; AC/DC - Who Made Who (live)
  • 40. Example playlists • Fats Waller - Winter Weather; Dizzy Gillespie - She's Funny That Way; Enrique Morente - Solea; Chet Atkins - In the Mood; Rachmaninov - Piano Concerto #4; Eluvium - Radio Ballet
  • 41. Example playlists • Fats Waller - Winter Weather; Dizzy Gillespie - She's Funny That Way; Enrique Morente - Solea; Chet Atkins - In the Mood; Rachmaninov - Piano Concerto #4; Eluvium - Radio Ballet • Chet Atkins - In the Mood; Charlie Parker - What Is This Thing Called Love?; Bud Powell - Oblivion; Bob Wills & His Texas Playboys - Lyla Lou; Bob Wills & His Texas Playboys - Sittin' On Top of the World
  • 42. Scaling up: fast retrieval [M. & Lanckriet, 2011] • Audio similarity search for a million songs? • Idea: Index data with spatial trees • 100-NN search over 900K songs: - Brute force: 2.4s - 50% recall: 0.14s (17x speedup) - 20% recall: 0.02s (120x speedup)
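The trade-off between recall and speed can be illustrated with a very small spatial tree. The sketch below builds a KD-style tree with median splits, searches only the single leaf the query falls into, and measures recall against brute force. The tree variant, leaf size, and random 2-D data are all assumptions; the slide does not say which spatial trees were used.

```python
import random


def build_tree(ids, points, leaf_size=8, depth=0):
    """Recursively split point IDs at the median of one coordinate per level."""
    if len(ids) <= leaf_size:
        return {"leaf": list(ids)}
    dim = depth % len(points[ids[0]])
    ordered = sorted(ids, key=lambda i: points[i][dim])
    mid = len(ordered) // 2
    return {
        "dim": dim,
        "split": points[ordered[mid]][dim],
        "left": build_tree(ordered[:mid], points, leaf_size, depth + 1),
        "right": build_tree(ordered[mid:], points, leaf_size, depth + 1),
    }


def candidates(tree, q):
    """Descend to the one leaf containing q and return its point IDs."""
    while "leaf" not in tree:
        tree = tree["left"] if q[tree["dim"]] < tree["split"] else tree["right"]
    return tree["leaf"]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(q, ids, points, k):
    """Brute-force k nearest neighbors of q among the given IDs."""
    return sorted(ids, key=lambda i: sq_dist(q, points[i]))[:k]


rng = random.Random(0)
points = [[rng.random(), rng.random()] for _ in range(200)]
tree = build_tree(list(range(200)), points)

q = [0.5, 0.5]
leaf = candidates(tree, q)
approx = knn(q, leaf, points, 5)        # searches only one leaf
exact = knn(q, range(200), points, 5)   # searches everything
recall = len(set(approx) & set(exact)) / 5
print(len(leaf), recall)
```

Searching one leaf touches a handful of points instead of all 200, and the price is that some true neighbors across a split boundary can be missed, which is the recall figure quoted on the slide.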
  • 43. Similarity learning: summary • Collaborative filters provide user-centric music similarity • CF similarity can be approximated by audio features • Audio search can be done quickly at large scale
  • 45. Playlist generation • Goal: generate a "good" song sequence - Music auto-pilot (given context) • Many existing algorithms, but no standard evaluation • What makes one algorithm better than another?
  • 46. Playlist evaluation 1: Human survey • Idea: generate playlists, ask for opinions • Impractical at large scale: - Huge search space - User taste, expertise can be problematic - Slow, expensive • Does not facilitate rapid evaluation and optimization
  • 47. Playlist evaluation 2: Information retrieval • Idea: - Define "good" and "bad" playlists - Predict the next song, measure accuracy • But what makes a bad playlist? • Do users agree on good/bad?
  • 48. A generative approach [M. & Lanckriet, 2011b] • Playlist algorithm = distribution over playlists • Don't evaluate synthetic playlists • Do evaluate the likelihood of generating real playlists
  • 49. The playlist collection: AOTM-2011 • Art of the Mix - 13 years of playlists - ~210K playlist segments - ~100K songs from MSD • Top 25 playlist categories: - Genre: Punk, Hip-hop, Reggae... - Context: Road trip, Break-up, Sleep... - Other: Mixed genre, Alternating DJ...
  • 50. A simple playlist model 1. Start with a set of songs
  • 51. A simple playlist model 2. Select a subset (e.g., jazz songs)
  • 52. A simple playlist model 3. Select a song
  • 53-54. A simple playlist model 4. Select a new subset
  • 55. A simple playlist model 5. Select a new song
  • 56-57. A simple playlist model 6. Repeat...
  • 58. Connecting the dots... • Random walk on a hypergraph - Vertices = songs - Edges = subsets • Edges derived from: - Audio clusters, tags, lyrics, era, popularity, CF - or combinations/intersections • Goal: optimize edge weights from example playlists
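The random walk and its likelihood-based evaluation can be sketched as follows. The transition rule is an assumption: from the current song, pick an edge containing it with probability proportional to its weight, then pick the next song uniformly from that edge (self-transitions are allowed here for simplicity). The edge names, weights, and songs are made up, and the first song is drawn the same way over all edges.

```python
import math


def transition_prob(cur, nxt, edges, weights):
    """P(next | current): weighted choice of an edge containing cur, then uniform song."""
    containing = [e for e in edges if cur in edges[e]]
    total = sum(weights[e] for e in containing)
    if total == 0:
        return 0.0
    return sum(weights[e] / total / len(edges[e]) for e in containing if nxt in edges[e])


def initial_prob(song, edges, weights):
    """P(first song): weighted choice over all edges, then uniform song."""
    total = sum(weights.values())
    return sum(weights[e] / total / len(m) for e, m in edges.items() if song in m)


def log_likelihood(playlist, edges, weights):
    """Log-probability that the walk generates the given playlist."""
    probs = [initial_prob(playlist[0], edges, weights)]
    probs += [transition_prob(a, b, edges, weights) for a, b in zip(playlist, playlist[1:])]
    if any(p == 0.0 for p in probs):
        return float("-inf")
    return sum(math.log(p) for p in probs)


# Hypothetical hypergraph: edge name -> set of songs.
edges = {"jazz": {"a", "b"}, "1950s": {"b", "c"}, "uniform": {"a", "b", "c", "d"}}
weights = {"jazz": 2.0, "1950s": 1.0, "uniform": 1.0}
shuffle = {"jazz": 0.0, "1950s": 0.0, "uniform": 1.0}  # uniform shuffle baseline

playlist = ["a", "b", "c"]
ll_model = log_likelihood(playlist, edges, weights)
ll_shuffle = log_likelihood(playlist, edges, shuffle)
print(ll_model, ll_shuffle)
```

Learning then amounts to choosing edge weights that raise the likelihood of real playlists, and putting all weight on the all-songs edge recovers the uniform shuffle used as a baseline.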
  • 59. Playlist model [Diagram: exp. prior → edge weights → transitions → playlists]
  • 60. Playlist generation: evaluation • Setup: - Split playlist collection into train/test - Learn edge weights on training playlists - Evaluate average likelihood of test playlists • Train per category, or all together • Compare against uniform shuffle baseline
  • 61. Random walk results [Figure: log-likelihood gain over random shuffle (axis 0% to 25%), global model vs. category-specific model, for categories ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues]
  • 62. Stationary model results [Figure: log-likelihood gain over random shuffle (axis -15% to 20%), global model vs. category-specific model, for the same categories]
  • 63. Example playlists • Rhythm & Blues: [70s & soul] Lyn Collins - Think; [Audio #14 & funk] Isaac Hayes - No Name Bar; [DECADE 1965 & soul] Michael Jackson - My Girl • Electronic music: [Audio #11 & downtempo] Everything But The Girl - Blame; [DECADE 1990 & trip-hop] Massive Attack - Spying Glass; [Audio #11 & electronica] Björk - Hunter
  • 64. Playlist generation summary • Generative approach simplifies evaluation • AOTM-2011 collection facilitates learning and evaluation • Robust, efficient and transparent feature integration
  • 66. Directions for future work • Audio features: coding, dynamics and rhythm • Playlist models: mixtures, long-range interactions • UI models: interactive, context-aware, diversity
  • 67. Personalized recommendation [M., Bertin-Mahieux, Ellis, & Lanckriet, 2012] • The Million Song Dataset Challenge • Listening histories for 1.1M users, 380K songs • Task: personalized song recommendation
  • 68. Conclusion • MLR can optimize distance metrics for ranking, QBE retrieval • Audio similarity can approximate a collaborative filter • Generative playlist model integrates data, models dynamics • User-centric evaluation makes it all possible
  • 71. Metric partial order feature • Score is large when distances match ranking
  • 72. Playlist weights: 6390 edges [Figure: edge weights by edge type (Audio, CF, Era, Familiarity, Lyrics, Tags, Uniform) for categories ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single Artist, Romantic, RoadTrip, Punk, Depression, Break Up, Narrative, Hip-hop, Sleep, Electronic music, Dance-house, Rhythm and Blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Reggae, Blues] • Audio & CF: k-means (16/64/256) • Lyrics: LDA (k=32, top-1/3/5) • Era: year, decade, decade+5 • Tags: Last.fm top-10 • Familiarity: high/med/low • Conjunctions