6. HOW DOES IT WORK?
• manually annotated metadata
• 5 music experts at Aristo Music and different consultants
• almost 80,000 songs
• but not enough...
7. PROBLEMS
• satisfying the music choice of all customers
• retail and catering customers differ from you and me!
• new markets
• react quickly to emerging music trends
• adding the full Belgian library catalog
8. GENERATE THE METADATA
• from different sources:
• the audio signal
• web sources
• the Aristo database
• attention metadata
• using our metadata generation framework: SamgI
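The multi-source setup above could be sketched as a pluggable framework in the spirit of SamgI: each source implements the same interface and the framework merges their suggestions per song. All class and field names here are illustrative assumptions, not the actual SamgI API.

```python
# Hypothetical sketch of a pluggable metadata-generation framework:
# every source implements generate(), and the framework merges the
# per-source suggestions into one metadata dict for a song.

class MetadataSource:
    def generate(self, song):
        """Return a dict of metadata fields suggested by this source."""
        raise NotImplementedError

class WebSource(MetadataSource):
    def generate(self, song):
        # e.g. derive genre from web co-occurrence counts (later slides)
        return {"genre": "Jazz"}

class DatabaseSource(MetadataSource):
    def generate(self, song):
        # e.g. look up manually annotated metadata in the Aristo database
        return {"mood": "calm"}

class MetadataFramework:
    def __init__(self, sources):
        self.sources = sources

    def annotate(self, song):
        metadata = {}
        for source in self.sources:
            metadata.update(source.generate(song))  # later sources win ties
        return metadata

framework = MetadataFramework([WebSource(), DatabaseSource()])
print(framework.annotate("Take Five"))  # {'genre': 'Jazz', 'mood': 'calm'}
```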
9. GENRE...
• our master's thesis student looked at different ways to generate genre metadata...
10. ONE APPROACH...
• M. Schedl, T. Pohle, P. Knees, G. Widmer, "Assigning and Visualizing Music Genres by Web-based Co-occurrence Analysis", Proceedings of the 7th International Conference on Music Information Retrieval, 2006, pp. 260-265.
• G. Geleijnse, J. Korst, "Web-based Artist Categorization", Proceedings of the 7th International Conference on Music Information Retrieval, 2006, pp. 266-271.
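The core idea of these co-occurrence approaches can be sketched in a few lines: ask a search engine for the page count of the artist name together with each genre term (the "MG" schema appends "music"; the "MS" schema uses style terms instead) and assign the genre with the highest count. The `page_count` function below is stubbed with made-up numbers; a real implementation would call a search engine API.

```python
# Minimal sketch of web-based co-occurrence genre assignment: classify an
# artist as the genre whose query '"<artist>" "<genre>" music' returns the
# most pages. FAKE_COUNTS replaces the actual search engine call.

GENRES = ["Blues", "Country", "Electronic", "Folk", "Jazz",
          "Metal", "Rap", "Reggae", "RnB"]

FAKE_COUNTS = {  # illustrative numbers, not real search results
    ("Robert Johnson", "Blues"): 91000,
    ("Robert Johnson", "Jazz"): 12000,
    ("Peter Tosh", "Reggae"): 88000,
}

def page_count(artist, genre):
    # stand-in for a Google/Yahoo!/Live! Search API call returning the
    # estimated number of result pages for the co-occurrence query
    return FAKE_COUNTS.get((artist, genre), 100)

def classify(artist):
    # pick the genre with the highest co-occurrence page count
    return max(GENRES, key=lambda g: page_count(artist, g))

print(classify("Robert Johnson"))  # Blues
print(classify("Peter Tosh"))      # Reggae
```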
20. RESULTS
• the master's thesis student's results were much worse
• what happened?
• did Google's search result counts change?
• does the Google Search API return different results?
• is the student's implementation correct?
21. HOW TO EVALUATE THIS?
• re-run the original experiment
• evaluate on the same data set: 1995 artists and 9 genres.
• different search engines: Google, Yahoo! and Live! Search.
• over time: 8 times over a period of 36 days.
22. THE DATA SET
Blues Country Electronic
Folk Jazz Metal
Rap Reggae RnB
23. THE DATA SET
Blues Country Electronic
Folk Jazz Metal
Rap Reggae RnB
[pie chart: share of artists per genre — 41%, 13%, 12%, 10%, 9%, 5%, 4%, 3%, 2%]
24. THE DATA SET
Blues Country Electronic
Folk Jazz Metal
Rap Reggae RnB
29. MORE FINE-GRAINED...
• 18 artists
• more search engines: Google.co.uk/.fr/.be, uk/fr.search.yahoo.com
• twice a day for 53 days
• 250,000 queries!
30. 2 Pac Rap
Alan Lomax Folk
Art Pepper Jazz
Cradle of Filth Metal
David Parsons Electronic
Desmond Dekker Reggae
Downpour Metal
Ice-T Rap
Jerry Butler RnB
Joy Lynn White Country
Louisiana Red Blues
Lou Rawls RnB
LTJ Bukem Electronic
Peter Tosh Reggae
Pinetop Smith Jazz
Robert Johnson Blues
Roy Rogers Country
Steeleye Span Folk
35. WHAT TO USE?
• use Google when it's stable, else rely on Yahoo!
• when is it stable? test with a small set
• some artists get classified incorrectly on bad days
• compare the accuracy achieved on the test set to the average
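The fallback heuristic on this slide could look like the sketch below (my assumed reading of it): classify a small, known test set first, and trust Google's counts only on days when its test-set accuracy stays close to the historical average, otherwise fall back to Yahoo!. The `tolerance` parameter and function names are hypothetical.

```python
# Sketch of the "use Google when it's stable" heuristic: probe with a
# small labelled test set before classifying the full catalog.

def accuracy(classify_with, test_set):
    # fraction of (artist, genre) pairs the classifier gets right
    correct = sum(1 for artist, genre in test_set
                  if classify_with(artist) == genre)
    return correct / len(test_set)

def pick_engine(google_classify, yahoo_classify, test_set,
                historical_avg, tolerance=0.1):
    # Google is preferred, but only on days it looks stable: its test-set
    # accuracy must not drop far below the running average.
    if accuracy(google_classify, test_set) >= historical_avg - tolerance:
        return "Google"
    return "Yahoo!"

test_set = [("Robert Johnson", "Blues"), ("Peter Tosh", "Reggae")]
good_day = lambda artist: dict(test_set)[artist]  # always right
bad_day = lambda artist: "Metal"                  # always wrong

print(pick_engine(good_day, None, test_set, historical_avg=0.9))  # Google
print(pick_engine(bad_day, None, test_set, historical_avg=0.9))   # Yahoo!
```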
36. CONCLUSION
• still works after 3 years
• Google -> Yahoo! -> Live! Search
• why does Google fluctuate?
• a generic version of an all-purpose classifier is implemented in the metadata generation framework
37. FUTURE WORK
• understand the performance
differences of regional search
engines
• use alternative search engines
• tweak
the genre taxonomy
depending on the search engine
SamgI: NOT the Southern African Media and Gender Institute.
1. MG is better than MS; a possible explanation is that, for music, style is a broader term than genre
2. Google outperforms Yahoo! & Live!
3. results fluctuate over time
4. technical issues with Yahoo!: only a fraction of the artists is retrieved
1. the accuracy is not exactly the same as for the large data set, but the overall trends are similar.
2. the MG schema is still more accurate
3. Yahoo! MG is very stable
4. Live! is still the worst and Google the best!
1. Yahoo! is very stable
2. Live is the worst, Google the best!
3. no noticeable differences between Live! and Bing. Bing was launched on 3 June.
4. on 29 July, a collaboration between Bing and Yahoo! was announced
1. .com performs best! -> co.uk -> fr -> be
2. fr and be are worse: maybe because the genre names are in English
3. one could also check whether local artists are classified better
legend: correct = light, incorrect = dark
1. Yahoo! is the most stable
2. Google changes most often.
3. changing from correct to incorrect occurs most often, but with no clear pattern
4. Live! seems to struggle with the same artists: one time it classifies them correctly, the next time wrongly.