2. Me
Background in Computer Science
Master's in Music Technology, McGill
Online
http://github.com/alastair (20 of 28 repositories are music-related; 11 are in Python)
http://twitter.com/alastairporter
3. Python as a go-to language
Quick for prototyping
Use the same code in a production release
Very handy for API access (thin wrapper around urllib2)
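To illustrate the "thin wrapper" idea, here is a minimal sketch. It uses Python 3's urllib.request (the successor to urllib2). The class name, the base URL, and the `fetch` parameter are all hypothetical and do not come from any real service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


class ApiClient:
    """Hypothetical thin wrapper around a JSON web service."""

    def __init__(self, base_url, fetch=None):
        self.base_url = base_url.rstrip("/")
        # `fetch` takes a URL and returns raw bytes. It is an assumption
        # added here so the HTTP call can be swapped out (e.g. in tests).
        self._fetch = fetch or self._http_get

    @staticmethod
    def _http_get(url):
        # The only place that touches the network.
        with urlopen(url, timeout=10) as response:
            return response.read()

    def build_url(self, path, **params):
        # Sort parameters so the same request always yields the same URL.
        query = urlencode(sorted(params.items()))
        url = "%s/%s" % (self.base_url, path.lstrip("/"))
        return url + ("?" + query if query else "")

    def get(self, path, **params):
        # Fetch the URL and decode the JSON body into Python objects.
        body = self._fetch(self.build_url(path, **params))
        return json.loads(body.decode("utf-8"))
```

Because the wrapper is this small, the same class can serve both a quick prototype and a production release.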
5. Music and Metadata
The problem:
People are really bad at naming music
Inconsistent over releases
The solution:
Crowdsourcing
Get info from as many trusted sources as possible
Make renaming take no effort
14. Identification strategy
If there’s a CD TOC, use that (MusicBrainz lookup)
If no match, use audio fingerprinting
If no match, do a text lookup (artist/album)
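The fallback order above can be sketched as a chain of lookups tried in turn. This is only an illustration: the three lookup functions are stand-ins that read from a dictionary, not the real TOC, fingerprint, or text lookups.

```python
def identify(album, strategies):
    """Try each (name, lookup) pair in order and return the first match.

    Each lookup is a hypothetical callable that takes the album and
    returns a release identifier, or None when it finds no match.
    """
    for name, lookup in strategies:
        result = lookup(album)
        if result is not None:
            return name, result
    return None, None


# Stand-in lookups for illustration only; real ones would query a service.
def lookup_by_toc(album):
    return album.get("toc_match")


def lookup_by_fingerprint(album):
    return album.get("fingerprint_match")


def lookup_by_text(album):
    return album.get("text_match")


STRATEGIES = [
    ("toc", lookup_by_toc),
    ("fingerprint", lookup_by_fingerprint),
    ("text", lookup_by_text),
]
```

Ordering the list from most to least reliable source means a cheaper, more exact match always wins over a fuzzier one.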
15. Fingerprinting
Converts an audio signal to a short sequence of numbers
Smaller to compare than an entire file
Perceptual features rather than byte comparison (works with different encodings)
16. Identification strategy
Fingerprinting gives us a set of candidate tracks
A track could be on many albums (original release, best of, mix album)
Keep a list of what tracks we have for each album
Once we fill all the slots for an album, success!
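The slot-filling idea above can be sketched as follows. The data shapes are assumptions: each file maps to a list of (album, track number) candidates, and the number of tracks on each album is known in advance. The function name is hypothetical.

```python
from collections import defaultdict


def find_complete_album(file_candidates, album_track_counts):
    """Return (album_id, {track_number: filename}) for the first album
    whose slots are all filled, or None if no album is complete.

    file_candidates: {filename: [(album_id, track_number), ...]}
    album_track_counts: {album_id: total number of tracks on that album}
    """
    slots = defaultdict(dict)
    for filename, candidates in file_candidates.items():
        for album_id, track_number in candidates:
            # Keep the first file seen for each slot.
            slots[album_id].setdefault(track_number, filename)
            if len(slots[album_id]) == album_track_counts[album_id]:
                return album_id, dict(slots[album_id])
    return None
```

For example, three files that each match a track on a three-track original release complete that album, even if some of them also appear on a longer best-of or mix album whose slots stay mostly empty.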
17. Metadata strategy
Text information from MusicBrainz
Genre from last.fm
Image from Amazon (or folder.jpg)
MusicBrainz tells us where these are (don’t need to search)
Save in every file (Text is cheap)
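A rough sketch of how the sources above could be combined into one set of tags per file. The function name, parameter names, and tag keys are hypothetical, and the inputs are assumed to have been fetched already.

```python
def build_tags(musicbrainz_text, lastfm_genre=None,
               amazon_image=None, folder_image=None):
    """Merge metadata from several sources into one tag dictionary.

    musicbrainz_text: dict of text fields (artist, album, title, ...)
    lastfm_genre: genre string, if one was found
    amazon_image / folder_image: cover art bytes; the folder.jpg image
    is used only when no Amazon image is available
    """
    tags = dict(musicbrainz_text)  # copy so the caller's dict is untouched
    if lastfm_genre:
        tags["genre"] = lastfm_genre
    cover = amazon_image or folder_image
    if cover is not None:
        tags["cover"] = cover
    return tags
```

The resulting dictionary would then be written to every file of the album.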
18. Writing it all out
Custom MP3/ID3 writer
Ogg meta tags
FLAC meta tags
Name files
Artist/Artist - Year - Album/01 - Artist - Track
ReplayGain!
Be a good citizen: submit fingerprints to MusicBrainz
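The naming pattern "Artist/Artist - Year - Album/01 - Artist - Track" can be sketched as a small path builder. The function names, the file extension argument, and the rule for replacing path separators are assumptions added for illustration.

```python
import os


def safe(part):
    """Replace path separators so a name cannot create extra directories."""
    return part.replace("/", "_").replace(os.sep, "_").strip()


def track_path(artist, year, album, track_number, title, extension):
    """Build 'Artist/Artist - Year - Album/NN - Artist - Track.ext'."""
    directory = "%s - %s - %s" % (safe(artist), year, safe(album))
    filename = "%02d - %s - %s.%s" % (
        track_number, safe(artist), safe(title), extension)
    # Join with "/" so the result is the same on every platform.
    return "/".join([safe(artist), directory, filename])
```

Zero-padding the track number keeps files in album order when a file browser sorts them by name.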
19. What’s next
New version of MusicBrainz
New fingerprinter
More metadata
20. Thanks
More information:
MusicBrainz: http://musicbrainz.org
albumidentify: http://github.com/albumidentify/albumidentify
More fingerprinting: http://acoustid.org, http://echoprint.me
Last.fm