2. Me
Background in Computer Science
Master's in Music Technology at McGill
Online
http://github.com/alastair (20 of 28 repositories are music-related; 11 are in Python)
http://twitter.com/alastairporter
3. Python as a go-to language
Quick for prototyping
Use the same code in a production release
Very handy for API access (thin wrapper around urllib2)
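As a rough illustration of the "thin wrapper" idea, here is a minimal sketch. It uses Python 3's urllib.request (the successor to urllib2). The ApiClient class, its method names, and the injectable opener are all hypothetical, invented for this example rather than taken from any project mentioned in the talk.

```python
import json
import urllib.parse
import urllib.request


class ApiClient:
    """Hypothetical thin wrapper: build a URL, fetch it, decode the JSON body."""

    def __init__(self, base_url, opener=urllib.request.urlopen):
        self.base_url = base_url.rstrip("/")
        self._open = opener  # injectable so the client can be exercised offline

    def build_url(self, path, **params):
        # Sort the parameters so the same request always yields the same URL.
        query = urllib.parse.urlencode(sorted(params.items()))
        url = f"{self.base_url}/{path.lstrip('/')}"
        return f"{url}?{query}" if query else url

    def get(self, path, **params):
        # Fetch the URL and parse the response body as JSON.
        with self._open(self.build_url(path, **params)) as response:
            return json.loads(response.read().decode("utf-8"))
```

With a wrapper like this, a lookup reads as client.get("release", query="abbey road"), and the same few lines can serve both a quick prototype and a production release.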
5. Music and Metadata
The problem:
People are really bad at naming music
Inconsistent over releases
The solution:
Crowdsourcing
Get info from as many trusted sources as possible
Make renaming take no effort
14. Identification strategy
If there's a CD TOC, use that (MusicBrainz lookup)
If no match, use audio fingerprinting
If no match, do a text lookup (artist/album)
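The fallback order above can be sketched as a list of lookups tried in turn. The three lookup functions here are hypothetical stand-ins that just read from a dict; a real tool would query MusicBrainz or a fingerprint service instead.

```python
def identify(disc, lookups):
    """Try each lookup in order; return (method name, match) for the first hit."""
    for name, lookup in lookups:
        match = lookup(disc)
        if match is not None:
            return name, match
    return None, None


# Hypothetical stand-ins for the three lookups; each returns a match or None.
def lookup_by_toc(disc):
    return disc.get("toc_match")


def lookup_by_fingerprint(disc):
    return disc.get("fingerprint_match")


def lookup_by_text(disc):
    return disc.get("text_match")


# Cheapest and most reliable method first, free-text search last.
STRATEGY = [
    ("toc", lookup_by_toc),
    ("fingerprint", lookup_by_fingerprint),
    ("text", lookup_by_text),
]
```

Keeping the order in a list makes it easy to add or reorder methods without touching the loop.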
15. Fingerprinting
Converts an audio signal to a short sequence of numbers
Smaller to compare than an entire file
Perceptual features rather than byte comparison (works with different encodings)
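To make the idea concrete, here is a toy sketch. It is not the algorithm used by any real fingerprinter. It reduces a signal to one bit per pair of neighbouring frames (did the energy rise?), which is enough to show why a perceptual summary survives changes that would break a byte-for-byte comparison. The function names and sample values are invented for illustration.

```python
def toy_fingerprint(samples, frame_size=4):
    """Return one bit per pair of adjacent frames: 1 if energy rose, else 0."""
    frames = [
        samples[i:i + frame_size]
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    energies = [sum(s * s for s in frame) for frame in frames]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]


def hamming_distance(a, b):
    """Count the positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# The same "recording" at two volumes: every sample value differs,
# but the rise and fall of energy between frames is unchanged.
original = [0, 1, 0, -1, 3, -3, 3, -3, 1, 0, -1, 0, 5, 5, -5, -5]
quieter = [s * 0.5 for s in original]
```

Here toy_fingerprint(original) and toy_fingerprint(quieter) come out identical even though no sample value matches, and 16 samples shrink to 3 bits. Real fingerprinters use far richer spectral features, but the comparison step is similar in spirit: a small distance between short sequences instead of equality between files.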
16. Identification strategy
Fingerprinting gives us a set of candidate tracks
A track could be on many albums (original release, best of,
mix album)
Keep a list of what tracks we have for each album
Once we fill all the slots for an album, success!
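The slot-filling idea can be sketched as follows. This is a minimal illustration, assuming each fingerprint match yields (album, track position) pairs and that the track count of every candidate album is already known. The AlbumMatcher class and its method names are hypothetical.

```python
from collections import defaultdict


class AlbumMatcher:
    """Track which positions of each candidate album have a matching file."""

    def __init__(self, track_counts):
        # track_counts: album id -> number of tracks on that album (assumed known)
        self.track_counts = track_counts
        self.slots = defaultdict(dict)  # album id -> {track position: filename}

    def add_candidates(self, filename, candidates):
        """Record a file against every (album, position) it might belong to.

        Returns the albums whose slots are all filled after this file.
        """
        complete = []
        for album, position in candidates:
            # Keep the first file seen for a slot; later duplicates are ignored.
            self.slots[album].setdefault(position, filename)
            if len(self.slots[album]) == self.track_counts[album]:
                complete.append(album)
        return complete
```

A file that appears on both the original release and a best-of fills a slot in each. Whichever album has every slot filled first is the one the files belong to.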
17. Metadata strategy
Text information from MusicBrainz
Genre from last.fm
Image from Amazon (or folder.jpg)
MusicBrainz tells us where these are (don't need to search)
Save in every file (text is cheap)
18. Writing it all out
Custom MP3/ID3 writer
Ogg meta tags
FLAC meta tags
Name files
Artist/Artist - Year - Album/01 - Artist - Track
Replaygain!
Be a good citizen: submit fingerprints to MusicBrainz
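The naming scheme on this slide (Artist/Artist - Year - Album/01 - Artist - Track) could be built roughly like this. The function names, the file extension parameter, and the rule of replacing "/" with "_" are assumptions added for the example; the slide only gives the pattern.

```python
def sanitize(part):
    """Replace path separators so a name cannot create extra directories."""
    return part.replace("/", "_").strip()


def target_path(artist, year, album, track_number, title, extension):
    """Build Artist/Artist - Year - Album/NN - Artist - Track.ext."""
    artist, album, title = sanitize(artist), sanitize(album), sanitize(title)
    album_dir = f"{artist} - {year} - {album}"
    # Zero-pad the track number so files sort in playing order.
    filename = f"{track_number:02d} - {artist} - {title}.{extension}"
    return "/".join([artist, album_dir, filename])
```

For example, target_path("AC/DC", 1980, "Back in Black", 1, "Hells Bells", "mp3") gives "AC_DC/AC_DC - 1980 - Back in Black/01 - AC_DC - Hells Bells.mp3".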
19. What's next
New version of MusicBrainz
New fingerprinter
More metadata
More metadata
20. Thanks
More information:
MusicBrainz: http://musicbrainz.org
albumidentify:
http://github.com/albumidentify/albumidentify
More fingerprinting: http://acoustid.org, http://echoprint.me
Last.fm