SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Music retrieval technique
Shital Katkar
132011005
Index
What is QBH?
Basic Architecture
Application
Challenges
File Formats
System Architecture
Parson code algorithm
 Benchmarking MIR System
•“I don’t know the name. I don’t know who does it.
•But I can’t get this song out of my head.”
•Well, why not just hum it.
QBH System
Query By Humming
Basic Architecture
Microphone Extraction Transcription Comparison Result List
DB
Fig- Basic System Architecture
Applications
Shazam
•identify pre-recorded music being
broadcast from any source, such as a
radio, television
Applications
Sound Hound
•identify music by humming, singing
or playing a recorded track
Applications
Midomi
•identify music by
humming, singing or
playing a recorded track
Applications
Musipedia
identify music by whistling
a theme, playing it on a
virtual piano keyboard,
tapping the rhythm on
the computer keyboard
Challenges
• Users may not make perfect queries.
• Accurately capturing pitches and notes from user hums is
difficult, even if the user manages to submit a perfect
query.
• Similarly, accurately capturing melodic information from a
pre-recorded music file is difficult.
File Formats
Wav File Format
short form of the Wave Audio File Format
the most common use is to store an uncompressed audio
 quite large in size
first generation files of high quality
File Formats
MIDI File Format
Musical Instrument Digital Interface
MIDI files are not exactly the same as the typical digital audio formats we
use (like WAV, MP3, MP4 etc.)
a MIDI is made up of information that describes what musical notes are to
be played
MIDI Files therefore do not contain any 'real world' recordings
System Architecture
Parson Code
Algorithm
Algorithm
A note in the input is classified in one of three ways
1. U = "up," if the note is higher than the previous
note
2. D = "down," if the note is lower than the
previous note
3. r = "repeat," if the note is the same pitch as the
previous note
4. * = first tone as reference
Textual Pattern
C C G G GA A
U r
F F E E D D C
D r D r D r D* rUr D
72 72 79 79
81 81 79 77 77 76 76 7274 74
Introduction
Music Information Retrieval (MIR)
efficient content-based searching
retrieval of musical information
should be easily operated by users
should be controlled by a simple-to-use graphical 'musical' interface
MIR System
Problem Definition
Lot of MIR System
All have the same task- to enable users to search for music
Very few systems that are actually publicly accessible and comparable
Some System works only with MIDI representation, some with transcriptions
Each system has a different set of files available in its database
Music Information
Retrieval Methods
MIR Systems may be divided into two categories
1. those that search symbolic representations of music
MIDI files or Common Music Notation (CMN)
2. those that search raw audio files
WAV or mp3 file format
Symbolic representations
 consist of a list of instructions as to how the piece should be played
include the notes, when and for how long each is played
Typical Query –
Involve a search for files with a given sequence of notes
List of MIDI files from database
Music Information
Retrieval Methods
raw audio files
digital representations of an actual recording
contain a level of complexity that is not found in the symbolic representations
composition is contaminated by noise
Music Information
Retrieval Methods
Two Approaches
Extraction
involves finding certain features, such as the mean and variance of
audio signal
Transcription
convert the query into a symbolic representation
Music Information
Retrieval Methods
Online MIR Systems
CatFind
MelDex
MelodyHound
ThemeFinder
Music Retrieval Demo
CatFind
 search MIDI files using either a musical transcription or a melodic profile
based on the Parson’s Code
It has minimal features
intended primarily for demonstration
Online MIR Systems
MelDex
 MELody inDEX
allows searching of the New Zealand Digital Library
designed to retrieve melodies from a database on the basis of a few
notes sung into a microphone
It accepts acoustic input from the user, transcribes it into common music
notation, then searches a database for tunes that contain the sung
pattern, or patterns similar to it.
Retrieval is ranked according to the closeness of the match
Online MIR Systems
MelodyHound
developed by Rainer Typke in 1997
originally known as "Tuneserver"
It searches directly on the Parsons Code
was designed initially for Query By Whistling
return the song in the database that most closely matches
Online MIR Systems
Themefinder
created by David Huron
allows one to identify common themes in Western classical
music, Folksongs of the sixteenth century
Online MIR Systems
Music Retrieval Demo
 performs similarity searches on raw audio data (WAV files)
No transcription of any kind is applied
It works by calculating the distance between the selected file and all
other files in the database
Online MIR Systems
Comparison Of MIR Systems
Evaluation Issues
The coverage of the collection, that is, the extent to which the system
includes relevant matter.
The time lag, that is, the average interval between the time the search
request is made and the time an answer is given.
The recall of the system, that is, the proportion of relevant material actually
retrieved in answer to a search request
The precision of the system, that is, the proportion of retrieved material
that is actually relevant.
Conclusion
In this work, we have laid down a framework for benchmarking of future
MIR systems. There are only a handful of MIR systems available
online, each of which is quite limited in scope. Still, these benchmarking
techniques were applied to five online systems. Proposals were made
concerning future benchmarking of full online audio retrieval systems. It is
hoped that these recommendations will be considered and expanded
upon as such systems become available.
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

Lecture# 7 midi file format
Lecture#  7 midi file formatLecture#  7 midi file format
Lecture# 7 midi file format
Mr SMAK
 
Speech recognition final
Speech recognition finalSpeech recognition final
Speech recognition final
Arpit Kumar
 
video compression techique
video compression techiquevideo compression techique
video compression techique
Ashish Kumar
 
Face recognition technology - BEST PPT
Face recognition technology - BEST PPTFace recognition technology - BEST PPT
Face recognition technology - BEST PPT
Siddharth Modi
 

Was ist angesagt? (20)

Speech Recognition Using Python | Edureka
Speech Recognition Using Python | EdurekaSpeech Recognition Using Python | Edureka
Speech Recognition Using Python | Edureka
 
Sound digitalisation
Sound digitalisationSound digitalisation
Sound digitalisation
 
YUV, Y CB CR and Subsampling
YUV, Y CB CR and SubsamplingYUV, Y CB CR and Subsampling
YUV, Y CB CR and Subsampling
 
The role of NLP & ML in Cognitive System by Sunantha Krishnan
The role of NLP & ML in Cognitive System by Sunantha KrishnanThe role of NLP & ML in Cognitive System by Sunantha Krishnan
The role of NLP & ML in Cognitive System by Sunantha Krishnan
 
Multimedia content based retrieval in digital libraries
Multimedia content based retrieval in digital librariesMultimedia content based retrieval in digital libraries
Multimedia content based retrieval in digital libraries
 
Face Detection and Recognition System
Face Detection and Recognition SystemFace Detection and Recognition System
Face Detection and Recognition System
 
EMOTION DETECTION USING AI
EMOTION DETECTION USING AIEMOTION DETECTION USING AI
EMOTION DETECTION USING AI
 
Keras vs Tensorflow vs PyTorch | Deep Learning Frameworks Comparison | Edureka
Keras vs Tensorflow vs PyTorch | Deep Learning Frameworks Comparison | EdurekaKeras vs Tensorflow vs PyTorch | Deep Learning Frameworks Comparison | Edureka
Keras vs Tensorflow vs PyTorch | Deep Learning Frameworks Comparison | Edureka
 
Text similarity measures
Text similarity measuresText similarity measures
Text similarity measures
 
Lecture# 7 midi file format
Lecture#  7 midi file formatLecture#  7 midi file format
Lecture# 7 midi file format
 
Deep learning for real life applications
Deep learning for real life applicationsDeep learning for real life applications
Deep learning for real life applications
 
Speech recognition final
Speech recognition finalSpeech recognition final
Speech recognition final
 
video compression techique
video compression techiquevideo compression techique
video compression techique
 
Data Science - Part XVII - Deep Learning & Image Processing
Data Science - Part XVII - Deep Learning & Image ProcessingData Science - Part XVII - Deep Learning & Image Processing
Data Science - Part XVII - Deep Learning & Image Processing
 
Codecs
CodecsCodecs
Codecs
 
Aws web application architecture
Aws web application architectureAws web application architecture
Aws web application architecture
 
Face recognition technology - BEST PPT
Face recognition technology - BEST PPTFace recognition technology - BEST PPT
Face recognition technology - BEST PPT
 
MPEG/Audio Compression
MPEG/Audio CompressionMPEG/Audio Compression
MPEG/Audio Compression
 
Unit iv
Unit ivUnit iv
Unit iv
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 

Andere mochten auch (6)

Query By Humming - Music Retrieval Technique
Query By Humming - Music Retrieval TechniqueQuery By Humming - Music Retrieval Technique
Query By Humming - Music Retrieval Technique
 
CS158: Final Project
CS158: Final ProjectCS158: Final Project
CS158: Final Project
 
Ok shazam, "la la-lalaa"!
Ok shazam, "la la-lalaa"!Ok shazam, "la la-lalaa"!
Ok shazam, "la la-lalaa"!
 
Music Information Retrieval: Overview and Current Trends 2008
Music Information Retrieval: Overview and Current Trends 2008Music Information Retrieval: Overview and Current Trends 2008
Music Information Retrieval: Overview and Current Trends 2008
 
How To Do Music Recommendation
How To Do Music RecommendationHow To Do Music Recommendation
How To Do Music Recommendation
 
Neo4j: Graph-like power
Neo4j: Graph-like powerNeo4j: Graph-like power
Neo4j: Graph-like power
 

Ähnlich wie Query By humming - Music retrieval technology

How Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be UsedHow Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be Used
lahtrumpet
 
How Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be UsedHow Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be Used
lahtrumpet
 
survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...
Suraj Ligade
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
aciijournal
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
aciijournal
 
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
aciijournal
 
CSTalks - Music Information Retrieval - 23 Feb
CSTalks - Music Information Retrieval - 23 FebCSTalks - Music Information Retrieval - 23 Feb
CSTalks - Music Information Retrieval - 23 Feb
cstalks
 

Ähnlich wie Query By humming - Music retrieval technology (20)

Knn a machine learning approach to recognize a musical instrument
Knn  a machine learning approach to recognize a musical instrumentKnn  a machine learning approach to recognize a musical instrument
Knn a machine learning approach to recognize a musical instrument
 
web based music genre classification.pptx
web based music genre classification.pptxweb based music genre classification.pptx
web based music genre classification.pptx
 
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
 
Nithin Xavier research_proposal
Nithin Xavier research_proposalNithin Xavier research_proposal
Nithin Xavier research_proposal
 
AI&BigData Lab 2016. Игорь Костюк: Как приручить музыкальную рекомендательную...
AI&BigData Lab 2016. Игорь Костюк: Как приручить музыкальную рекомендательную...AI&BigData Lab 2016. Игорь Костюк: Как приручить музыкальную рекомендательную...
AI&BigData Lab 2016. Игорь Костюк: Как приручить музыкальную рекомендательную...
 
Fuzzy
FuzzyFuzzy
Fuzzy
 
Music data mining
Music  data miningMusic  data mining
Music data mining
 
How Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be UsedHow Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be Used
 
How Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be UsedHow Can The Essen Associative Code Be Used
How Can The Essen Associative Code Be Used
 
Music Objects to Social Machines
Music Objects to Social MachinesMusic Objects to Social Machines
Music Objects to Social Machines
 
Igor Kostiuk “Как приручить музыкальную рекомендательную систему”
Igor Kostiuk “Как приручить музыкальную рекомендательную систему”Igor Kostiuk “Как приручить музыкальную рекомендательную систему”
Igor Kostiuk “Как приручить музыкальную рекомендательную систему”
 
survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...
 
Bl ig2 url edit
Bl ig2 url editBl ig2 url edit
Bl ig2 url edit
 
Indexing and Retrieval of Audio
Indexing and Retrieval of AudioIndexing and Retrieval of Audio
Indexing and Retrieval of Audio
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
 
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
 
A Review on Song Recommendation Techniques
A Review on Song Recommendation TechniquesA Review on Song Recommendation Techniques
A Review on Song Recommendation Techniques
 
Music mobile
Music mobileMusic mobile
Music mobile
 
CSTalks - Music Information Retrieval - 23 Feb
CSTalks - Music Information Retrieval - 23 FebCSTalks - Music Information Retrieval - 23 Feb
CSTalks - Music Information Retrieval - 23 Feb
 

Mehr von Shital Kat (8)

Opinion Mining
Opinion MiningOpinion Mining
Opinion Mining
 
Introduction to HADOOP
Introduction to HADOOPIntroduction to HADOOP
Introduction to HADOOP
 
Big data processing using - Hadoop Technology
Big data processing using - Hadoop TechnologyBig data processing using - Hadoop Technology
Big data processing using - Hadoop Technology
 
School admission process management system (Documention)
School admission process management system (Documention)School admission process management system (Documention)
School admission process management system (Documention)
 
WiFi technology Writeup
WiFi technology WriteupWiFi technology Writeup
WiFi technology Writeup
 
Wifi Security
Wifi SecurityWifi Security
Wifi Security
 
WiFi part II
WiFi part IIWiFi part II
WiFi part II
 
WIFI Introduction (PART I)
WIFI Introduction (PART I)WIFI Introduction (PART I)
WIFI Introduction (PART I)
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Query By humming - Music retrieval technology

  • 2. Index What is QBH? Basic Architecture Application Challenges File Formats System Architecture Parson code algorithm  Benchmarking MIR System
  • 3. •“I don’t know the name. I don’t know who does it. •But I can’t get this song out of my head.” •Well, why not just hum it. QBH System Query By Humming
  • 4. Basic Architecture Microphone Extraction Transcription Comparison Result List DB Fig- Basic System Architecture
  • 5. Applications Shazam •identify pre-recorded music being broadcast from any source, such as a radio, television
  • 6. Applications Sound Hound •identify music by humming, singing or playing a recorded track
  • 7. Applications Midomi •identify music by humming, singing or playing a recorded track
  • 8. Applications Musipedia identify music by whistling a theme, playing it on a virtual piano keyboard, tapping the rhythm on the computer keyboard
  • 9. Challenges • Users may not make perfect queries. • Accurately capturing pitches and notes from user hums is difficult, even if the user manages to submit a perfect query. • Similarly, accurately capturing melodic information from a pre-recorded music file is difficult.
  • 10. File Formats Wav File Format short form of the Wave Audio File Format the most common use is to store an uncompressed audio  quite large in size first generation files of high quality
  • 11. File Formats MIDI File Format Musical Instrument Digital Interface MIDI files are not exactly the same as the typical digital audio formats we use (like WAV, MP3, MP4 etc.) a MIDI is made up of information that describes what musical notes are to be played MIDI Files therefore do not contain any 'real world' recordings
  • 13. Parson Code Algorithm Algorithm A note in the input is classified in one of three ways 1. U = "up," if the note is higher than the previous note 2. D = "down," if the note is lower than the previous note 3. r = "repeat," if the note is the same pitch as the previous note 4. * = first tone as reference
  • 14. Textual Pattern C C G G GA A U r F F E E D D C D r D r D r D* rUr D 72 72 79 79 81 81 79 77 77 76 76 7274 74
  • 15.
  • 16. Introduction Music Information Retrieval (MIR) efficient content-based searching retrieval of musical information should be easily operated by users should be controlled by a simple-to-use graphical 'musical' interface
  • 17. MIR System Problem Definition Lot of MIR System All have the same task- to enable users to search for music Very few systems that are actually publicly accessible and comparable Some System works only with MIDI representation, some with transcriptions Each system has a different set of files available in its database
  • 18. Music Information Retrieval Methods MIR Systems may be divided into two categories 1. those that search symbolic representations of music MIDI files or Common Music Notation (CMN) 2. those that search raw audio files WAV or mp3 file format
  • 19. Symbolic representations  consist of a list of instructions as to how the piece should be played include the notes, when and for how long each is played Typical Query – Involve a search for files with a given sequence of notes List of MIDI files from database Music Information Retrieval Methods
  • 20. raw audio files digital representations of an actual recording contain a level of complexity that is not found in the symbolic representations composition is contaminated by noise Music Information Retrieval Methods
  • 21. Two Approaches Extraction involves finding certain features, such as the mean and variance of audio signal Transcription convert the query into a symbolic representation Music Information Retrieval Methods
  • 23. CatFind  search MIDI files using either a musical transcription or a melodic profile based on the Parson’s Code It has minimal features intended primarily for demonstration Online MIR Systems
  • 24. MelDex  MELody inDEX allows searching of the New Zealand Digital Library designed to retrieve melodies from a database on the basis of a few notes sung into a microphone It accepts acoustic input from the user, transcribes it into common music notation, then searches a database for tunes that contain the sung pattern, or patterns similar to it. Retrieval is ranked according to the closeness of the match Online MIR Systems
  • 25. MelodyHound developed by Rainer Typke in 1997 originally known as "Tuneserver" It searches directly on the Parsons Code was designed initially for Query By Whistling return the song in the database that most closely matches Online MIR Systems
  • 26. Themefinder created by David Huron allows one to identify common themes in Western classical music, Folksongs of the sixteenth century Online MIR Systems
  • 27. Music Retrieval Demo  performs similarity searches on raw audio data (WAV files) No transcription of any kind is applied It works by calculating the distance between the selected file and all other files in the database Online MIR Systems
  • 28. Comparison Of MIR Systems
  • 29. Evaluation Issues The coverage of the collection, that is, the extent to which the system includes relevant matter. The time lag, that is, the average interval between the time the search request is made and the time an answer is given. The recall of the system, that is, the proportion of relevant material actually retrieved in answer to a search request The precision of the system, that is, the proportion of retrieved material that is actually relevant.
  • 30. Conclusion In this work, we have laid down a framework for benchmarking of future MIR systems. There are only a handful of MIR systems available online, each of which is quite limited in scope. Still, these benchmarking techniques were applied to five online systems. Proposals were made concerning future benchmarking of full online audio retrieval systems. It is hoped that these recommendations will be considered and expanded upon as such systems become available.

Hinweis der Redaktion

  1. When we don’t knw the songs name. We don’t knw who does the song. But that perticular song cannot goes out of our mind. In this situation QBH is a powerfull to for serching that song.
  2. The difference bet MP3 and wav is mp3 is compressed while wav file in uncompressed file.
  3. The difference bet MP3 and wav is mp3 is compressed while wav file in uncompressed file.
  4. Now you have some knowlegde of this notes and you knwo the parson code rules.So we will convert this song into a textuual pattern.The song is Twinkle twinkle little star. Notations are CC GG AA GFirst note is C . 72nd note We will make it as reference note. And put the *Second note is also C  Since it is repeating, we will put RNext is G  G note is upper than C so we will put U – U for upperFor second G  We put R
  5. Goal of this paper is to create an accurate and effective benchmarking system for music information retrieval (MIR) systems. This will serve the multiple purposes of inspiring the MIR community to add additional features and increased speed into existing projects, and to measure the performance of their work and incorporate the ideas of other works. To date, there has been no systematic rigorous review of the field, and thus there is little knowledge of when an MIR implementation might fail in a real world setting. Benchmarking MIR systems is currently hindered by the diversity of the systems, by their relatively new and unrefined nature, and by the limited number of accessible systems. Thus most of what will be described here will be introductory and will lay down the framework for future benchmarking and analysis. Particular attention will be paid to the evaluation issues surrounding retrieval of audio in test collections.
  6. The Music Information Retrieval (MIR) field is primarily concerned with efficient content-based searching and retrieval of musical information from online databases. The musical data may be stored in a variety of formats ranging from encoded scores to digital audio. MIR systems should be easily operated by users with a wide range of musical ability and understanding and should be controlled by a simple-to-use graphical 'musical' interface, both for search queries and for the presentation of results.
  7. There are a lot of MIR systems in various stages of development. These systems all have the same task- to enable users to search for music in a database. But there are very few systems that are actually publicly accessible and comparable, To date, there has been no formal analysis or quantitative comparison methodology (benchmark) of the available preliminary MIR systems. Some systems work only with MIDI representations, some with monophonic transcriptions, and some with scores. In addition, each system has a different set of files available in its database. Foreg, in MIDOMI.com it does not have bollywood songs To date, there is no online, publicly available system, that attempts to search for music based on polyphonic transcriptions. Thus, one goal of this work is to find ways by which these different systems can be compared. A benchmarking of MIR search engines will also provide an effective measure of the progress in the field.
  8. The symbolic representations typically consist of MIDI files or Common Music Notation (CMN)The raw audio files are typically WAV or mp3 file format.
  9. The symbolic representations usually consist of a list of instructions as to how the piece should be played. These include the notes, when and for how long each is played, the dynamics and the instruments that should be used. Other symbolic representations that may be searched include piano rolls and Parsons notation[3]. A typical query may involve a search for files with a given sequence of notes, and might produce a list of MIDI files from a database. Such queries are pertinent to musicians and musicologists who have a knowledge of musical representations.
  10. Essentially, they are digital representations of an actual recording. Thus they contain a level of complexity that is not found in the symbolic representations. The composition is contaminated by noise and incorporates slight variations in the timing and dynamics of the notes. By comparison, symbolic representations are ambiguous, since they often leave certain characteristics of the piece unspecified. Thus, two performances may have the same MIDI or CMN representations, but differ notably in their audio files.
  11. MIR systems that operate on audio files have followed two approaches, feature extraction[5] and transcription[6]. Feature extraction involves finding certain features, such as the mean and variance that typifies the audio or a portion thereof. The query and all files in the database are classified in terms of these parameters. Retrieval systems then operate as multidimensional searches on these parameters. Fast search methods have been described for such systems.[7] No attempt is made to relate these features to the musical qualities they might represent, e.g., energy to loudness, frequency to pitch. Transcription based raw audio MIR systems convert the query into a symbolic representation, and seek to match it against symbolic representations of the audio files in the database. Thus such a technique typically uses feature extraction as well, but then has an intermediate step attempting to relate these features to a description of the notes and instruments. This is an exceedingly difficult to task, and to date, no system achieves this effectively and accurately over a wide range of music.
  12. For the purposes of this work, we considered five online MIR systems. The systems considered all have certain properties in common. They may all be used online via the World Wide Web. They all are used by entering a query concerning a piece of music, and all may return information about music that matches that query. However, these systems differ greatly in their features, goals and implementation. These differences are discussed in detail below.
  13. CatFind[13] allows one to search MIDI files using either a musical transcription or a melodic profile based on the Parson’s Code. It has minimal features, and was intended primarily for demonstration. Although it seems unlikely that this system will be extended, it is still useful here as a system for comparison.
  14. This allows searching of the New Zealand Digital Library. Ie it only recongnize songs of new zealandcontry. The MELodyinDEX system[14, 15] is designed to retrieve melodies from a database on the basis of a few notes sung into a microphone. It accepts acoustic input from the user, transcribes it into common music notation, then searches a database for tunes that contain the sung pattern, or patterns similar to it.Thus the query is audio although the retrieved files are in symbolic representation. Retrieval is ranked according to the closeness of the match. A variety of different mechanisms are provided to control the search, depending on the precision of the input.
  15. MelodyHoundThis melody recognition system[16] was developed by Rainer Typke in 1997. It was originally known as "Tuneserver" and hosted by the university of Karlsruhe. It searches directly on the Parsons Code and was designed initially for Query By Whistling. That is, it will return the song in the database that most closely matches a whistled query.
  16. Themefinder[17], created by David Huron, et. al.,[18] allows one to identify common themes in Western classical music, Folksongs, and latin Motets of the sixteenth century. Themefinder provides a web-based interface to the Humdrum thema command[19], which in turn allows searching of databases containing musical themes or incipits (opening note sequences). Themes and incipits available through Themefinder are first encoded in the kern music data format. Groups of incipits are assembled into databases. Currently there are three databases: Classical Instrumental Music, European Folksongs, and Latin Motets from the sixteenth century. Matched themes are displayed on-screen in graphical notation.
  17. Music Retrieval Demo The Music Retrieval Demo[20] is notably different from the other MIR systems considered herein. The Music Retrieval Demo performs similarity searches on raw audio data (WAV files). No transcription of any kind is applied. It works by calculating the distance between the selected file and all other files in the database. The other files can then be displayed in a list ranked by their similarity, such that the more similar files are nearer the top. Distances are computed between templates, which are representations of the audio files, not the audio itself. The waveform is Hamming-windowed into overlapping segments; each segment is processed into a spectral representation of Mel- frequency cepstral coefficients. This is a data-reducing transformation that replaces each 20ms window with 12 cepstral coefficients plus an energy term, yielding a 13-valued vector. The next step is to quantize each vector using a specially- designed quantization tree. This recursively divides the vector space into bins, each of which corresponds to a leaf of the tree. Any MFCC vector will fall into one and only one bin. Given a segment of audio, the distribution of the vectors in the various bins characterize that audio. Counting how many vectors fall into each bin yields a histogram template that is used in the distance measure. For this demonstration, the distance between audio files is the simple Euclidean distance between their corresponding templates (or rather 1 minus the distance, so closer files have larger scores). Once scores have been computed for each audio clip, they are sorted by magnitude to produce a ranked list like other search engines.
  18. In Table 1, we present a comparison of the features of the various MIR systems under investigation. Note first that each of these systems was designed for a different purpose, and none of them can be considered a finished product. This table allows one to get an overview of the state of the MIR systems available., the features that one may wish to include in an MIR system, and the areas where improvement is most necessary. It also highlights the need for a standardized testbed. Each of the MIR systems use a different database of files for audio retrieval. Both CatFind and the Music Retrieval Demo have databases with less than 500 files. Thus, any benchmarking estimates, such as retrieval times and efficiency, are rendered useless. MelDex, MelodyHound and ThemeFinder have databases containing over 10,000 files. This should be sufficient for estimating search efficiency and slalability.
  19. In this work, we have laid down a framework for benchmarking of future MIR systems. There are only a handful of MIR systems available online, each of which is quite limited in scope. Still, these benchmarking techniques were applied to five online systems. Proposals were made concerning future benchmarking of full online audio retrieval systems. It is hoped that these recommendations will be considered and expanded upon as such systems become available.