Persian Classical Music Instrument Recognition (PCMIR) Using a Novel Persian Music Database - 2019 presentation
9th International Conference on Computer and Knowledge Engineering (ICCKE 2019), October 24-25, 2019, Ferdowsi University of Mashhad
Seyed Muhammad Hossein Mousavi
V. B. Surya Prasath
Seyed Muhammad Hassan Mousavi
Aim of this work: Classification of Persian musical instruments
Features: Mel-Frequency Cepstral Coefficients (MFCCs), Spectral Roll-off, Spectral Centroid, Zero-Crossing Rate, and Entropy Energy
Classes: 7 Persian musical instrument classes: Ney, Tar, Santur, Kamancheh, Tonbak, Ud, and Setar
Classifier: Multi-Layer Neural Network (MLNN)
Feature selection or dimensionality reduction: Fuzzy entropy feature selection method
Dataset: A new Persian music dataset is proposed
Comparison: Solanki, Arun, and Sachin Pandey. "Music instrument recognition using deep convolutional neural networks." International Journal of Information Technology (2019): 1-10
The Structure of the Research
Introduction
The first lines of Samani, a piece by Abol Hassan Saba, which can be played on all melodic Persian classical instruments
Introduction
The commonly used Persian classical instruments
Related Research
| Author(s) | Features | Classes | Classification | Year | Comment |
|---|---|---|---|---|---|
| Essid et al. | MFCC, Spectral Centroid (Sc), Spectral Width (Sw), Spectral Asymmetry (Sa) | A new database with 10 classes (classic) | Gaussian Mixture Model (GMM) classifier | 2004 | A new pairwise feature selection algorithm based on inertia ratio maximization; 79.87% |
| Benetos et al. | Mean and variance of the audio spectrum flatness, mean of the audio spectrum envelope, mean of the audio spectrum spread, and MFCC | 6 classical instruments (classic) | Non-negative matrix factorization (NMF) | 2006 | Dataset = MIS; 95% |
| Joder et al. | Amplitude modulation (AM), zero crossing rates (ZCRs), MFCCs | 8 classical instruments (classic) | GMM, Hidden Markov Model (HMM), and Support Vector Machine (SVM) | 2009 | Spectral shape features in feature extraction; on Essid's database; 84% |
| Barbedo et al. | Combined different audio signal features (spatial domain) | 25 different classical music instruments and human voices | SVM classifier | 2010 | RWC and Iowa databases; 84% |
Related Research
| Author(s) | Features | Classes | Classification | Year | Comment |
|---|---|---|---|---|---|
| Park et al. | Convolutional Neural Networks (CNNs) | 6 classical instruments (classic) | Convolutional Neural Networks (CNNs) | 2015 | IOWA MIS database = 93.14% |
| Bhalke et al. | Fractional Fourier Transform (FrFT)-based Mel Frequency Cepstral Coefficients (MFCC) | 19 different classic instruments | Counter Propagation Neural Network (CPNN) classifier | 2016 | MUMS database = 91.84% |
| Shakibhamedan et al. | Fast-ICA and MFCC | Persian music instrument recognition; 6 Persian music classes | SVM | 2016 | Small new database = 83.71% |
| Han et al. | Convolutional Neural Networks (CNN) | 11 classes of classic instruments | Convolutional Neural Networks (CNN) | 2017 | IRMAS dataset = 89.6% |
| Solanki et al. | Convolutional Neural Networks (CNN) | 11 classes of classic instruments | Convolutional Neural Networks (CNN) | 2019 | IRMAS dataset = 92.8% |
Proposed Method
Proposed method’s procedure
Proposed Method: Features
1. Zero-Crossing Rate (ZCR)
-. The rate of sign changes within an audio frame is called the Zero-Crossing Rate (ZCR).
-. It is the number of times the signal changes sign, from positive to negative and vice versa, divided by the frame length.
2. Entropy Energy (EE)
-. The short-term entropy of energy can be interpreted as a measure of abrupt changes in the energy level of a digital audio signal.
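The two time-domain features above can be sketched per frame as follows. This is a minimal illustration, not the authors' implementation. The sub-frame count of 10, the base-2 logarithm, and the test signals are assumptions made for the example.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Number of sign changes in the frame divided by the frame length."""
    signs = np.sign(frame)
    signs[signs == 0] = 1  # assumption: treat exact zeros as positive
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    return crossings / len(frame)

def entropy_of_energy(frame, n_sub_frames=10, eps=1e-12):
    """Entropy of the normalized sub-frame energies of one frame."""
    usable = len(frame) - len(frame) % n_sub_frames  # drop the remainder
    sub_frames = frame[:usable].reshape(n_sub_frames, -1)
    energies = np.sum(sub_frames ** 2, axis=1)
    probs = energies / (np.sum(energies) + eps)
    return -np.sum(probs * np.log2(probs + eps))

# Hypothetical test signals: a steady tone and a short burst followed by silence.
sr = 8000
t = np.arange(sr // 10) / sr            # one 100 ms frame (800 samples)
tone = np.sin(2 * np.pi * 440 * t)
burst = np.zeros_like(tone)
burst[:80] = 1.0

tone_zcr = zero_crossing_rate(tone)
tone_entropy = entropy_of_energy(tone)
burst_entropy = entropy_of_energy(burst)
print(tone_zcr, tone_entropy, burst_entropy)
```

The steady tone spreads its energy evenly over the sub-frames, so its entropy is high. The burst concentrates its energy in one sub-frame, so its entropy is close to zero, which is how the feature flags abrupt energy changes.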
Proposed Method: Features
3. Spectral Centroid (SC)
-. The spectral centroid is a simple measure of spectral position and shape.
-. It is the center of gravity of the spectrum.
4. Spectral Roll-off (SR)
-. Spectral roll-off is the frequency below which a certain percentage (usually 80%–90%) of the magnitude distribution of the spectrum is concentrated.
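Both spectral features can be computed from the magnitude spectrum of a frame. The sketch below is illustrative only. The 85% roll-off fraction is an assumed value inside the 80%–90% range mentioned above, and the function names and test tones are hypothetical.

```python
import numpy as np

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (center of gravity) in Hz."""
    magnitudes = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return np.sum(freqs * magnitudes) / (np.sum(magnitudes) + 1e-12)

def spectral_rolloff(frame, sample_rate, fraction=0.85):
    """Lowest frequency below which `fraction` of the total magnitude lies."""
    magnitudes = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    cumulative = np.cumsum(magnitudes)
    index = np.searchsorted(cumulative, fraction * cumulative[-1])
    return freqs[min(index, len(freqs) - 1)]

# Hypothetical test tones at 500 Hz and 3000 Hz.
sr = 8000
t = np.arange(1024) / sr
low = np.sin(2 * np.pi * 500 * t)
high = np.sin(2 * np.pi * 3000 * t)

low_centroid = spectral_centroid(low, sr)
high_centroid = spectral_centroid(high, sr)
low_rolloff = spectral_rolloff(low, sr)
high_rolloff = spectral_rolloff(high, sr)
print(low_centroid, high_centroid, low_rolloff, high_rolloff)
```

For a pure tone both values sit at the tone frequency. For an instrument recording they summarize where most of the spectral energy lies.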
Proposed Method: Features
5. Mel-Frequency Cepstral Coefficients (MFCCs)
-. A cepstrum is the result of taking the inverse Fourier transform (IFT) of the logarithm of the estimated spectrum of a signal.
-. The Mel-Frequency Cepstrum (MFC) is a representation of the short-term power spectrum of a sound.
-. Mel-frequency cepstral coefficients (MFCCs) are the coefficients that collectively make up an MFC.
-. Normally the first 13 MFCCs are used because they are considered to carry enough distinguishing information.
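The cepstrum definition above can be written in a few lines. This is a sketch of the real cepstrum (logarithm of the magnitude spectrum), with a hypothetical echo example chosen only to show what the result reveals. The signal length, echo delay, and random seed are assumptions.

```python
import numpy as np

def real_cepstrum(signal, eps=1e-12):
    """Inverse Fourier transform of the log magnitude spectrum."""
    spectrum = np.fft.fft(signal)
    log_magnitude = np.log(np.abs(spectrum) + eps)
    return np.real(np.fft.ifft(log_magnitude))

# Hypothetical example: noise plus a copy of itself delayed by 100 samples.
rng = np.random.default_rng(0)
source = rng.normal(size=1024)
delay = 100
echoed = source + 0.5 * np.roll(source, delay)

cepstrum = real_cepstrum(echoed)
peak = int(np.argmax(cepstrum[1:512])) + 1  # skip index 0 (mean log magnitude)
print(peak)
```

The echo shows up as a peak at the delay position, which illustrates how the cepstrum separates slowly varying spectral shape from periodic structure in the spectrum.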
Proposed Method: Features
5. Mel-Frequency Cepstral Coefficients (MFCCs)
MFCCs are commonly derived as follows:
1. Take the Fourier transform of (a windowed excerpt of) a signal.
2. Map the powers of the spectrum obtained above onto the mel scale, using triangular
overlapping windows.
3. Take the logs of the powers at each of the mel frequencies.
4. Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
5. The MFCCs are the amplitudes of the resulting spectrum.
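The five steps above can be sketched with NumPy alone. This is an illustrative outline, not the authors' implementation. The Hamming window, the 26 mel filters, and the common mel formula 2595·log10(1 + f/700) are assumptions, and the helper names are hypothetical.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular overlapping filters spaced evenly on the mel scale."""
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):
            bank[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            bank[i, k] = (right - k) / max(right - center, 1)
    return bank

def mfcc(frame, sample_rate, n_filters=26, n_coeffs=13):
    # Step 1: Fourier transform of a windowed excerpt.
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    # Step 2: map the powers onto the mel scale with triangular filters.
    mel_energies = mel_filterbank(n_filters, len(frame), sample_rate) @ power
    # Step 3: take the log of the power in each mel band.
    log_mel = np.log(mel_energies + 1e-10)
    # Step 4: discrete cosine transform (DCT-II) of the mel log powers.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    # Step 5: the resulting amplitudes are the MFCCs.
    return basis @ log_mel

sr = 8000
frame = np.sin(2 * np.pi * 440 * np.arange(512) / sr)  # hypothetical test frame
coeffs = mfcc(frame, sr)
print(coeffs.shape)
```

Keeping `n_coeffs=13` follows the common practice of using the first 13 coefficients.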
Proposed Method: Features
5. Mel-Frequency Cepstral Coefficients (MFCCs)
Binary classification task of speech vs music with MFCC
Proposed Method: Features
SR, SC, EE, and ZCR feature waveforms for a 2.5-second melody played on the Ney
Proposed Method: Feature Selection
Feature Selection:
-. Eliminating less desirable attributes from the data can make the model clearer and easier to realize.
-. Luukka's fuzzy feature selection method is used.
-. It uses fuzzy entropy measures to remove the features that it considers to carry the least relevant information.
-. The method is based on a combination of non-probabilistic fuzzy entropy and weighted fuzzy entropy.
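The idea behind fuzzy entropy feature selection can be sketched as follows. This is a simplified illustration in the spirit of the method, not Luukka's exact algorithm or the authors' code. It assumes features scaled to [0, 1], class means as ideal vectors, the similarity 1 - |x - v|, and only the non-probabilistic (De Luca and Termini) entropy, without the weighted variant. All function names and the toy data are hypothetical.

```python
import numpy as np

def fuzzy_entropy_scores(X, y, eps=1e-12):
    """Per-feature fuzzy entropy of similarities to the class ideal vectors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale every feature to [0, 1] so that 1 - |x - v| is a valid membership.
    span = X.max(axis=0) - X.min(axis=0)
    X = (X - X.min(axis=0)) / np.where(span == 0, 1.0, span)
    scores = np.zeros(X.shape[1])
    for label in np.unique(y):
        ideal = X[y == label].mean(axis=0)  # assumption: ideal vector = class mean
        mu = np.clip(1.0 - np.abs(X - ideal), eps, 1.0 - eps)
        # Non-probabilistic fuzzy entropy, summed over all samples.
        scores += -np.sum(mu * np.log(mu) + (1 - mu) * np.log(1 - mu), axis=0)
    return scores

def select_features(X, y, n_remove=1):
    """Remove the n_remove features with the highest fuzzy entropy."""
    scores = fuzzy_entropy_scores(X, y)
    removed = np.argsort(scores)[::-1][:n_remove]
    kept = np.setdiff1d(np.arange(len(scores)), removed)
    return kept, removed

# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.00, 0.2], [0.05, 0.5], [0.10, 0.8],
     [0.90, 0.3], [0.95, 0.5], [1.00, 0.7]]
y = [0, 0, 0, 1, 1, 1]
kept, removed = select_features(X, y)
print(kept, removed)
```

Similarities close to 0 or 1 give low entropy, so a feature that cleanly separates the classes scores low and is kept, while an uninformative feature scores high and is removed.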
Proposed Method: Instruments
Spectrum of each instrument in a specific duration
Validation and Results: Proposed Database
Proposed small database:
-. 7 main Persian musical instrument classes: Ney, Tar, Santur, Kamancheh, Tonbak, Ud, and Setar.
-. Each class consists of 89 to 110 samples.
-. Each sample is 5-10 seconds long.
-. The database was developed with the aid of musicians, whom we asked to play while we recorded.
-. Some of the pieces were played in everyday places such as rooms and music shops, and a few in music studios.
Validation and Results: Proposed Database
Proposed small database:
| Instrument | Samples | Duration | Size | Recording device | Recording place |
|---|---|---|---|---|---|
| Ney | 102 | 5-10 sec | 21 MB | Phone | Closed and open space |
| Tar | 96 | 5-10 sec | 20 MB | Phone | Closed and open space |
| Santur | 107 | 5-10 sec | 25 MB | Phone | Closed and open space |
| Kamancheh | 101 | 5-10 sec | 21 MB | MP3 | Closed and open space |
| Tonbak | 110 | 5-10 sec | 28 MB | MP3 | Closed and open space |
| Setar | 89 | 5-10 sec | 19 MB | MP3 | Closed and open space |
| Ud | 93 | 5-10 sec | 20 MB | Phone | Closed and open space |
Validation and Results: Classification
Artificial Neural Networks (ANN):
-. A neural network with enough neurons can classify almost any data.
-. They are suitable for complex decision boundary problems having many variables.
-. Feed-forward networks usually consist of one or more hidden layers of sigmoid neurons followed by an output layer of linear neurons.
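A forward pass through such a network, with one hidden layer of sigmoid neurons and a linear output layer, can be sketched as below. The weights are random placeholders rather than a trained model. The input size of 17 is a hypothetical feature count, while 100 hidden neurons and 7 output classes follow the figures used elsewhere in this deck.

```python
import numpy as np

def logsig(n):
    """Log-sigmoid activation used by the hidden neurons."""
    return 1.0 / (1.0 + np.exp(-n))

def forward(x, W1, b1, W2, b2):
    """One hidden layer of sigmoid neurons followed by a linear output layer."""
    hidden = logsig(W1 @ x + b1)
    return W2 @ hidden + b2

rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 17, 100, 7  # 17 is a hypothetical input size

# Untrained placeholder weights, only to show the shapes involved.
W1 = rng.normal(scale=0.1, size=(n_hidden, n_features))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_classes, n_hidden))
b2 = np.zeros(n_classes)

x = rng.normal(size=n_features)          # one feature vector
scores = forward(x, W1, b1, W2, b2)      # one score per instrument class
predicted_class = int(np.argmax(scores))
print(scores.shape, predicted_class)
```

In a trained network the class with the largest output score would be taken as the recognized instrument.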
Validation and Results: Classification
Artificial Neural Networks (ANN):
Typical multilayer network architectures
Log-sigmoid transfer function
A single-layer network of S neurons having R inputs
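In symbols, the log-sigmoid transfer function and the output of a single layer of S neurons with R inputs can be written as follows. The symbols W (weight matrix), p (input vector), b (bias vector), and a (layer output) are notation assumed here for illustration.

```latex
\operatorname{logsig}(n) = \frac{1}{1 + e^{-n}}, \qquad
\mathbf{a} = \operatorname{logsig}(\mathbf{W}\mathbf{p} + \mathbf{b}), \qquad
\mathbf{W} \in \mathbb{R}^{S \times R},\quad
\mathbf{p} \in \mathbb{R}^{R},\quad
\mathbf{b}, \mathbf{a} \in \mathbb{R}^{S}
```

The function is applied element-wise, so each of the S neurons produces one output between 0 and 1.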
Validation and Results: Results
| Hidden layer neurons | Training classification accuracy, proposed method | Training classification accuracy, [35] | Testing classification accuracy, proposed method | Testing classification accuracy, [35] |
|---|---|---|---|---|
| 10 | 75.6% | 74.6% | 63.3% | 65.8% |
| 20 | 79.3% | 80.1% | 66.7% | 66.4% |
| 30 | 81.0% | 82.2% | 71.9% | 71.1% |
| 40 | 85.7% | 83.2% | 78.2% | 79.9% |
| 100 | 94.2% | 94.0% | 82.5% | 81.4% |
Train and test precision changes vs. the number of neurons for all instruments (proposed method vs. [35]) on the proposed database
Validation and Results: Results
Confusion Matrix:
-. A special table layout that visualizes the performance of an algorithm or method.
-. It is typically used in supervised learning (in unsupervised learning it is usually called a matching matrix).
-. In the matrix shown here, each row holds the samples of an actual class, so each row sums to the number of samples in that class.
-. Each column holds the samples assigned to a predicted class.
-. The purple column indicates the number of samples per class.
-. The orange columns give the percentage and number of misclassified samples per class.
-. The green columns give the percentage and number of correctly classified samples per class.
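A confusion matrix of this kind can be built from paired actual and predicted labels as sketched below. The sketch places actual classes on rows and predicted classes on columns. Transposing the result gives the other common convention. The labels are toy values, not data from the proposed database.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Count samples per (actual class, predicted class) pair."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[a, p] += 1
    return matrix

# Toy labels for three classes.
actual = [0, 0, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 0, 2]

matrix = confusion_matrix(actual, predicted, n_classes=3)
samples_per_class = matrix.sum(axis=1)               # row totals
correct_rate = np.diag(matrix) / samples_per_class   # per-class rate of correct classification
print(matrix)
print(samples_per_class, correct_rate)
```

The diagonal holds the correctly classified samples, and everything off the diagonal in a row is a misclassification of that actual class.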
Validation and Results: Results
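Proposed method confusion matrix on the proposed database (rows: actual class; columns: predicted class; each cell: percentage and number of samples)

| Actual \ Predicted | Ney | Tar | Santur | Kamancheh | Tonbak | Setar | Ud |
|---|---|---|---|---|---|---|---|
| Ney | 89% (91) | - | - | 10% (9) | 1% (2) | - | - |
| Tar | - | 79% (76) | 8% (7) | - | - | 13% (13) | - |
| Santur | - | 10% (11) | 82% (88) | - | - | 8% (8) | - |
| Kamancheh | 9% (8) | - | 2% (2) | 87% (89) | 2% (2) | - | - |
| Tonbak | - | 2% (2) | 1% (1) | - | 95% (105) | 2% (2) | - |
| Setar | - | 16% (14) | 8% (7) | - | - | 76% (68) | - |
| Ud | - | 6% (6) | 18% (16) | - | - | 6% (6) | 70% (65) |

| Class | True observations | True positive rate | False observations | False negative rate | Observations |
|---|---|---|---|---|---|
| Ney | 91 | 89% | 11 | 11% | 102 |
| Tar | 76 | 79% | 20 | 21% | 96 |
| Santur | 88 | 82% | 19 | 18% | 107 |
| Kamancheh | 89 | 87% | 12 | 13% | 101 |
| Tonbak | 105 | 95% | 5 | 5% | 110 |
| Setar | 68 | 76% | 21 | 24% | 89 |
| Ud | 65 | 70% | 28 | 30% | 93 |

Total observations: 698
The data is split 70% for training and 30% for testing.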
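The 70%/30% split stated with the confusion matrix could be drawn per class, as in the sketch below. Splitting within each class (stratification), the rounding rule, and the random seed are assumptions. The deck does not say how the split was made. The class counts are those of the proposed database.

```python
import numpy as np

CLASS_COUNTS = {"Ney": 102, "Tar": 96, "Santur": 107, "Kamancheh": 101,
                "Tonbak": 110, "Setar": 89, "Ud": 93}

def stratified_split(class_counts, train_fraction=0.7, seed=0):
    """Split the sample indices of every class into train and test parts."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for name, count in class_counts.items():
        order = rng.permutation(count)                   # shuffled sample indices
        n_train = int(round(train_fraction * count))
        train += [(name, int(i)) for i in order[:n_train]]
        test += [(name, int(i)) for i in order[n_train:]]
    return train, test

train, test = stratified_split(CLASS_COUNTS)
print(len(train), len(test))
```

Splitting inside each class keeps the class proportions of the training and test sets close to those of the full database.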
Validation and Results: Results
Proposed method’s recognition accuracy on proposed database
Validation and Results: Results
Classification Result:
-. For some instruments, the method of [35] gives better results in the training stage.
-. In the testing stage, the proposed method is better at 20, 30, and 100 hidden-layer neurons, reaching 82.5% against 81.4% for [35] at 100 neurons.
-. The highest accuracy belongs to Tonbak, which is a percussion instrument.
-. The lowest belongs to Ud.
-. Because Tar, Setar, and Santur sound somewhat similar, a high percentage of samples is misclassified among these classes.
-. The recognition accuracy is 82.57%.
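The per-class figures from the confusion matrix can be combined in more than one way. The sketch below uses the counts and rates listed in this deck. It is offered as an observation, not as the authors' stated procedure: averaging the seven listed per-class rates reproduces 82.57%, while dividing all correctly classified samples by all samples gives a slightly different value.

```python
# Correctly classified samples, class sizes, and listed rates (%) per class,
# as given in the confusion matrix of this deck.
CORRECT = {"Ney": 91, "Tar": 76, "Santur": 88, "Kamancheh": 89,
           "Tonbak": 105, "Setar": 68, "Ud": 65}
TOTAL = {"Ney": 102, "Tar": 96, "Santur": 107, "Kamancheh": 101,
         "Tonbak": 110, "Setar": 89, "Ud": 93}
LISTED_RATE = {"Ney": 89, "Tar": 79, "Santur": 82, "Kamancheh": 87,
               "Tonbak": 95, "Setar": 76, "Ud": 70}

# Mean of the listed per-class rates (every class weighted equally).
mean_listed_rate = sum(LISTED_RATE.values()) / len(LISTED_RATE)

# Share of all samples that were classified correctly.
overall_rate = 100.0 * sum(CORRECT.values()) / sum(TOTAL.values())

best_class = max(LISTED_RATE, key=LISTED_RATE.get)
worst_class = min(LISTED_RATE, key=LISTED_RATE.get)

print(round(mean_listed_rate, 2), round(overall_rate, 2))
print(best_class, worst_class)
```

The two averages differ because the classes have different sizes and because the listed percentages are rounded.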
Conclusion
Conclusion:
-. Combining spatial (time-domain) and frequency-domain features
-. Using proper preprocessing
-. Effective feature selection
-. Strong classifier (NN)
-. Novel Persian musical instrument database
Future work:
-. Adding more Persian classical instruments such as Robab, Ghanoon, Daf, and Gheychak
-. Using deep learning (such as CNNs)
REFERENCES:
[1] Priemer, Roland. Introductory signal processing. Vol. 6. World Scientific Publishing Company, 1990.
[2] Sengupta, N., Sahidullah, M., & Saha, G. (2016). Lung sound classification using cepstral-based statistical features. Computers in Biology and Medicine, 75, 118-129.
[3] Alan V. Oppenheim and Ronald W. Schafer (1989). Discrete-Time Signal Processing. Prentice Hall. p. 1. ISBN 0-13-216771-9.
[4] Duda, R. O., Hart, P. E., & Stork, D. G. (2012). Pattern classification. John Wiley & Sons.
[5] Essid, S., Richard, G., & David, B. (2004, October). Musical instrument recognition based on class pairwise feature selection. In ISMIR.
[6] Benetos, E., Kotti, M., & Kotropoulos, C. (2006, May). Musical instrument classification using non-negative matrix factorization algorithms and subset feature selection. In 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings (Vol. 5, pp. V-V). IEEE.
[7] Joder, C., Essid, S., & Richard, G. (2009). Temporal integration for audio classification with application to musical instrument classification. IEEE Transactions on Audio, Speech, and Language Processing, 17(1), 174-186.
[8] Barbedo, J. G. A., & Tzanetakis, G. (2010). Musical instrument classification using individual partials. IEEE Transactions on Audio, Speech, and Language Processing, 19(1), 111-122.
[9] Park, T., & Lee, T. (2015). Musical instrument sound classification with deep convolutional neural network using feature fusion approach. arXiv preprint arXiv:1512.07370.
[10] Bhalke, D. G., Rao, C. R., & Bormane, D. S. (2016). Automatic musical instrument classification using fractional fourier transform based-MFCC features and counter propagation neural network. Journal of Intelligent Information Systems, 46(3), 425-446.
[11] Shakibhamedan, Salar; Seyed Kooshan Hashemifard; Farhad Faradji & Mansour Vali, 2016, Persian Musical Instrument Recognition System, First International Conference on New Research Achievements in Electrical and Computer Engineering, Tehran, Iran.
[12] Han, Y., Kim, J., & Lee, K. (2017). Deep convolutional neural networks for predominant instrument recognition in polyphonic music. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 25(1), 208-221.
[13] Bosch, J. J., Janer, J., Fuhrmann, F., & Herrera, P. (2012, October). A Comparison of Sound Segregation Techniques for Predominant Instrument Recognition in Musical Audio Signals. In ISMIR (pp. 559-564).
[14] Giannakopoulos, T., & Pikrakis, A. (2014). Introduction to audio analysis: a MATLAB® approach. Academic Press.
[15] Erman, Lee Daniel. "An environment and system for machine understanding of connected speech." (1974).
[16] Scheirer, Eric, and Malcolm Slaney. "Construction and evaluation of a robust multifeature speech/music discriminator." 1997 IEEE international conference on acoustics, speech, and signal processing. Vol. 2. IEEE, 1997.
[17] Panagiotakis, Costas, and Georgios Tziritas. "A speech/music discriminator based on RMS and zero-crossings." IEEE Transactions on multimedia 7.1 (2005): 155-166.
[18] Rabiner, Lawrence R., and Marvin R. Sambur. "An algorithm for determining the endpoints of isolated utterances." Bell System Technical Journal 54.2 (1975): 297-315.
[19] Tzanetakis, George, and Perry Cook. "Musical genre classification of audio signals." IEEE Transactions on speech and audio processing 10.5 (2002): 293-302.
[20] Kim, Hyoung-Gook, Nicolas Moreau, and Thomas Sikora. MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval. John Wiley & Sons, 2005.
[21] S. Theodoridis, K. Koutroumbas, Pattern Recognition, fourth ed., Academic Press, Inc., 2008.
[22] Luukka, P. (2011). Feature selection using fuzzy entropy measures with similarity classifier. Expert Systems with Applications, 38(4), 4600-4607.
[23] http://blogs.mathworks.com/loren/2011/11/29/subset-selection-and-regularization-part-2/
[24] Hastie, Trevor, et al. "The elements of statistical learning: data mining, inference and prediction." The Mathematical Intelligencer 27.2 (2005): 83-85.
[25] Zadeh, Lotfi A. "Fuzzy sets." Information and control 8.3 (1965): 338-353.
[26] De Luca, A., & Termini, S. A definition of non-probabilistic entropy in setting of fuzzy set theory. Information and Control (1971), 20, 301–312.
[27] Parkash, O., Sharma, P. K., & Mahajan, R. (2008). New measures of weighted fuzzy entropy and their applications for the study of maximum weighted fuzzy entropy principle. Information Sciences, 178(11), 2389-2395.
[28] Zadeh, Lotfi A. "Fuzzy sets." Fuzzy Sets, Fuzzy Logic, And Fuzzy Systems: Selected Papers by Lotfi A Zadeh. 1996. 394-432.
[29] Stehman, S. V. (1997). Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment, 62(1), 77-89.
[30] Powers, D. M. (2011). Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation.
[31] https://www.mathworks.com/help/nnet/ug/multilayer-neural-network-architecture.html
[32] Haykin, S. (2008). Neural Networks and Learning Machines: A Comprehensive Foundation.
[33] Graupe, D. (2013). Principles of artificial neural networks (Vol. 7). World Scientific.
[34] Cook, N., & Pople, A. (Eds.). (2004). The Cambridge history of twentieth-century music (Vol. 212). Cambridge: Cambridge University Press.
[35] Solanki, Arun, and Sachin Pandey. "Music instrument recognition using deep convolutional neural networks." International Journal of Information Technology (2019): 1-10.
[36] Yonekura, Asami, et al. "Automatic disease stage classification of glioblastoma multiforme histopathological images using deep convolutional neural network." Biomedical engineering letters 8.3 (2018): 321-327.
[37] Prasath, VB Surya. "Deep learning based computer-aided diagnosis for neuroimaging data: focused review and future potential." Neuroimmunol Neuroinflammation 5 (2018): 1.
[38] Mousavi, Seyed Muhammad Hossein, S. Younes MiriNezhad, and Vyacheslav Lyashenko. "An Evolutionary-Based Adaptive Neuro-Fuzzy Expert System as a Family Counselor before Marriage with the Aim of Divorce Rate Reduction." 2nd International Conference on Research Knowledge Base in Computer Engineering and IT, Education. Vol. 1. 2017.
[39] Mousavi, Seyed Muhammad Hossein. "Fuzzy Calculating of Human Brain’s Weight Using Depth Sensors." 2nd Iranian Symposium on Brain Mapping (ISBM2018), 10 Oct 2018, National Brain Mapping Lab (NBML), Tehran, Iran.
Thank You