A unique method for Identifying Emotion in Indian Classical Music.
GRAPHICAL VISUALIZATION OF MUSICAL EMOTIONS
Presented by: Pranay Prasoon, MT/SC/10002/2012, M. Tech. Scientific Computing
Under the guidance of: Dr. Saubhik Chakraborty, Associate Professor, Dept. of Applied Mathematics
Hindustani Classical Music (ICM)
 It is rich in both emotional content and musical content.
 There are seven notes in ICM: Sa Re Ga Ma Pa Dha Ni.
 The basis of Indian classical music is the raga.
RAGA
 A raga is simply a group of notes.
 Each raga invokes a certain mood or emotion.
 Different sequences of notes represent different ragas.
Example:
Bageshri (sad): ni sa ga ma dha ni sa
Bhupali (happy): sa re ga pa dha sa
Music's relation to mathematics
 Mathematics is "the basis of sound", and sound is the basis of all musical aspects.
 Some basic terms with which we can relate music to mathematics:
1. Sound
 Music is sound that is organized in a meaningful way with rhythm, melody, and harmony. These are considered the three dimensions of music.
 Sound is a form of energy.
Music and mathematics (contd.)
2. Frequency
 The number of times a sound wave completes a cycle of oscillation in one second is called its frequency. Frequency is measured in cycles per second, or Hertz (Hz).
 The higher the frequency, the higher the pitch, and vice versa.
3. Amplitude
 Amplitude is the size of the vibration, and it determines how loud the sound is. It is measured in decibels; the range for the human ear is roughly 2-130 dB.
Music and mathematics (contd.)
4. Pitch Scale
 In Indian classical music, each note's pitch value depends on the previous note's pitch value.
S = 1, R = 1.125, G = 1.265625, M = 1.423828, P = 1.5, D = 1.6875, N = 1.8984375
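The ratio table above can be turned into absolute frequencies once a tonic is fixed. A minimal sketch follows; the 240 Hz tonic is purely an illustrative assumption, and the G and N values used here are the exact stacked-9/8 ratios (1.265625 and 1.8984375), which the slide's figures appear to intend.

```python
# Pitch-scale ratios of the seven notes relative to the tonic Sa,
# as listed on this slide (G and N taken as the exact 9/8-stacked values).
RATIOS = {"S": 1.0, "R": 1.125, "G": 1.265625, "M": 1.423828,
          "P": 1.5, "D": 1.6875, "N": 1.8984375}

def note_frequencies(tonic_hz):
    """Map each note to its absolute frequency for a chosen tonic."""
    return {note: tonic_hz * ratio for note, ratio in RATIOS.items()}

freqs = note_frequencies(240.0)   # 240 Hz tonic is an illustrative choice
print(freqs["P"])                 # P is the perfect fifth: 1.5 x tonic -> 360.0
```

Note that each listed ratio after S is reachable by multiplying an earlier one by 9/8, which is why R squared equals G.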
Music and mathematics (contd.)
 Time
 Tempo: the speed of the beats.
 Rhythm: the relative time durations of notes in a musical passage.
Literature survey
 We carefully studied papers from international publishers to understand the concepts of Indian classical music, make an in-depth study of its ragas, and cover the neural-network approach in music, pattern recognition, emotion recognition, and features of musical clips.
 Soltani, K. and Ainon, R.N. (2007) and Yongjin Wang and Ling Guan (2008) discuss emotion recognition in speech signals in depth.
 Coutinho, E. and Cangelosi, A. (2010): In this book the authors discuss a model for predicting human emotion while listening to music, adding both psychoacoustic and physiological features to the prediction.
 Keshi Dai, Harriet J. Fell, and Joel MacAuslan (2012): The authors show effective use of audio features to recognize emotion, examine different combinations of emotions, and report the accuracy for each combination.
Literature survey (contd.)
 Zhen-Guo Che, Tzu-An Chiang and Zhen-Hua Che (2010): This paper discusses the advantages and characteristics of the genetic algorithm and back-propagation in training a feed-forward neural network to cope with the weight-adjustment problem.
 We found that very few models have been developed for raga identification in Indian classical music.
 Our survey shows that neural networks give high identification accuracy when used with multiple features.
Characterization of the Problem
 Our aim is to better explain and explore the relationship between musical features and emotion.
 The first problem identified was to extract musical features from each audio file.
 Another problem was to develop a model for recognizing emotion from the musical clips.
 Finally, we wanted to graphically visualize the performance of the recognition process.
Objective of work
1. The main objective of the work is to shed light on emotion recognition in Indian classical music.
2. To understand how the recognition process depends on musical features.
3. To understand the ANN approach to the recognition process.
4. To analyze the error in each recognition run.
5. To visualize the model's performance (for training, validation and testing).
Research methodology
 Ground-truth data and their features drive the recognition process with an ANN.
 Artificial neural network: ANNs are computational models, inspired by the brain's nervous system, that are capable of machine learning and pattern recognition.
 The ANN, fed with the features of the audio clips, performs the emotion classification.
Model Formulation
Data Selection

Serial No.  Target emotion  Ragas in audio clips   Number of audio clips  Total
1           HAPPY           Bhupali                20
                            Bihag                  28
                            Desh                   30                     98
                            Marwa                  20
                            Bageshree              24
2           SAD             Bhairavi               19
                            Bhimpalashi            20                     98
                            Deskar                 15
                            Todi                   20
Total       2 emotions      9 ragas                                       196
Pre-processing
 Manually divide all samples into two parts, named happy and sad.
 Convert the dataset to standard wave format at 44100 Hz.
 Each audio clip we take is 30 seconds long.
Feature selection
 Feature extraction involves analysis of the audio signal.
 We extracted 13 features for our work:
1. Roll-off
2. Spread
3. Zero-crossing rate
4. Centroid
5. RMS energy
6. Low energy
7. Event density
8. Pulse clarity
9. Mode
10. Entropy
11. Brightness
12. Probability of increment of two successive pitches
13. Probability of decrement of two successive pitches
Root Mean Square Energy
 The energy of a signal x can be computed simply by taking the root of the average of the squared amplitudes, called the root-mean-square (RMS):
 x_RMS = sqrt( (1/N) * sum_{n=1..N} x_n^2 )
 Happy-labelled songs have more RMS energy than sad-labelled songs.
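The RMS computation above can be sketched directly in pure Python (the thesis itself uses a MATLAB toolbox, but the arithmetic is the same):

```python
import math

def rms_energy(x):
    """Root-mean-square energy: the root of the average squared amplitude."""
    return math.sqrt(sum(s * s for s in x) / len(x))

# A larger-amplitude (louder) signal has higher RMS energy, matching the
# observation that happy-labelled clips score higher than sad-labelled ones.
loud = [0.5, -0.5, 0.5, -0.5]
quiet = [0.1, -0.1, 0.1, -0.1]
print(rms_energy(loud))   # 0.5
```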
Root Mean Square Energy (contd.)
S.N  Happy    Sad
1    0.20566  0.095603
2    0.16613  0.07754
Low energy
 It is defined as the percentage of analysis windows that have less RMS energy than the average RMS energy across the texture window.
 As an example, vocal music with silences will have a large low-energy value, while continuous strings will have a small low-energy value.
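The definition above can be sketched as follows; the 4-sample window length is an illustrative assumption (real analysis windows are far longer):

```python
import math

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

def low_energy_rate(signal, win=4):
    """Fraction of analysis windows whose RMS is below the mean window RMS."""
    windows = [signal[i:i + win] for i in range(0, len(signal) - win + 1, win)]
    energies = [rms(w) for w in windows]
    avg = sum(energies) / len(energies)
    return sum(1 for e in energies if e < avg) / len(energies)

# A signal with one silent stretch out of four windows: the silent window
# falls below the average, giving a low-energy rate of 1/4.
sig = [0.4] * 4 + [0.0] * 4 + [0.4] * 4 + [0.4] * 4
print(low_energy_rate(sig))  # 0.25
```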
Low energy (contd.)
S.N  Happy     S.N  Sad
1    0.341212  1    0.51122
2    0.59215   2    0.51104
3    0.46183   3    0.48924
4    0.48755   4    0.51524
Entropy
 Entropy relates to the emotion of surprise.
 The lower the probability of an event, the greater the surprise when it occurs.
 The entropy measure is based on Shannon's equation: H = -sum_i p_i log2 p_i.
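A sketch of the Shannon entropy the slide refers to. The tabled values on the next slide appear to be normalized to [0, 1] (as the MIRtoolbox does); this raw version is measured in bits:

```python
import math

def shannon_entropy(probs):
    """H = -sum p_i * log2(p_i); rarer events carry more 'surprise'."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A flat distribution maximizes entropy; a peaked one minimizes it.
print(shannon_entropy([0.25] * 4))               # 2.0 bits
print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))  # much lower
```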
Entropy (contd.)
S.N  Happy    S.N  Sad
1    0.85721  1    0.83682
2    0.85596  2    0.83643
3    0.85319  3    0.83185
4    0.85982  4    0.83892
Zero Crossing Rate
 The zero-crossing rate is the rate of sign changes along a signal:
 zcr = (1/(T-1)) * sum_{t=1..T-1} 1{ s_t * s_{t-1} < 0 }
 where s is a signal of length T and the indicator function 1{} is 1 if its argument is true and 0 otherwise.
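The definition can be sketched directly. This version returns the normalized rate in [0, 1]; the values tabled on the next slide are per-second crossing counts instead:

```python
def zero_crossing_rate(s):
    """Fraction of successive sample pairs whose signs differ."""
    T = len(s)
    crossings = sum(1 for t in range(1, T) if s[t] * s[t - 1] < 0)
    return crossings / (T - 1)

# An alternating signal crosses zero at every step; a constant one never does.
print(zero_crossing_rate([1, -1, 1, -1, 1]))  # 1.0
print(zero_crossing_rate([1, 1, 1, 1]))       # 0.0
```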
Zero Crossing Rate (contd.)
S.N  Happy     Sad
1    789.1929  878.5048
2    736.8308  873.3451
3    945.9796  776.6267
4    1045.694  855.9022
Pitch
 Pitch is a perceptual property that allows the ordering of sounds on a frequency-related scale.
 Pitches are compared as "higher" and "lower".
 Pitches are usually quantified as frequencies in cycles per second.
 From the pitch contour we derive the two landmark features of our work: the probability of increment between two successive pitches and the probability of decrement between two successive pitches.
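The slides do not give an explicit formula for the two landmark features, so the sketch below is one plausible reading (an assumption on our part): the fraction of successive pitch-contour pairs that rise, and the fraction that fall.

```python
def pitch_motion_probs(pitch_contour):
    """Fraction of successive pitch pairs that rise (increment) or fall
    (decrement); the remainder are repeated pitches."""
    pairs = list(zip(pitch_contour, pitch_contour[1:]))
    n = len(pairs)
    p_inc = sum(1 for a, b in pairs if b > a) / n
    p_dec = sum(1 for a, b in pairs if b < a) / n
    return p_inc, p_dec

# Hypothetical pitch contour in Hz: 2 rises and 2 falls out of 5 transitions.
contour = [220.0, 230.0, 225.0, 225.0, 240.0, 232.0]
print(pitch_motion_probs(contour))  # (0.4, 0.4)
```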
Pitch (contd.)
S.N  Happy_Prob_inc  Happy_Prob_dec  Sad_Prob_inc  Sad_Prob_dec
1    0.4403          0.4283          0.4137        0.3673
2    0.4295          0.4815          0.4133        0.391
3    0.406           0.4233          0.3317        0.3627
4    0.4147          0.4487          0.3881        0.3848
Event Density
 It estimates the number of note onsets per second.
S.N  Happy   Sad
1    2.3038  1.5359
2    2.1369  2.0367
3    3.3723  1.5359
4    2.6711  0.70117
Centroid
 The centroid is defined as the centre of gravity of the spectrum.
 It is calculated as the mean of the frequencies present in the signal, with their magnitudes as the weights:
 centroid = sum_n f(n) x(n) / sum_n x(n)
 where x(n) is the magnitude of bin n and f(n) is the centre frequency of that bin.
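The magnitude-weighted mean can be sketched as follows (the bin frequencies and magnitudes here are illustrative values, not data from the thesis):

```python
def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency: the spectrum's centre of gravity."""
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)

freqs = [100.0, 1000.0, 4000.0]
# With equal magnitudes the centroid is the plain mean of the frequencies;
# concentrating energy at high frequencies pulls the centroid upward.
print(spectral_centroid(freqs, [1.0, 1.0, 1.0]))  # 1700.0
print(spectral_centroid(freqs, [0.1, 0.1, 1.0]))
```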
Centroid (contd.)
S.N  Happy      Sad
1    1907.3599  1857.3502
2    2353.4706  1936.1145
3    2307.5883  1748.8305
4    1995.0818  1756.3829
Mode
 It estimates the modality, i.e. major vs. minor.
 Mode returns a value between -1 and +1.
 The closer the value is to +1, the more major the excerpt is predicted to be; the closer it is to -1, the more minor the excerpt might be.
Mode (contd.)
S.N  Happy     Sad
1    -0.18676  0.14946
2    -0.06964  -0.0049425
3    -0.16293  -0.051353
4    -0.14412  0.061022
Pulse clarity
 Pulse clarity is considered a high-level musical dimension that conveys how easily, in a given musical piece, listeners can perceive the underlying rhythmic pulsation.
S.N  Happy    S.N  Sad
1    0.40913  1    0.11393
2    0.30407  2    0.31914
3    0.18223  3    0.18962
4    0.14884  4    0.18083
Roll Off
 Roll-off is the steepness of a transmission function with frequency.
 The roll-off refers to the rate at which the filter attenuates the input frequency beyond the cut-off point.
S.N  Happy      S.N  Sad
1    4091.8974  1    3543.9783
2    4236.4054  2    3168.4931
3    4525.5054  3    4038.4008
4    4271.0604  4    4007.0263
Classification
 Classify the data into two categories.
 As input we take the list of features.
 Classifier used: ANN.
 Three phases: 1. Training 2. Validation 3. Testing
Artificial Neural Network
 It is a computational model inspired by the human nervous system.
 An ANN is generally presented as interconnected neurons that compute values from inputs.
 An ANN can be defined by three characteristics: 1. Architecture 2. Learning mechanism 3. Activation function
 Every such system basically has 3 layers: 1. Input 2. Hidden 3. Output
ANN (contd.)
 Architecture:
 Directed graph: each edge is assigned an orientation.
 Classification using: multilayer feed-forward NN.
 Learning method: supervised learning.
 Algorithm used: back-propagation.
ANN (contd.)
 Steps of the back-propagation algorithm:
1. Normalize all input values to between 0 and 1.
2. Number of hidden nodes = (number of input nodes × number of output nodes) / 2.
3. V = weights between input and hidden nodes; W = weights between hidden and output nodes (weights are initialized to random values between -1 and +1).
4. Input and output of the input layer: {O}_I = {I}_I
ANN (contd.)
 The input to the hidden layer is computed by multiplying the input values by their corresponding weights:
{I}_H = [V] {O}_I
 The output of the hidden layer is computed element-wise with the sigmoid function:
{O}_H = 1 / (1 + e^(-{I}_H))
ANN (contd.)
 The input to the output layer is computed by multiplying by the corresponding weights:
{I}_O = [W] {O}_H
 The output of the output layer is calculated element-wise as:
{O}_O = 1 / (1 + e^(-{I}_O))
ANN (contd.)
 The error is calculated as:
E = (1/2) * sum_k (T_k - O_Ok)^2
 {d}, the local gradient at each output node, is calculated as:
d_k = (T_k - O_Ok) * O_Ok * (1 - O_Ok)
ANN (contd.)
 The [Y] matrix is calculated as the outer product:
[Y] = {O}_H × {d}
 The change in the hidden-output weights is:
[ΔW] = α [Y], where α is the learning rate.
ANN (contd.)
 The error propagated back to the hidden layer is:
{e}_H = [W]^T {d}
 And the new local gradient {d*} for the hidden layer is:
d*_j = e_Hj * O_Hj * (1 - O_Hj)
 Calculate the [X] matrix as the outer product:
[X] = {O}_I × {d*}
ANN (contd.)
 The change in the input-hidden weights is:
[ΔV] = α [X]
 The updated weights for the next training step are:
[V] = [V] + [ΔV], [W] = [W] + [ΔW]
 The process is repeated until the error becomes very small.
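The steps on the preceding slides can be sketched end to end. Everything concrete below is an illustrative assumption: layer sizes, the learning rate, the OR training set, the iteration count, and the omission of bias terms (the slides do not mention biases either).

```python
import math, random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One hidden layer, trained with the slides' steps: sigmoid forward pass,
# output local gradients d = (T - O) * O * (1 - O), error back-propagated
# through W to the hidden layer, weights nudged by learning rate alpha.
n_in, n_hid, n_out, alpha = 2, 3, 1, 0.5
V = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
W = [[random.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]

def forward(x):
    h = [sigmoid(sum(V[j][i] * x[i] for i in range(n_in))) for j in range(n_hid)]
    o = [sigmoid(sum(W[k][j] * h[j] for j in range(n_hid))) for k in range(n_out)]
    return h, o

def train_step(x, t):
    h, o = forward(x)
    d_out = [(t[k] - o[k]) * o[k] * (1 - o[k]) for k in range(n_out)]
    # Hidden-layer error uses the pre-update weights W.
    e_hid = [sum(W[k][j] * d_out[k] for k in range(n_out)) for j in range(n_hid)]
    d_hid = [e_hid[j] * h[j] * (1 - h[j]) for j in range(n_hid)]
    for k in range(n_out):                      # delta-W = alpha * O_H * d
        for j in range(n_hid):
            W[k][j] += alpha * d_out[k] * h[j]
    for j in range(n_hid):                      # delta-V = alpha * O_I * d*
        for i in range(n_in):
            V[j][i] += alpha * d_hid[j] * x[i]

data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]  # toy OR task

def total_error():
    return sum((t[0] - forward(x)[1][0]) ** 2 for x, t in data) / 2

before = total_error()
for _ in range(2000):
    for x, t in data:
        train_step(x, t)
after = total_error()
print(before, after)  # the squared error shrinks with training
```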
RESULT and STUDY
 Training data: 70% of the total (138)
 Validation data: 15% of the total (39)
 Testing data: 15% of the total (39)
Recognition of two emotional states
 1st experiment:
 13 features taken as input.
 Testing data: 39
 Correctly classified for happy = 15 out of 17
 Correctly classified for sad = 21 out of 22

             Output
Input    Happy   Sad
Happy    15      2
Sad      1       21
 Error histogram: (figure)
 Performance graph: (figure)
Performance analysis
 Performance over 10 training-and-testing runs; we see little variation in the accuracy of our model (LF = landmark features).

Test No.  Accuracy with LF (%)  Accuracy without LF (%)
1         96.687                62.646
2         96.947                73.454
3         98.159                65.09
4         97.31                 62.112
5         98.054                63.878
6         96.004                63.003
7         96.808                68.433
8         98.691                65.06
9         82.258                70.112
10        97.243                64.123
Conclusion
 We proposed a new model for automatic recognition of musical emotion based on an artificial neural network.
 We used a multilayer feed-forward neural network for classification, trained with back-propagation.
 A total of 13 features were extracted from each audio sample. We proposed two new features (the probability of increment and the probability of decrement between two successive pitches). The classification was carried out using the Neural Network Toolbox in MATLAB.
 A total of 10 experiments were run, and each time the emotion-recognition accuracy was more than 90%. When we classify without our landmark features, the accuracy drops by about 30 percentage points. The average accuracy we achieved for our model is 95.8161%.

              Without landmark features  With landmark features
Accuracy (%)  65.7911                    95.8161
Scope
 This work is useful for those who have difficulty understanding ragas in Indian classical music.
 The study will be helpful in psychology for studying changes in the brain while listening to Indian classical music.
 Our study is also useful in medical science.
Future work
 We plan to develop an automatic emotion recognizer for Indian classical music with more emotion categories, for people who have difficulty understanding and identifying emotion in Indian classical music.
 We plan to develop a model that adds physiological features such as heart rate, skin temperature and brain signals. We expect that including physiological features will further increase the system's accuracy.
Publication
 P. Prasoon and S. Chakraborty, "Raga Analysis using Artificial Neural Network" - communicated to Computational Music Science (Book Series), Springer, as a research monograph.
References
[1] A. Srinivasan (2011). "Speech Recognition Using Hidden Markov Model". Applied Mathematical Sciences, Vol. 5, No. 79, pp. 3943-3948.
[2] Björn Schuller, Manfred Lang, Gerhard Rigoll (2002). "Multimodal Emotion Recognition in Audiovisual Communication". Proc. ICME 2002, 3rd International Conference on Multimedia and Expo, IEEE, Vol. 1, pp. 745-748, Lausanne, Switzerland.
[3] Coutinho, E. and Cangelosi, A. (2010). "A Neural Network Model for the Prediction of Musical Emotions". In S. Nefti-Meziani and J.G. Grey (Eds.), Advances in Cognitive Systems (pp. 331-368). London: IET Publisher. ISBN: 978-1849190756.
[4] Daniela and Bernd Willimek (2013). Music and Emotions: Research on the Theory of Musical Equilibration (die Strebetendenz-Theorie).
[5] Derya Ozkan, Stefan Scherer and Louis-Philippe Morency (2013). "Step-wise Emotion Recognition Using Concatenated-HMM". IEEE Transactions on Multimedia 15(2): 326-338.
[6] Gaurav Pandey, Chaitanya Mishra and Paul Ipe (2003). "TANSEN: A System for Automatic Raga Identification". Indian International Conference on AI, pp. 1350-1363.
[7] Jack H. David Jr. (1995). "The Mathematics of Music". Spring, Math 1513.5097.
[8] Keshi Dai, Harriet J. Fell, and Joel MacAuslan (2012). "Recognizing Emotion in Speech Using Neural Networks".
[9] Mohammad Abd-Alrahman Mahmoud Abushariah, Raja Noor Ainon, Roziati Zainuddin, Moustafa Elshafei, Othman Omran Khalifa (2012). "Arabic Speaker-Independent Continuous Automatic Speech Recognition Based on a Phonetically Rich and Balanced Speech Corpus". Int. Arab J. Inf. Technol. 9(1): 84-93.
[10] O. Lartillot and P. Toiviainen (2007). "A Matlab Toolbox for Musical Feature Extraction from Audio". Proc. Digital Audio Effects (DAFx-07), Bordeaux, France, Sep. 10-15.
[11] Sandeep Bagchee (1998). Nad: Understanding Raga Music. Eshwar, 1st edition. ISBN-13: 978-8186982075.
[12] www.wikipedia.org
[13] www.paragchordia.com
[14] www.swarganga.org
[15] www.mathworks.in
[16] www.shadjamadhyam.com
[17] www.22shruti.com
[18] www.knowyourraga.com
[19] www.skeptic.skepticgeek.com
[20] Yading Song, Simon Dixon, Marcus Pearce (2012). "Evaluation of Musical Features for Emotion Classification". 13th International Society for Music Information Retrieval Conference (ISMIR).
[21] Yongjin Wang, Ling Guan (2008). "Recognizing Human Emotional State from Audiovisual Signals". IEEE Transactions on Multimedia 10(4): 659-668.
[22] Zhen-Guo Che, Tzu-An Chiang and Zhen-Hua Che (2010). "Feedforward Neural Networks Training: A Comparison Between Genetic Algorithm and Back-Propagation Learning Algorithm". International Journal of Innovative Computing, Information and Control, Vol. 7.
Thank you.....
