SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Downloaden Sie, um offline zu lesen
Tonic Identification System for
Hindustani and Carnatic Music
      Sankalp Gulati, Justin Salamon and Xavier Serra


                  Music Technology Group
                 Universitat Pompeu Fabra


    {sankalp.gulati, justin.salamon, xavier.serra}@upf.edu
7/23/12




           Introduction: Tonic in Indian art music

P
i
t     Tonic
c
h
    time
           ¤  The base pitch chosen by a performer that allows to
               explore the full pitch range in a comfortable way [1]

           ¤  Anchored as ‘Sa’ swar in a performance (mostly)

           ¤  All the other notes used in the raga exposition derive their
               meaning in relation to this pitch value

           ¤  All other accompanying instruments are tuned using this
               pitch as reference
                            2nd CompMusic Workshop, Istanbul, 2012              2
7/23/12




    Role of Drone Instrument

    ¤  Performer and audience needs to hear this pitch
        throughout the concert

    ¤  Reinforces the tonic and establishes all harmonic and
        melodic relationships




                  Surpeti or
                  Shrutibox


Sitar                                                        Tanpura

                                                Electronic
                                                Tanpura


                   2nd CompMusic Workshop, Istanbul, 2012                   3
7/23/12




Introduction: Tonal structure of Tanpura

                               ¤  Four strings

                               ¤  Tunings
                                    ¤  Sa-Sa’-Sa’-Pa
                                    ¤  Sa-Sa’-Sa’-Ma
                                    ¤  Sa-Sa’-Sa’-Ni

                               ¤  Special bridge with thread
                                   inserted (Jvari)
                                    ¤  Violate Helmholtz law [2]

                               ¤  Rich overtones [1]
       Bridge


                2nd CompMusic Workshop, Istanbul, 2012                        4
7/23/12




Introduction: Goals and Motivation

¤  Automatic labeling of the tonic in large databases of
    Indian art music

¤  Devise a system for identification of
   ¤  Tonic pitch for vocal excerpts
   ¤  Tonic pitch class profile for instrumental excerpts

¤  Use all the available data (audio + metadata) to
    achieve maximum accuracy

¤  Confidence measure for each output from the system




                   2nd CompMusic Workshop, Istanbul, 2012              5
7/23/12




Introduction: Goals and Motivation

¤  Fundamental information

¤  Tonic identification: crucial input for:
   ¤  Intonation analysis
   ¤  Raga recognition
   ¤  Melodic motivic analysis




                   2nd CompMusic Workshop, Istanbul, 2012             6
7/23/12




Relevant work: Tonic Identification

¤  Very little work done in the past

¤  Based on melody [ 4,5]

¤  Ranjani et al. take advantage of melodic characteristics
    of Carnatic music [4]




                 2nd CompMusic Workshop, Istanbul, 2012             7
7/23/12




Relevant work: Summary

¤  Utilized only the melodic aspects

¤  Used monophonic pitch trackers for heterophonic data

¤  Limited diversity in database
  ¤  Special raga categories, aalap sections, solo vocal
      recordings


¤  Unexplored aspects:
  ¤  Utilizing background audio content comprising drone sound
  ¤  Taking advantage of different types of available data, like
      audio and metadata
  ¤  Evaluation on diverse database


                 2nd CompMusic Workshop, Istanbul, 2012                8
7/23/12




 Methodology: System Overview



                                                        Manual
                                                        annotation
                                        Yes
Audio   Metadata



                                                       No


              No




                            Yes
                                                            Tonic

                   2nd CompMusic Workshop, Istanbul, 2012                      9
7/23/12




Methodology: System Overview

¤  Culture specific characteristics for tonic identification
   ¤  Presence of drone*
   ¤  Culture specific melodic characteristics
   ¤  Raga knowledge
   ¤  Melodic Motifs

¤  Use variable amount of data that is sufficient enough to
    identify tonic with maximum confidence.
   ¤  Audio data
   ¤  Metadata (Male/Female, Hindustani/Carnatic, Raga etc.)




                    2nd CompMusic Workshop, Istanbul, 2012                10
7/23/12




Methodology: Tonic Identification

¤  Audio example:

¤  Utilizing drone sound

¤  Chroma or multi-pitch analysis




                                                      Multi-pitch Analysis [7]




                 2nd CompMusic Workshop, Istanbul, 2012                                11
7/23/12




Tonic Identification: Signal Processing


               Audio



                                                                  Sinusoid Extraction

             Sinusoids

                                                                    Pitch Salience
                                                                    computation

Time frequency salience

                                                                  Tonic candidate
                                                                    generation

      Tonic candidates


                         2nd CompMusic Workshop, Istanbul, 2012                           12
7/23/12




  Tonic Identification: Signal Processing




¤  STFT
   ¤  Hop size: 11 ms
   ¤  Window length: 46 ms
   ¤  Window type: hamming
   ¤  FFT = 8192 points


                     2nd CompMusic Workshop, Istanbul, 2012             13
7/23/12




  Tonic Identification: Signal Processing




¤  Spectral peak picking
  ¤  Absolute threshold: -60 dB
  ¤  Relative threshold: -40 dB




                    2nd CompMusic Workshop, Istanbul, 2012             14
7/23/12




 Tonic Identification: Signal Processing




¤  Frequency/Amplitude
    correction
  ¤  Parabolic interpolation




                    2nd CompMusic Workshop, Istanbul, 2012             15
7/23/12




  Tonic Identification: Signal Processing




¤  Harmonic summation [7]
   ¤    Spectrum considered: 55-7200 Hz
   ¤    Frequency range: 55-1760 Hz
   ¤    Base frequency: 55 Hz
   ¤    Bin resolution: 10 cents per bin (120 per
         octave)
   ¤    N octaves: 5
   ¤    Maximum harmonics: 20
   ¤    Alpha: 1
   ¤    Beta: 0.8
   ¤    Square cosine window across 50 cents


                              2nd CompMusic Workshop, Istanbul, 2012             16
7/23/12




  Tonic Identification: Signal Processing




¤  Tonic candidate generation
  ¤  Number of salience peaks
      per frame: 5
  ¤  Frequency range: 110-550
      Hz
  ¤  After candidate selection
      salience is no longer
      considered!!!!

                   2nd CompMusic Workshop, Istanbul, 2012             17
7/23/12




Tonic Identification : Two sub-tasks

¤  Caters to both vocal and instrumental excerpts
  ¤  Identify tonic pitch class (PC) using multi-pitch histogram
  ¤  Estimate the correct octave using predominant melody

¤  Use predominant melody extraction approach proposed
    by Justin Salamon et al. [6]

¤  Tonic PCP
  ¤  Peak Picking + Machine learning

¤  Tonic octave estimation
  ¤  Rule based method + Classification based approach




                 2nd CompMusic Workshop, Istanbul, 2012                       18
7/23/12




   Tonic Identification : PC identification

     ¤  Classification based template learning

     ¤  Two kind of class mappings
                            ¤  Rank of the highest tonic PC
                            ¤  Highest peak as Tonic or Non tonic

     ¤  Feature extracted # 20 (f1-f10, a1-a10)
                                                                      Multipitch Histogram
                                                                          f2
                       1

                      0.9
Normalized salience




                      0.8

                      0.7              f3
                      0.6
                                                           f4
                      0.5                                                             f5
                      0.4

                      0.3

                      0.2

                      0.1

                                 100        150                 200             250          300   350   400
                                                    Frequency bins (1 bin = 10 cents), Ref: 55Hz
                                                  2nd CompMusic Workshop, Istanbul, 2012                                 19
7/23/12




      Tonic Identification : PC identification

      ¤  Decision Tree:
                                                                                  Sa
                                                                                                  Sa


                   <=5              >5




                                                                      salience
                                                                                             Pa




       <=-7                >-7                                                   Frequency



                                                                                             Pa
<=5           >5         <=-6            >-6                                      Sa
                                                                                                  Sa


                                                                      salience
                                 <=-11         >-11
                                                                                 Frequency


                                  2nd CompMusic Workshop, Istanbul, 2012                                         20
7/23/12




              Tonic Identification : Octave Identification

                ¤  Tonic octave
                            ¤  Rule based method
                            ¤  Classification based approach

                ¤  25 Features: a1-a25
                                                    Perdominent Melody Histogram
                       1
Normalized Salience




                      0.8

                      0.6

                      0.4

                      0.2

                       0
                              50       100            150             200             250     300   350
                                              Frequency bins (1 bin = 10 cents), Ref: 55 Hz


                                             2nd CompMusic Workshop, Istanbul, 2012                             21
7/23/12




Evaluation: Database

¤  Subset of CompMusic database (>300 Cds) [3]




 Approach 2: #540, 3min (PCP) + 238, full recordings (Octave)


                2nd CompMusic Workshop, Istanbul, 2012                22
7/23/12




Evaluation: Database

¤  Tonic distribution
                                        60
                                                                                Female singers
                                        50                                      Male singers


                  Number of instances
                                        40

                                        30

                                        20

                                        10

                                         0
                                             120   140   160   180 200 220      240   260    280
                                                               Frequency (Hz)


¤  Statistics (for 364 vocal excerpts)
   ¤  Male (80 %), Female (20%), Hindustani (38%), Carnatic (62%),
       Unique artist (#36)

¤  Statistics (for 540 vocal and instrumental excerpts)
   ¤  Hindustani (36%), Carnatic (64%), Unique artist (#55)

                                        2nd CompMusic Workshop, Istanbul, 2012                               23
7/23/12




Evaluation: Annotations

¤  Annotations done by the author

¤  Extracted 5 tonic candidates from multi-pitch histograms
    between 110-370 Hz

¤  Matlab GUI to speed up the annotation procedure




                 2nd CompMusic Workshop, Istanbul, 2012                  24
7/23/12




Evaluation: Accuracy measures

¤  Output correct within 50 cents of the ground truth

¤  10 fold cross validation + rule based classification

¤  Weka: data mining tool

¤  Feature selection: CfsSubsetEval (features > 80% folds)

¤  Classifier: J48 decision tree

¤  Performs better than
   ¤  SVM-polynomial kernel (6% difference in accuracy)
   ¤  K* classifier (5% difference in accuracy)



                  2nd CompMusic Workshop, Istanbul, 2012                25
7/23/12




            Results

                                    Class
Approach(%) Map         #folds              # Features Tonic pitch Tonic PCP           5th      4th      Other
                                     EQ
AP1_EXP1            -       -         -            -               -          85       10.7     0.93        3.3
AP1_EXP2          M1        1        no          1, S2             -          93.7     1.48     8.9         0.9
AP1_EXP3          M1       10        no          4, S3             -          92.9     1.9      3.5         1.7
AP1_EXP4          M1       10       yes          4, S4             -          74.2      11      7.6         6.7
AP1_EXP5          M2        1        no          1, S2             -          91       3.3       3          2.7
AP1_EXP6          M2       10        no          2, S5             -          91.8     2.2       3           3
AP1_EXP7          M2       10       yes          2, S5             -          87.8     4.2       4          3.9




¤  M1 : tonic PCP rank, M2 : highest peak tonic or non-tonic

¤  S1: [f2, f3, f5],   S2: [f2],    S3: [f2, f4, f6, a5],       S4: [f2, f3, a3, a5], S5: [f2, f3]


                                     2nd CompMusic Workshop, Istanbul, 2012                                       26
7/23/12




Results




¤  Approach 2, Octave identification
  ¤  Rule based approach – 99 %
  ¤  Classification based approach – 100%




                2nd CompMusic Workshop, Istanbul, 2012             27
7/23/12




             Discussion: PCP Identification

                 ¤  AP-1: Performance for male singers (95%), female singers
                     (88%)

                 ¤  Error cases
                    ¤  Mostly Ma tuning songs
                    ¤  More female singers

                 ¤  Sensitive to selected frequency range for tonic candidates,
                     a range of 110-370 Hz works optimal
                                                                  Pa
                                                                                                   Sa
                                                                          Sa                                  Sa
                            Sa
            Sa                                        Sa
                                                                                                        Ma



                                                                                       salience
                                   salience
salience




                       Pa




           Frequency                                Frequency                                     Frequency

                                              2nd CompMusic Workshop, Istanbul, 2012                                         28
7/23/12




Discussion : Octave Identification

¤  Challenges faced by rule based approach
          ¤  Hindustani musicians go roughly -500 cents below tonic
          ¤  Carnatic musicians generally don’t go that below tonic
          ¤  Melody estimation errors at low frequency
          ¤  Concept of Madhyam shruti

                                                  Perdominent Melody Histogram
                         1
  Normalized Salience




                        0.8

                        0.6

                        0.4

                        0.2

                         0
                              50   100              150             200             250     300   350
                                            Frequency bins (1 bin = 10 cents), Ref: 55 Hz


                                         2nd CompMusic Workshop, Istanbul, 2012                                   29
7/23/12




Conclusions and Future Work

¤  Drone sound in the background provides an important
    cue for the identification of tonic and can be utilized to
    automatically perform this task

¤  System should be fed with more information to
    differentiate between ‘Pa’ and ‘Ma’ tuning

¤  Future Work:
  ¤  Exploring melodic characteristics for tonic identification
  ¤  Deeper analysis of confidence measure concept
  ¤  Study influence of cultural background on human
      performance for this task




                   2nd CompMusic Workshop, Istanbul, 2012                    30
7/23/12




REFERENCES

1.  B. C. Deva. The Music of India: A Scientific Study. Munshiram Manoharlal
    Publishers, Delhi, 1980.
2.  C. V. Raman. On some Indian stringed instruments. In Indian Association for the
    Cultivation of Science, volume 33, pages 29-33, 1921.
3.  X. Serra. A multicultural approach in music information research. In 12th Int. Soc. for
    Music Info. Retrieval Conf., Miami, USA, Oct. 2011.
4.  R. Sengupta, N. Dey, D. Nag, A. K. Datta, and A. Mukerjee, “Automatic Tonic ( SA )
    Detection Algorithm in Indian Classical Vocal Music,” in National Symposium on
    Acoustics, 2005, pp. 1-5.
5.  T. V. Ranjani, H.G.; Arthi, S.; Sreenivas, “Carnatic music analysis: Shadja, swara
    identification and rAga verification in AlApana using stochastic models,” Applications
    of Signal Processing to Audio and Acoustics (WASPAA), IEEE Workshop, pp. 29-32,
    2011.
6.  J. Salamon and E. G´omez. Melody extraction from polyphonic music signals using
    pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language
    Processing, 20(6):1759–1770, Aug. 2012.
7.  J. Salamon, E. G´omez, and J. Bonada. Sinusoid extraction and salience function
    design for predominant melody estimation. In Proc. 14th Int. Conf. on Digital Audio
    Effects (DAFX-11), pages 73–80, Paris, France, Sep. 2011.


                        2nd CompMusic Workshop, Istanbul, 2012                                 31
7/23/12




        Thank you


       Questions?



2nd CompMusic Workshop, Istanbul, 2012             32

Más contenido relacionado

Andere mochten auch

K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)
K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)
K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)LiGhT ArOhL
 
The concepts of music
The concepts of musicThe concepts of music
The concepts of musiccorinbone
 
Module 6.8 mapeh
Module 6.8 mapehModule 6.8 mapeh
Module 6.8 mapehNoel Tan
 
Tonic, subdominant, dominant and intervals
Tonic, subdominant, dominant and intervalsTonic, subdominant, dominant and intervals
Tonic, subdominant, dominant and intervalsAngelica Nuby
 

Andere mochten auch (8)

Music 10 chords ppt
Music 10   chords pptMusic 10   chords ppt
Music 10 chords ppt
 
K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)
K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)
K TO 12 GRADE 5 LEARNER’S MATERIAL IN MUSIC (Q1-Q4)
 
Form
FormForm
Form
 
The concepts of music
The concepts of musicThe concepts of music
The concepts of music
 
Module 6.8 mapeh
Module 6.8 mapehModule 6.8 mapeh
Module 6.8 mapeh
 
The Elements of Music
The Elements of MusicThe Elements of Music
The Elements of Music
 
Tonic, subdominant, dominant and intervals
Tonic, subdominant, dominant and intervalsTonic, subdominant, dominant and intervals
Tonic, subdominant, dominant and intervals
 
Music 10 learning material
Music 10 learning material Music 10 learning material
Music 10 learning material
 

Mehr von Sankalp Gulati

Mining Melodic Patterns in Large Audio Collections of Indian Art Music
	Mining Melodic Patterns in Large Audio Collections of Indian Art Music	Mining Melodic Patterns in Large Audio Collections of Indian Art Music
Mining Melodic Patterns in Large Audio Collections of Indian Art MusicSankalp Gulati
 
Landmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music MelodiesLandmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music MelodiesSankalp Gulati
 
Phrase-based Rāga Recognition Using Vector Space Modeling
Phrase-based Rāga Recognition Using Vector Space ModelingPhrase-based Rāga Recognition Using Vector Space Modeling
Phrase-based Rāga Recognition Using Vector Space ModelingSankalp Gulati
 
[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music
[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music
[Tutorial] Computational Approaches to Melodic Analysis of Indian Art MusicSankalp Gulati
 
Computational Approaches to Melodic Analysis of Indian Art Music
Computational Approaches to Melodic Analysis of Indian Art MusicComputational Approaches to Melodic Analysis of Indian Art Music
Computational Approaches to Melodic Analysis of Indian Art MusicSankalp Gulati
 
Computational Melodic Analysis of Indian Art Music
Computational Melodic Analysis of Indian Art MusicComputational Melodic Analysis of Indian Art Music
Computational Melodic Analysis of Indian Art MusicSankalp Gulati
 
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music CorporaComputational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music CorporaSankalp Gulati
 
Discovery and Characterization of Melodic Motives in Large Audio Music Collec...
Discovery and Characterization of Melodic Motives in Large Audio Music Collec...Discovery and Characterization of Melodic Motives in Large Audio Music Collec...
Discovery and Characterization of Melodic Motives in Large Audio Music Collec...Sankalp Gulati
 
Tonic Identification System for Hindustani and Carnatic Music
Tonic Identification System for Hindustani and Carnatic MusicTonic Identification System for Hindustani and Carnatic Music
Tonic Identification System for Hindustani and Carnatic MusicSankalp Gulati
 

Mehr von Sankalp Gulati (10)

Mining Melodic Patterns in Large Audio Collections of Indian Art Music
	Mining Melodic Patterns in Large Audio Collections of Indian Art Music	Mining Melodic Patterns in Large Audio Collections of Indian Art Music
Mining Melodic Patterns in Large Audio Collections of Indian Art Music
 
Landmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music MelodiesLandmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music Melodies
 
Phrase-based Rāga Recognition Using Vector Space Modeling
Phrase-based Rāga Recognition Using Vector Space ModelingPhrase-based Rāga Recognition Using Vector Space Modeling
Phrase-based Rāga Recognition Using Vector Space Modeling
 
[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music
[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music
[Tutorial] Computational Approaches to Melodic Analysis of Indian Art Music
 
Computational Approaches to Melodic Analysis of Indian Art Music
Computational Approaches to Melodic Analysis of Indian Art MusicComputational Approaches to Melodic Analysis of Indian Art Music
Computational Approaches to Melodic Analysis of Indian Art Music
 
Computational Melodic Analysis of Indian Art Music
Computational Melodic Analysis of Indian Art MusicComputational Melodic Analysis of Indian Art Music
Computational Melodic Analysis of Indian Art Music
 
Computational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music CorporaComputational Approaches for Melodic Description in Indian Art Music Corpora
Computational Approaches for Melodic Description in Indian Art Music Corpora
 
Hindify
HindifyHindify
Hindify
 
Discovery and Characterization of Melodic Motives in Large Audio Music Collec...
Discovery and Characterization of Melodic Motives in Large Audio Music Collec...Discovery and Characterization of Melodic Motives in Large Audio Music Collec...
Discovery and Characterization of Melodic Motives in Large Audio Music Collec...
 
Tonic Identification System for Hindustani and Carnatic Music
Tonic Identification System for Hindustani and Carnatic MusicTonic Identification System for Hindustani and Carnatic Music
Tonic Identification System for Hindustani and Carnatic Music
 

Último

Automation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsAutomation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsDianaGray10
 
Patch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updatePatch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updateadam112203
 
Technical SEO for Improved Accessibility WTS FEST
Technical SEO for Improved Accessibility  WTS FESTTechnical SEO for Improved Accessibility  WTS FEST
Technical SEO for Improved Accessibility WTS FESTBillieHyde
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and businessFrancesco Corti
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxSatishbabu Gunukula
 
How to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxHow to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxKaustubhBhavsar6
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
Scenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosScenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosErol GIRAUDY
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingFrancesco Corti
 
Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.IPLOOK Networks
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInThousandEyes
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Libraryshyamraj55
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024Brian Pichman
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationKnoldus Inc.
 
.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptxHansamali Gamage
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3DianaGray10
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 

Último (20)

Automation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsAutomation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projects
 
Patch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updatePatch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 update
 
Technical SEO for Improved Accessibility WTS FEST
Technical SEO for Improved Accessibility  WTS FESTTechnical SEO for Improved Accessibility  WTS FEST
Technical SEO for Improved Accessibility WTS FEST
 
SheDev 2024
SheDev 2024SheDev 2024
SheDev 2024
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and business
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptx
 
How to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxHow to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptx
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch Tuesday
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
Scenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosScenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenarios
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is going
 
Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Library
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its application
 
.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 

Tonic Identification System for Indian Art Music

  • 1. Tonic Identification System for Hindustani and Carnatic Music Sankalp Gulati, Justin Salamon and Xavier Serra Music Technology Group Universitat Pompeu Fabra {sankalp.gulati, justin.salamon, xavier.serra}@upf.edu
  • 2. 7/23/12 Introduction: Tonic in Indian art music P i t Tonic c h time ¤  The base pitch chosen by a performer that allows to explore the full pitch range in a comfortable way [1] ¤  Anchored as ‘Sa’ swar in a performance (mostly) ¤  All the other notes used in the raga exposition derive their meaning in relation to this pitch value ¤  All other accompanying instruments are tuned using this pitch as reference 2nd CompMusic Workshop, Istanbul, 2012 2
  • 3. 7/23/12 Role of Drone Instrument ¤  Performer and audience needs to hear this pitch throughout the concert ¤  Reinforces the tonic and establishes all harmonic and melodic relationships Surpeti or Shrutibox Sitar Tanpura Electronic Tanpura 2nd CompMusic Workshop, Istanbul, 2012 3
  • 4. 7/23/12 Introduction: Tonal structure of Tanpura ¤  Four strings ¤  Tunings ¤  Sa-Sa’-Sa’-Pa ¤  Sa-Sa’-Sa’-Ma ¤  Sa-Sa’-Sa’-Ni ¤  Special bridge with thread inserted (Jvari) ¤  Violate Helmholtz law [2] ¤  Rich overtones [1] Bridge 2nd CompMusic Workshop, Istanbul, 2012 4
  • 5. 7/23/12 Introduction: Goals and Motivation ¤  Automatic labeling of the tonic in large databases of Indian art music ¤  Devise a system for identification of ¤  Tonic pitch for vocal excerpts ¤  Tonic pitch class profile for instrumental excerpts ¤  Use all the available data (audio + metadata) to achieve maximum accuracy ¤  Confidence measure for each output from the system 2nd CompMusic Workshop, Istanbul, 2012 5
  • 6. 7/23/12 Introduction: Goals and Motivation ¤  Fundamental information ¤  Tonic identification: crucial input for: ¤  Intonation analysis ¤  Raga recognition ¤  Melodic motivic analysis 2nd CompMusic Workshop, Istanbul, 2012 6
  • 7. 7/23/12 Relevant work: Tonic Identification ¤  Very little work done in the past ¤  Based on melody [ 4,5] ¤  Ranjani et al. take advantage of melodic characteristics of Carnatic music [4] 2nd CompMusic Workshop, Istanbul, 2012 7
  • 8. 7/23/12 Relevant work: Summary ¤  Utilized only the melodic aspects ¤  Used monophonic pitch trackers for heterophonic data ¤  Limited diversity in database ¤  Special raga categories, aalap sections, solo vocal recordings ¤  Unexplored aspects: ¤  Utilizing background audio content comprising drone sound ¤  Taking advantage of different types of available data, like audio and metadata ¤  Evaluation on diverse database 2nd CompMusic Workshop, Istanbul, 2012 8
  • 9. 7/23/12 Methodology: System Overview Manual annotation Yes Audio Metadata No No Yes Tonic 2nd CompMusic Workshop, Istanbul, 2012 9
  • 10. 7/23/12 Methodology: System Overview ¤  Culture specific characteristics for tonic identification ¤  Presence of drone* ¤  Culture specific melodic characteristics ¤  Raga knowledge ¤  Melodic Motifs ¤  Use variable amount of data that is sufficient enough to identify tonic with maximum confidence. ¤  Audio data ¤  Metadata (Male/Female, Hindustani/Carnatic, Raga etc.) 2nd CompMusic Workshop, Istanbul, 2012 10
  • 11. 7/23/12 Methodology: Tonic Identification ¤  Audio example: ¤  Utilizing drone sound ¤  Chroma or multi-pitch analysis Multi-pitch Analysis [7] 2nd CompMusic Workshop, Istanbul, 2012 11
  • 12. 7/23/12 Tonic Identification: Signal Processing Audio Sinusoid Extraction Sinusoids Pitch Salience computation Time frequency salience Tonic candidate generation Tonic candidates 2nd CompMusic Workshop, Istanbul, 2012 12
  • 13. 7/23/12 Tonic Identification: Signal Processing ¤  STFT ¤  Hop size: 11 ms ¤  Window length: 46 ms ¤  Window type: hamming ¤  FFT = 8192 points 2nd CompMusic Workshop, Istanbul, 2012 13
  • 14. 7/23/12 Tonic Identification: Signal Processing ¤  Spectral peak picking ¤  Absolute threshold: -60 dB ¤  Relative threshold: -40 dB 2nd CompMusic Workshop, Istanbul, 2012 14
  • 15. 7/23/12 Tonic Identification: Signal Processing ¤  Frequency/Amplitude correction ¤  Parabolic interpolation 2nd CompMusic Workshop, Istanbul, 2012 15
  • 16. 7/23/12 Tonic Identification: Signal Processing ¤  Harmonic summation [7] ¤  Spectrum considered: 55-7200 Hz ¤  Frequency range: 55-1760 Hz ¤  Base frequency: 55 Hz ¤  Bin resolution: 10 cents per bin (120 per octave) ¤  N octaves: 5 ¤  Maximum harmonics: 20 ¤  Alpha: 1 ¤  Beta: 0.8 ¤  Square cosine window across 50 cents 2nd CompMusic Workshop, Istanbul, 2012 16
  • 17. 7/23/12 Tonic Identification: Signal Processing ¤  Tonic candidate generation ¤  Number of salience peaks per frame: 5 ¤  Frequency range: 110-550 Hz ¤  After candidate selection salience is no longer considered!!!! 2nd CompMusic Workshop, Istanbul, 2012 17
  • 18. 7/23/12 Tonic Identification : Two sub-tasks ¤  Caters to both vocal and instrumental excerpts ¤  Identify tonic pitch class (PC) using multi-pitch histogram ¤  Estimate the correct octave using predominant melody ¤  Use predominant melody extraction approach proposed by Justin Salamon et al. [6] ¤  Tonic PCP ¤  Peak Picking + Machine learning ¤  Tonic octave estimation ¤  Rule based method + Classification based approach 2nd CompMusic Workshop, Istanbul, 2012 18
  • 19. 7/23/12 Tonic Identification : PC identification ¤  Classification based template learning ¤  Two kind of class mappings ¤  Rank of the highest tonic PC ¤  Highest peak as Tonic or Non tonic ¤  Feature extracted # 20 (f1-f10, a1-a10) Multipitch Histogram f2 1 0.9 Normalized salience 0.8 0.7 f3 0.6 f4 0.5 f5 0.4 0.3 0.2 0.1 100 150 200 250 300 350 400 Frequency bins (1 bin = 10 cents), Ref: 55Hz 2nd CompMusic Workshop, Istanbul, 2012 19
  • 20. 7/23/12 Tonic Identification : PC identification ¤  Decision Tree: Sa Sa <=5 >5 salience Pa <=-7 >-7 Frequency Pa <=5 >5 <=-6 >-6 Sa Sa salience <=-11 >-11 Frequency 2nd CompMusic Workshop, Istanbul, 2012 20
  • 21. 7/23/12 Tonic Identification : Octave Identification ¤  Tonic octave ¤  Rule based method ¤  Classification based approach ¤  25 Features: a1-a25 Perdominent Melody Histogram 1 Normalized Salience 0.8 0.6 0.4 0.2 0 50 100 150 200 250 300 350 Frequency bins (1 bin = 10 cents), Ref: 55 Hz 2nd CompMusic Workshop, Istanbul, 2012 21
  • 22. 7/23/12 Evaluation: Database ¤  Subset of CompMusic database (>300 Cds) [3] Approach 2: #540, 3min (PCP) + 238, full recordings (Octave) 2nd CompMusic Workshop, Istanbul, 2012 22
  • 23. 7/23/12 Evaluation: Database ¤  Tonic distribution 60 Female singers 50 Male singers Number of instances 40 30 20 10 0 120 140 160 180 200 220 240 260 280 Frequency (Hz) ¤  Statistics (for 364 vocal excerpts) ¤  Male (80 %), Female (20%), Hindustani (38%), Carnatic (62%), Unique artist (#36) ¤  Statistics (for 540 vocal and instrumental excerpts) ¤  Hindustani (36%), Carnatic (64%), Unique artist (#55) 2nd CompMusic Workshop, Istanbul, 2012 23
  • 24. 7/23/12 Evaluation: Annotations ¤  Annotations done by the author ¤  Extracted 5 tonic candidates from multi-pitch histograms between 110-370 Hz ¤  Matlab GUI to speed up the annotation procedure 2nd CompMusic Workshop, Istanbul, 2012 24
  • 25. 7/23/12 Evaluation: Accuracy measures ¤  Output correct within 50 cents of the ground truth ¤  10 fold cross validation + rule based classification ¤  Weka: data mining tool ¤  Feature selection: CfsSubsetEval (features > 80% folds) ¤  Classifier: J48 decision tree ¤  Performs better than ¤  SVM-polynomial kernel (6% difference in accuracy) ¤  K* classifier (5% difference in accuracy) 2nd CompMusic Workshop, Istanbul, 2012 25
  • 26. 7/23/12 Results Class Approach(%) Map #folds # Features Tonic pitch Tonic PCP 5th 4th Other EQ AP1_EXP1 - - - - - 85 10.7 0.93 3.3 AP1_EXP2 M1 1 no 1, S2 - 93.7 1.48 8.9 0.9 AP1_EXP3 M1 10 no 4, S3 - 92.9 1.9 3.5 1.7 AP1_EXP4 M1 10 yes 4, S4 - 74.2 11 7.6 6.7 AP1_EXP5 M2 1 no 1, S2 - 91 3.3 3 2.7 AP1_EXP6 M2 10 no 2, S5 - 91.8 2.2 3 3 AP1_EXP7 M2 10 yes 2, S5 - 87.8 4.2 4 3.9 ¤  M1 : tonic PCP rank, M2 : highest peak tonic or non-tonic ¤  S1: [f2, f3, f5], S2: [f2], S3: [f2, f4, f6, a5], S4: [f2, f3, a3, a5], S5: [f2, f3] 2nd CompMusic Workshop, Istanbul, 2012 26
  • 27. 7/23/12 Results ¤  Approach 2, Octave identification ¤  Rule based approach – 99 % ¤  Classification based approach – 100% 2nd CompMusic Workshop, Istanbul, 2012 27
  • 28. 7/23/12 Discussion: PCP Identification ¤  AP-1: Performance for male singers (95%), female singers (88%) ¤  Error cases ¤  Mostly Ma tuning songs ¤  More female singers ¤  Sensitive to selected frequency range for tonic candidates, a range of 110-370 Hz works optimal Pa Sa Sa Sa Sa Sa Sa Ma salience salience salience Pa Frequency Frequency Frequency 2nd CompMusic Workshop, Istanbul, 2012 28
  • 29. 7/23/12 Discussion : Octave Identification ¤  Challenges faced by rule based approach ¤  Hindustani musicians go roughly -500 cents below tonic ¤  Carnatic musicians generally don’t go that below tonic ¤  Melody estimation errors at low frequency ¤  Concept of Madhyam shruti Perdominent Melody Histogram 1 Normalized Salience 0.8 0.6 0.4 0.2 0 50 100 150 200 250 300 350 Frequency bins (1 bin = 10 cents), Ref: 55 Hz 2nd CompMusic Workshop, Istanbul, 2012 29
  • 30. 7/23/12 Conclusions and Future Work ¤  Drone sound in the background provides an important cue for the identification of tonic and can be utilized to automatically perform this task ¤  System should be fed with more information to differentiate between ‘Pa’ and ‘Ma’ tuning ¤  Future Work: ¤  Exploring melodic characteristics for tonic identification ¤  Deeper analysis of confidence measure concept ¤  Study influence of cultural background on human performance for this task 2nd CompMusic Workshop, Istanbul, 2012 30
  • 31. 7/23/12 REFERENCES 1.  B. C. Deva. The Music of India: A Scientific Study. Munshiram Manoharlal Publishers, Delhi, 1980. 2.  C. V. Raman. On some Indian stringed instruments. In Indian Association for the Cultivation of Science, volume 33, pages 29-33, 1921. 3.  X. Serra. A multicultural approach in music information research. In 12th Int. Soc. for Music Info. Retrieval Conf., Miami, USA, Oct. 2011. 4.  R. Sengupta, N. Dey, D. Nag, A. K. Datta, and A. Mukerjee, “Automatic Tonic ( SA ) Detection Algorithm in Indian Classical Vocal Music,” in National Symposium on Acoustics, 2005, pp. 1-5. 5.  T. V. Ranjani, H.G.; Arthi, S.; Sreenivas, “Carnatic music analysis: Shadja, swara identification and rAga verification in AlApana using stochastic models,” Applications of Signal Processing to Audio and Acoustics (WASPAA), IEEE Workshop, pp. 29-32, 2011. 6.  J. Salamon and E. G´omez. Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6):1759–1770, Aug. 2012. 7.  J. Salamon, E. G´omez, and J. Bonada. Sinusoid extraction and salience function design for predominant melody estimation. In Proc. 14th Int. Conf. on Digital Audio Effects (DAFX-11), pages 73–80, Paris, France, Sep. 2011. 2nd CompMusic Workshop, Istanbul, 2012 31
  • 32. 7/23/12 Thank you Questions? 2nd CompMusic Workshop, Istanbul, 2012 32