MFCC and CZT Comparison for Speech Recognition

This document summarizes a seminar presentation on comparing the acoustic features of MFCC and CZT-based cepstrum for speech recognition. MFCC is commonly used in speech recognition but the presentation evaluates CZT as an alternative. An experiment was conducted with 6 speakers, testing recognition rates using MFCC, CZT, and a combination. The results showed that combining MFCC and CZT-based features achieved the highest recognition rate of 86.25%, demonstrating CZT's ability to enhance frequency resolution of transient speech signals.

SEMINAR ON

Acoustic Feature Comparison of MFCC and
CZT-based Cepstrum for Speech Recognition

Guided by: Prof. R.V.Pawar
Presented by: Neehal B. Jiwane

Introduction

• Mel-Frequency Cepstral Coefficients (MFCC) are the most widely used features in the speech recognition field.

• Automatic speech recognition (ASR) systems.

• Feature extraction.

• MFCC parameters outperform other features in recognition accuracy.

MFCC

Fig. 1. MFCC Block Diagram
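
The MFCC features named above are usually computed by framing the signal, windowing, taking the power spectrum, applying a mel-spaced triangular filterbank, taking the logarithm, and applying a DCT. The following is a minimal sketch of that common pipeline, not the presenter's implementation. The band limits fl = 300 and fh = 3000 come from the experiment conditions; the 8 kHz sampling rate, the 256-sample frame, the 20 filters, the 12 coefficients, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs, fl, fh):
    """Triangular filters spaced evenly on the mel scale between fl and fh (Hz)."""
    mel_points = np.linspace(hz_to_mel(fl), hz_to_mel(fh), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak of 1 at center
            fbank[i, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, fs, frame_len=256, hop=128, n_filters=20, n_ceps=12,
         fl=300.0, fh=3000.0):
    """Return an array of shape (n_frames, n_ceps). All defaults are assumptions."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2
    energies = power @ mel_filterbank(n_filters, frame_len, fs, fl, fh).T
    log_energies = np.log(energies + 1e-10)   # small offset avoids log(0)
    # Unnormalized DCT-II of the log filterbank energies
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return log_energies @ basis.T
```

Calling `mfcc(x, 8000)` on a one-dimensional sample array `x` returns one row of coefficients per frame; these rows are the feature vectors that a matcher such as DTW would compare.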

Chirp Z-Transform

Fig. 2. Operation in CZT
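
The chirp z-transform evaluates the z-transform of a frame at M points z_k = A·W^(-k). With A and W on the unit circle, those points can cover a narrow band on a much finer frequency grid than an FFT of the same length. The sketch below evaluates the defining sum directly, at O(N·M) cost, rather than through the fast convolution form. The function names, the 8 kHz rate, and the example band are assumptions for illustration.

```python
import numpy as np

def czt(x, m, w, a=1.0 + 0.0j):
    """Directly evaluate X[k] = sum_n x[n] * a**(-n) * w**(n*k) for k = 0..m-1."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x))
    k = np.arange(m)
    return (x * a ** (-n)) @ w ** np.outer(n, k)

def zoom_spectrum(x, fs, f1, f2, m):
    """Spectrum at m equally spaced frequencies f1 + k * (f2 - f1) / m (Hz)."""
    a = np.exp(2j * np.pi * f1 / fs)                  # starting point on the unit circle
    w = np.exp(-2j * np.pi * (f2 - f1) / (m * fs))    # angular step between points
    return czt(x, m, w, a)

# Usage: a 1 Hz grid between 900 and 1100 Hz for a 256-sample frame
# at an assumed 8 kHz sampling rate.
fs = 8000.0
frame = np.sin(2 * np.pi * 1010.0 * np.arange(256) / fs)
band = np.abs(zoom_spectrum(frame, fs, 900.0, 1100.0, 200))
peak_hz = 900.0 + np.argmax(band)   # grid spacing is (1100 - 900) / 200 = 1 Hz
```

With `m` equal to the frame length, `w = exp(-2j*pi/m)`, and `a = 1`, the same function reduces to the ordinary DFT, which makes a convenient sanity check.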

Dynamic Time Warping

The DTW algorithm is based on dynamic programming techniques.

Fig. 3. A warping between two time series
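
The dynamic programming recurrence behind DTW can be sketched as follows: each cell of the cost table holds the local distance between two frames plus the cheapest of the three neighbouring cells already filled in. The Euclidean local distance, the symmetric step pattern, and the helper `recognize` are assumptions for illustration; the slides do not specify these details.

```python
import numpy as np

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of frames (or scalars)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return cost[n, m]

def recognize(test, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

In a template-matching recognizer, `templates` would map each word label to a reference feature sequence, and `test` would be the feature sequence of the utterance to classify.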

Experiment condition

Process                      Description
1) Speaker                   3 female, 3 male
2) Tools                     Cool Edit Pro 2.0
3) Environment               Laboratory
4) Sampling Frequency, fs    300-3000 Hz
5) Utterance                 Noisy area

RECOGNITION

Testing Set   Testing Number   Correct Number   Percentage (%)
0             8                6                75
1             8                7                87.5
2             8                8                100
3             8                3                37.5
4             8                4                50
5             8                8                100
6             8                6                75
7             8                8                100
8             8                8                100
9             8                8                100
Conditions: fl = 300, fh = 3000, M = 256
Table 1. Recognition Rate of the MFCC

Testing Set   Testing Number   Correct Number   Percentage (%)
0             8                6                75
1             8                7                87.5
2             8                8                100
3             8                4                50
4             8                5                62.5
5             8                8                100
6             8                7                87.5
7             8                8                100
8             8                8                100
9             8                8                100
Conditions: fl = 300, fh = 3000, M = 256
Table 2. Recognition Rate of the MFCC+CZT-based

Testing Set   Testing Number   Correct Number   Percentage (%)
0             8                6                75
1             8                7                87.5
2             8                8                100
3             8                6                50
4             8                6                62.5
5             8                8                100
6             8                7                87.5
7             8                8                100
8             8                8                100
9             8                8                100
Conditions: fl = 300, fh = 3000, M = 512
Table 3. Recognition Rate of the MFCC+CZT-based

Cepstral Coefficients   MFCC     MFCC&CZT-based
Testing Number          80       80
Correct Number          66       69
Percentage / %          79.825   86.25
Conditions: fl = 300, fh = 3000, M = 256
Table 4. Different Cepstral Coefficients
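
The overall rates in Table 4 can be cross-checked from the per-set counts in Tables 1 and 2. The short sketch below sums the correct counts and divides by the 80 test utterances; the variable and function names are assumptions. The counts give 69/80 = 86.25 % for the combined features, which matches Table 4, and 66/80 = 82.5 % for MFCC alone, which differs from the 79.825 printed in Table 4, so that printed value may be a transcription slip.

```python
# Correct counts per testing set (sets 0-9), copied from Tables 1 and 2.
mfcc_correct = [6, 7, 8, 3, 4, 8, 6, 8, 8, 8]       # Table 1, MFCC
combined_correct = [6, 7, 8, 4, 5, 8, 7, 8, 8, 8]   # Table 2, MFCC+CZT-based
TESTS_PER_SET = 8

def recognition_rate(correct, tests_per_set=TESTS_PER_SET):
    """Return (total correct, total tests, percentage correct)."""
    total_tests = tests_per_set * len(correct)
    total_correct = sum(correct)
    return total_correct, total_tests, 100.0 * total_correct / total_tests

mfcc_summary = recognition_rate(mfcc_correct)
combined_summary = recognition_rate(combined_correct)
```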

Conclusion
• From the design and implementation of the experiment, we conclude that a new approach, the CZT-based algorithm, was developed to extract features from speech signals that are highly transient in nature.

• Combining the CZT-based method with MFCC has demonstrated its superiority over the previously reported MFCC method: the frequency resolution of highly transient speech signals is much enhanced, and accuracy is better. Widespread integration of speech recognition technology into end-user applications lies ahead.
