IJSRD - International Journal for Scientific Research & Development| Vol. 1, Issue 9, 2013 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 1934
Abstract—This paper presents an approach to speaker recognition that uses frequency-spectral information on the Mel scale to improve speech-feature representation in a Vector Quantization (VQ) codebook based recognition approach. The Mel-frequency front end extracts features of the speech signal to form the training and testing vectors. The VQ codebook approach uses the training vectors to form clusters and, with the help of the LBG algorithm, recognizes speakers accurately.
Key words: Speaker Recognition, MFCC, Mel Frequencies,
Vector Quantization.
I. INTRODUCTION
Speech is the most natural way of communicating. Unlike other forms of identification, such as passwords or keys, speech is a non-intrusive biometric. A person's voice cannot be stolen, forgotten, or lost; speaker recognition therefore allows for a secure method of authenticating speakers.
In a speaker recognition system, an unknown speaker is compared against a database of known speakers, and the best-matching speaker is given as the identification result. The system is based on the speaker-specific information contained in speech waves.
Earlier systems used traditional approaches; in this paper, we intend to increase the efficiency and accuracy of the system by making use of a new approach that fuses neural networks and clustering algorithms.
The general process of speaker identification involves two
stages:
1) Training Mode
2) Recognition Mode
In the training mode, a new speaker (with known identity) is
enrolled into the system’s database. In the recognition mode,
an unknown speaker gives a speech input and the system
makes a decision about the speaker’s identity.
II. BLOCK DIAGRAM
A real-time speaker identification system consists of two main phases. During the first phase, speaker
enrolment, speech samples are collected from the speakers,
and they are used to train their models. The collection of
enrolled models is also called a speaker database. In the
second phase, identification phase, a test sample from an
unknown speaker is compared against the speaker database.
Both phases include the same first step, feature extraction,
which is used to extract speaker dependent characteristics
from speech. The main purpose of this step is to reduce the
amount of test data while retaining speaker discriminative
information. Then in the enrolment phase, these features are
modelled and stored in the speaker database. This process is
represented in Fig. 1.
Fig. 1: Speaker Enrolment
In the identification step, the extracted features are
compared against the models stored in the speaker database.
Based on these comparisons the final decision about speaker
identity is made. This process is represented in Fig. 2.
Fig. 2: Speaker Authentication
III. PRE-PROCESSING
Speech is recorded by sampling the input which results in a
discrete-time speech signal. Pre-processing is a technique used to make the discrete-time speech signal more amenable to the processing that follows. The pre-processing techniques
are used to enhance feature extraction. They include pre-
emphasis, framing and windowing. Pre-emphasis is
extensively explained below, whereas frame blocking and windowing are explained in later sections.
A. Pre-emphasis
Pre-emphasis is a technique used in speech processing to enhance the high frequencies of the signal.

Speaker Recognition System using MFCC and Vector Quantization Approach
(IJSRD/Vol. 1/Issue 9/2013/0061)
Deepak Harjani¹, Mohita Jethwani², Ms. Mani Roja³
¹,²Student, ³Associate Professor
¹,³Dept. of Electronics and Telecommunication, ²Dept. of Computer Science
¹,²,³Thadomal Shahani Engineering College, Mumbai, India

There are two main factors driving the need for pre-emphasis. Firstly, the speech
signal generally contains more speaker specific information
in the higher frequencies than the lower frequencies.
Secondly, pre-emphasis removes some of the glottal effects
from the vocal tract parameters. For voiced sounds which
have a steep roll-off in the high frequency region, the glottal
source has an approximately –12dB/octave slope. However,
when the acoustic energy radiates from the lips, this causes a
roughly +6dB/octave boost to the spectrum. As a result, a
speech signal when recorded with a microphone from a
distance, has approximately –6dB /octave slope downward
compared to the true spectrum. Therefore, by applying pre-emphasis, the spectrum is flattened, consisting of formants of similar heights. The spectrum of unvoiced sound is already flat, so there is no reason to pre-emphasize it. This allows feature extraction to focus on all aspects of the speech signal.
Pre-emphasis is implemented as a first-order Finite Impulse
Response (FIR) filter defined as:
H(z) = 1 − αz⁻¹ (1.1)
Generally α is chosen to be between 0.9 and 0.95. We have
used α=0.95.
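The filter of Equation 1.1 is a one-line operation in practice. The paper's implementation was in MATLAB; the sketch below is an illustrative Python/NumPy equivalent (the function name pre_emphasis is ours):

```python
import numpy as np

def pre_emphasis(signal, alpha=0.95):
    """First-order FIR pre-emphasis filter, H(z) = 1 - alpha*z^-1."""
    signal = np.asarray(signal, dtype=float)
    # y[n] = x[n] - alpha*x[n-1]; the first sample passes through unchanged.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])
```

Subtracting 95% of the previous sample attenuates the slowly varying (low-frequency) content, which is exactly the spectral tilt the pre-emphasis step is meant to remove.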
B. Frame Blocking
To prevent the occurrence of aliasing effects, frame blocking is used to convert the continuous speech signal into frames of a desired sample length. In this step the continuous speech
signal is blocked into frames of N samples, with adjacent
frames being separated by M such that M < N. The first
frame consists of the first N samples. The second frame
begins M samples after the first frame, and overlaps it by N
- M samples. Similarly, the third frame begins 2M samples
after the first frame (or M samples after the second frame)
and overlaps it by N - 2M samples. This process continues
until all the speech is accounted for within one or more
frames. Typical values for N and M are N = 256 (which is equivalent to ~30 ms of windowing and facilitates a fast radix-2 FFT) and M = 100. [1]
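The overlapping frame scheme with N = 256 and M = 100 can be sketched as follows (an illustrative NumPy version, not the authors' MATLAB code; frame_blocking is a hypothetical helper name):

```python
import numpy as np

def frame_blocking(signal, N=256, M=100):
    """Split a signal into overlapping frames of N samples whose start
    points are M samples apart, so adjacent frames overlap by N - M."""
    signal = np.asarray(signal, dtype=float)
    # Number of full frames that fit: frame i covers samples [i*M, i*M + N).
    num_frames = 1 + max(0, (len(signal) - N) // M)
    return np.stack([signal[i * M : i * M + N] for i in range(num_frames)])
```

For a 1000-sample signal this yields 8 frames, the second starting at sample 100, the third at sample 200, exactly as the text describes.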
C. Windowing
The signal obtained after frame blocking has discontinuities at the beginning and end of each frame. To minimize these discontinuities, windowing is used: the spectral distortion is reduced by using a window to taper the signal to zero at the beginning and at the end of each frame. The windows available for this process include the rectangular, triangular, Hanning, and Hamming windows. If we define the
window as w(n), 0 ≤ n ≤ N–1, where N is the number of
samples in each frame, then the result of windowing is the
signal as in Equation 1.2
y1(n) = x1(n) · w(n), 0 ≤ n ≤ N − 1 (1.2)
Typically the Hamming window is used for the windowing
process, which has the form as in Equation 1.3:
w(n) = 0.54 − 0.46 cos(2πn / (N − 1)), 0 ≤ n ≤ N − 1 (1.3)
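Equations 1.2 and 1.3 translate directly to code. A small NumPy sketch (our own illustration) of building a 256-point Hamming window and applying it to one frame:

```python
import numpy as np

N = 256
n = np.arange(N)
# Hamming window, Equation 1.3: w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1))
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))

def apply_window(frame, window=hamming):
    """Equation 1.2: element-wise taper of one frame by the window."""
    return np.asarray(frame, dtype=float) * window
```

This is the same formula implemented by numpy.hamming(N): the taper falls to 0.08 at both frame edges, which suppresses the edge discontinuities before the FFT.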
IV. FEATURE EXTRACTION
The amount of data generated during speech production is quite large, while the essential characteristics of the speech process change relatively slowly and therefore require less data. Accordingly, feature extraction is a process of reducing data while retaining speaker-discriminative information. [2]
In order to create a speaker profile, the speech signal must
be analyzed to produce some representation that can be used
as a basis for such a model. In speech analysis this is known
as feature extraction. Feature extraction allows for speaker
specific characteristics to be derived from the speech signal,
which are used to create a speaker model. The speaker
model uses a distortion measure to determine features which
are similar. This places importance on the features extracted,
to accurately represent the speech signal. The feature extraction phase consists of transforming the speech signal into a set of feature vectors called parameters. [3]
The aim of this transformation is to obtain a new
representation which is more compact, less redundant, and
more suitable for statistical modeling and calculation of
distances. Most of the speech parameterizations used in speaker recognition systems rely on a cepstral representation of the speech signal. [4]
A wide range of possibilities exist for parametrically
representing the speech signal for the speaker recognition
task, such as Linear Prediction Coding (LPC), Mel-
Frequency Cepstrum Coefficients (MFCC), and others.
MFCC is perhaps the best known and most popular, and
these will be used in this project.
V. MEL-FREQUENCY CEPSTRAL COEFFICIENTS
MFCCs are based on the known variation of the human
ear’s critical bandwidths with frequency; filters spaced
linearly at low frequencies and logarithmically at high
frequencies have been used to capture the phonetically
important characteristics of speech. This is expressed in the
mel-frequency scale, which is linear frequency spacing
below 1000 Hz and a logarithmic spacing above 1000 Hz.
[1]
Fig. 3: Block diagram of MFCC processor
A block diagram of the structure of an MFCC processor is given in Fig. 3. The speech input is typically recorded at a sampling rate above 10000 Hz. This sampling frequency is chosen to minimize the effects of aliasing in the analog-to-digital conversion. Such sampled signals can capture all frequencies up to 5 kHz, which covers most of the energy of sounds generated by humans. As discussed previously, the main purpose of the MFCC processor is to mimic the behaviour of the human ear. In addition, MFCCs have been shown to be less susceptible to the variations mentioned above than the speech waveforms themselves.
A. Fast Fourier Transform
So far, the computations have been carried out in the time domain. Since MFCC computation requires the samples in the frequency domain, we use the Fast Fourier Transform (FFT) to convert each frame of N samples from the time domain into the frequency domain. The FFT is a fast algorithm for implementing the Discrete Fourier Transform (DFT), which is defined on the set of N samples {x(n)} as in Equation 2.1:
X(k) = Σ_{n=0}^{N−1} x(n) e^(−j2πkn/N), k = 0, 1, …, N − 1 (2.1)
The spectrum obtained after the fast Fourier transform is
used for mel-frequency warping.
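Using NumPy's FFT, the magnitude spectrum of a windowed frame, as consumed by the warping step, can be computed as below (an illustrative sketch; frame_spectrum is our name):

```python
import numpy as np

def frame_spectrum(frame):
    """Magnitude spectrum of one frame via the FFT (Equation 2.1).
    For a real N-sample frame, bins 0..N/2 cover 0..fs/2, so only
    the first N/2 + 1 values are kept."""
    frame = np.asarray(frame, dtype=float)
    N = len(frame)
    return np.abs(np.fft.fft(frame))[: N // 2 + 1]
```

Keeping only the first N/2 + 1 bins discards the mirror half of the spectrum that a real-valued signal always produces.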
B. Mel-Frequency Warping
MFCCs are typically computed by using a bank of
triangular-shaped filters, with the centre frequency of the
filter spaced linearly for frequencies less than 1000 Hz and
logarithmically above 1000 Hz. The bandwidth of each filter
is determined by the centre frequencies of the two adjacent
filters and is dependent on the frequency range of the filter
bank and number of filters chosen for design. But for the
human auditory system it is estimated that the filters have a
bandwidth that is related to the centre frequency of the filter.
Further it has been shown that there is no evidence of two
regions (linear and logarithmic) in the experimentally
determined Mel frequency scale.
Recent studies on the effectiveness of different frequency regions of the speech spectrum for speaker recognition have shown that a frequency-scale warping method provides better performance than the standard Mel-scale filter bank. [4]
The frequency-scale warping is implemented by
using the bilinear transform method. A range of warping
functions can be achieved by using the bilinear transform
technique. It is extremely flexible and a wide range of
warping functions can be obtained by suitably fixing a
“warping factor” for the given sampling frequency. We use
this framework to carry out an experimental study toward
determining the optimal choice of frequency-scale warping
for the speaker recognition task.
Fig. 4: Frequency warping using a second-order bilinear transform
A nonlinear warping of the frequency scale can be effected by the bilinear transformation, given as the transfer function of a first-order all-pass filter:
D(z) = (z⁻¹ − α) / (1 − αz⁻¹) (2.2)
This mapping in the z-plane maps the unit circle onto itself. From Equation 2.2 we can obtain the warped frequency
ω̃ = arg(D(e^(−jω))) = ω + 2 tan⁻¹( (α sin ω) / (1 − α cos ω) ) (2.3)
More control over frequency warping can be achieved by
using a second order bilinear transform [4].
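The warping curve of the kind shown in Fig. 4 can be evaluated directly from the phase response of the first-order all-pass filter of Equation 2.2. A sketch (warp_frequency is our name) assuming the standard closed-form phase of that all-pass filter:

```python
import numpy as np

def warp_frequency(omega, alpha):
    """Warped frequency of the all-pass D(z) = (z^-1 - alpha)/(1 - alpha*z^-1):
    w' = w + 2*arctan(alpha*sin(w) / (1 - alpha*cos(w)))  (Equation 2.3).
    alpha = 0 gives the identity map; alpha > 0 stretches low frequencies."""
    omega = np.asarray(omega, dtype=float)
    return omega + 2 * np.arctan(alpha * np.sin(omega) / (1 - alpha * np.cos(omega)))
```

The endpoints ω = 0 and ω = π are fixed points of the map for any |α| < 1, which is why the warping "factor" α only redistributes resolution within the band rather than changing its extent.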
C. Cepstrum
After frequency warping, we convert the log Mel spectrum
from frequency back into time. The result obtained after the
Cepstrum operation is called the Mel Frequency Cepstrum
Coefficients (MFCC). Since the Mel spectrum coefficients are real numbers, so are their logarithms; we can therefore convert them back into the time domain using a simple Discrete Cosine Transform (DCT). If we denote the mel power spectrum coefficients resulting from the last step by S_k, k = 1, 2, …, K, the MFCCs can be calculated as in Equation 2.4:
ξ_n = Σ_{k=1}^{K} (log S_k) cos[ n (k − 1/2) π / K ], n = 1, 2, …, K (2.4)
The cepstral representation of the speech spectrum provides
a good representation of the local spectral properties of the
signal for the given frame analysis. [3]
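Equation 2.4 is a DCT of the log mel spectrum and can be written in a few lines of NumPy (mel_cepstrum is our own name; K mel filter outputs are assumed):

```python
import numpy as np

def mel_cepstrum(mel_spectrum, num_coeffs=13):
    """Equation 2.4: xi_n = sum_{k=1..K} log(S_k) * cos(n*(k - 1/2)*pi/K)."""
    S = np.asarray(mel_spectrum, dtype=float)
    K = len(S)
    n = np.arange(1, num_coeffs + 1)[:, None]  # cepstral index n = 1..num_coeffs
    k = np.arange(1, K + 1)[None, :]           # mel filter index k = 1..K
    return (np.log(S) * np.cos(n * (k - 0.5) * np.pi / K)).sum(axis=1)
```

A flat mel spectrum of all ones has zero log energy in every band, so every cepstral coefficient vanishes; only the shape of the log spectrum contributes to the MFCCs.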
VI. FEATURE MATCHING AND SPEAKER
RECOGNITION
Fig. 5: Conceptual diagram illustrating vector quantization
codebook formation
The state-of-the-art in feature matching techniques used in
speaker recognition includes Dynamic Time Warping
(DTW), Hidden Markov Modelling (HMM), and Vector
Quantization (VQ). In this project, the VQ approach will be
used, due to ease of implementation and high accuracy. VQ
is a process of mapping vectors from a large vector space to
a finite number of regions in that space. Each region is
called a cluster and can be represented by its centre called a
codeword. The collection of all codewords is called a codebook.
Figure 5 shows a conceptual diagram to illustrate this
recognition process. In the figure, only two speakers and
two dimensions of the acoustic space are shown.
The training vectors obtained are used to build a speaker-specific VQ codebook for the speaker database. The LBG
algorithm is used for clustering a set of L training vectors
into a set of M codebook vectors. This algorithm designs an
M-vector codebook in stages. [6] It starts first by designing
a 1-vector codebook, then uses a splitting technique on the
codeword to initialize the search for a 2-vector codebook,
and continues the splitting process until the desired M-
vector codebook is obtained. The algorithm is based on the nearest-neighbour search procedure, which assigns each training vector to the cluster associated with the closest codeword. The centroid of each cluster is then recomputed, and the total distortion (the sum of the distances of all training vectors to their nearest codewords) is used to decide whether the procedure has converged. The codebook obtained at convergence is used to decide the result of the speaker recognition. [7]
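The splitting-and-refinement loop described above can be sketched compactly (illustrative Python, not the authors' MATLAB implementation; M is assumed to be a power of two, and lbg_codebook is our name):

```python
import numpy as np

def lbg_codebook(vectors, M, eps=0.01, tol=1e-4):
    """LBG: grow a codebook from 1 to M codewords by splitting, refining
    each size with nearest-neighbour assignment and centroid updates."""
    X = np.asarray(vectors, dtype=float)
    codebook = X.mean(axis=0, keepdims=True)          # 1-vector codebook
    while len(codebook) < M:
        # Split every codeword into a +/- perturbed pair.
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        prev = np.inf
        while True:
            d = np.linalg.norm(X[:, None] - codebook[None], axis=2)
            nearest = d.argmin(axis=1)                 # closest codeword per vector
            dist = d.min(axis=1).sum()                 # total distortion
            for j in range(len(codebook)):             # recompute centroids
                if np.any(nearest == j):
                    codebook[j] = X[nearest == j].mean(axis=0)
            if prev - dist < tol * max(dist, 1e-12):   # distortion stopped falling
                break
            prev = dist
    return codebook
```

Each outer pass doubles the codebook by perturbing every codeword by ±1%, then iterates the nearest-neighbour / centroid steps until the total distortion stops decreasing, matching the stage-wise design of an M-vector codebook described in the text.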
VII. EXPERIMENTAL RESULTS
The database consists of 20 distinct speakers including both
male and female speakers. It also contains 50 sound files
used for training and testing the Speaker Recognition
module. New sound files are recorded in real time for testing
the Continuous Speech Recognition module in clean and
noisy environments for both multi-speaker and speaker-
independent modes. Recognition rate of the trained VQ
Codebook model is defined as follows:
RR = (Ncorrect / Ntotal) × 100% (3.1)
In the equation 3.1, RR is the recognition rate, Ncorrect is
the number of correct recognition of testing speech samples
per digit, and Ntotal is the total number of testing speech
samples per digit [8].
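Equation 3.1 in code form, reproducing the figures reported in Table 1 (recognition_rate is a helper name of ours):

```python
def recognition_rate(n_correct, n_total):
    """Equation 3.1: RR (%) = 100 * Ncorrect / Ntotal."""
    return 100.0 * n_correct / n_total
```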
Environment | Number of Samples Tested | Number of Samples Recognized Accurately | Recognition Rate (%)
Clean | 20 | 19 | 95
Noisy | 20 | 16 | 80
Table 1: Overall Recognition Rate of Proposed Speaker Recognition System
Feature Extraction Technique | Feature Recognition Technique | Recognition Rate (%) | Reference
LPC | VQ and HMM | 62% to 96% | [9]
MFCC | VQ | 70% to 85% | [10]
MFCC | VQ | 88.88% | [2]
MFCC | VQ | 57% to 100% | [11]
Table 2: Comparative Performance of various Speaker Recognition Researches
VIII. CONCLUSION
With performance measured on the basis of accuracy and the time taken to compute the feature recognition, it was observed that the Speaker Recognition System performs well in both clean and noisy environments, in both multi-speaker and speaker-independent modes. The entire research process was carried out using MATLAB R13 on an Intel i5 powered machine. The recognition results in the clean environment are noticeably higher than those in the noisy environment.
ACKNOWLEDGMENT
The authors would like to thank the college authorities for providing the infrastructure required to carry out the experimental and research work. The authors would also like to thank Ms. Mani Roja, Associate Professor in the Electronics and Telecommunication Department of Thadomal Shahani Engineering College, for reviewing the paper and guiding the authors with her valuable feedback.
REFERENCES
[1] Santosh Gaikwad “Feature Extraction Using Fusion
MFCC For Continuous Marathi Speech Recognition”.
[2] M. A. M. Abu Shariah, R. N. Ainon, R. Zainuddin, and
O. O. Khalifa, “Human Computer Interaction Using
Isolated-Words Speech Recognition Technology,”
IEEE Proceedings of The International Conference on
Intelligent and Advanced Systems (ICIAS’07), Kuala
Lumpur, Malaysia, pp. 1173 – 1178, 2007.
[3] Ibrahim Patel and Dr. Y. Srinivasa Rao, “Speech Recognition using Hidden Markov Model with MFCC-Subband technique”.
[4] S.K., Podder, “Segment-based Stochastic Modelings
for Speech Recognition”. PhD Thesis. Department of
Electrical and Electronic Engineering, Ehime
University, Matsuyama 790-77, Japan, 1997.
[5] Pradeep Kumar P and Preeti Rao, “A Study of Frequency-Scale Warping for Speaker Recognition”, Englewood Cliffs, N.J., 1993.
[6] Clarence Goh Kok Leon “Robust Computer Voice
Recognition Using Improved MFCC Algorithm”, 2009
International Conference on New Trends in
Information and Service Science.
[7] Ahmad A. M. Abushariah, “English Digits Speech Recognition System Based on Hidden Markov Models”, Conference on Computer and Communication Engineering (ICCCE 2010).
[8] S.K., Podder, “Segment-based Stochastic Modelings
for Speech Recognition”. PhD Thesis. Department of
Electrical and Electronic Engineering, Ehime
University, Matsuyama 790-77, Japan, 1997.
[9] S.M., Ahadi, H., Sheikhzadeh, R.L., Brennan, and
G.H., Freeman, “An Efficient Front-End for Automatic
Speech Recognition”. IEEE International Conference
on Electronics, Circuits and Systems (ICECS2003),
Sharjah, United Arab Emirates, 2003.
[10] M.R., Hasan, M., Jamil, and M.G., Saifur Rahman,
“Speaker Identification Using Mel Frequency Cepstral
Coefficients”. 3rd International Conference on
Electrical and Computer Engineering, Dhaka,
Bangladesh, pp. 565-568, 2004.
[11] M.Z. Bhotto and M.R. Amin, “Bangali Text Dependent Speaker Identification Using Mel-Frequency Cepstrum Coefficient and Vector Quantization”. 3rd International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, pp. 569-572, 2004.
Speaker and Speech Recognition for Secured Smart Home ApplicationsRoger Gomes
 
Classification of Language Speech Recognition System
Classification of Language Speech Recognition SystemClassification of Language Speech Recognition System
Classification of Language Speech Recognition Systemijtsrd
 

Ähnlich wie Speaker Recognition System using MFCC and Vector Quantization Approach (20)

Speaker recognition on matlab
Speaker recognition on matlabSpeaker recognition on matlab
Speaker recognition on matlab
 
Wavelet Based Noise Robust Features for Speaker Recognition
Wavelet Based Noise Robust Features for Speaker RecognitionWavelet Based Noise Robust Features for Speaker Recognition
Wavelet Based Noise Robust Features for Speaker Recognition
 
Analysis of Suitable Extraction Methods and Classifiers For Speaker Identific...
Analysis of Suitable Extraction Methods and Classifiers For Speaker Identific...Analysis of Suitable Extraction Methods and Classifiers For Speaker Identific...
Analysis of Suitable Extraction Methods and Classifiers For Speaker Identific...
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
 
Speaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract FeaturesSpeaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract Features
 
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
 
Speech Analysis and synthesis using Vocoder
Speech Analysis and synthesis using VocoderSpeech Analysis and synthesis using Vocoder
Speech Analysis and synthesis using Vocoder
 
Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...
 
AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND T...
AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND T...AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND T...
AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND T...
 
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
 
Speaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVMSpeaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVM
 
Ijetcas14 426
Ijetcas14 426Ijetcas14 426
Ijetcas14 426
 
Course report-islam-taharimul (1)
Course report-islam-taharimul (1)Course report-islam-taharimul (1)
Course report-islam-taharimul (1)
 
H42045359
H42045359H42045359
H42045359
 
Speaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home ApplicationsSpeaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home Applications
 
D04812125
D04812125D04812125
D04812125
 
Classification of Language Speech Recognition System
Classification of Language Speech Recognition SystemClassification of Language Speech Recognition System
Classification of Language Speech Recognition System
 

Mehr von ijsrd.com

IoT Enabled Smart Grid
IoT Enabled Smart GridIoT Enabled Smart Grid
IoT Enabled Smart Gridijsrd.com
 
A Survey Report on : Security & Challenges in Internet of Things
A Survey Report on : Security & Challenges in Internet of ThingsA Survey Report on : Security & Challenges in Internet of Things
A Survey Report on : Security & Challenges in Internet of Thingsijsrd.com
 
IoT for Everyday Life
IoT for Everyday LifeIoT for Everyday Life
IoT for Everyday Lifeijsrd.com
 
Study on Issues in Managing and Protecting Data of IOT
Study on Issues in Managing and Protecting Data of IOTStudy on Issues in Managing and Protecting Data of IOT
Study on Issues in Managing and Protecting Data of IOTijsrd.com
 
Interactive Technologies for Improving Quality of Education to Build Collabor...
Interactive Technologies for Improving Quality of Education to Build Collabor...Interactive Technologies for Improving Quality of Education to Build Collabor...
Interactive Technologies for Improving Quality of Education to Build Collabor...ijsrd.com
 
Internet of Things - Paradigm Shift of Future Internet Application for Specia...
Internet of Things - Paradigm Shift of Future Internet Application for Specia...Internet of Things - Paradigm Shift of Future Internet Application for Specia...
Internet of Things - Paradigm Shift of Future Internet Application for Specia...ijsrd.com
 
A Study of the Adverse Effects of IoT on Student's Life
A Study of the Adverse Effects of IoT on Student's LifeA Study of the Adverse Effects of IoT on Student's Life
A Study of the Adverse Effects of IoT on Student's Lifeijsrd.com
 
Pedagogy for Effective use of ICT in English Language Learning
Pedagogy for Effective use of ICT in English Language LearningPedagogy for Effective use of ICT in English Language Learning
Pedagogy for Effective use of ICT in English Language Learningijsrd.com
 
Virtual Eye - Smart Traffic Navigation System
Virtual Eye - Smart Traffic Navigation SystemVirtual Eye - Smart Traffic Navigation System
Virtual Eye - Smart Traffic Navigation Systemijsrd.com
 
Ontological Model of Educational Programs in Computer Science (Bachelor and M...
Ontological Model of Educational Programs in Computer Science (Bachelor and M...Ontological Model of Educational Programs in Computer Science (Bachelor and M...
Ontological Model of Educational Programs in Computer Science (Bachelor and M...ijsrd.com
 
Understanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart RefrigeratorUnderstanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart Refrigeratorijsrd.com
 
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...ijsrd.com
 
A Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processingA Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processingijsrd.com
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logsijsrd.com
 
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMAPPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMijsrd.com
 
Making model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point TrackingMaking model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point Trackingijsrd.com
 
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...ijsrd.com
 
Study and Review on Various Current Comparators
Study and Review on Various Current ComparatorsStudy and Review on Various Current Comparators
Study and Review on Various Current Comparatorsijsrd.com
 
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...ijsrd.com
 
Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.ijsrd.com
 

Mehr von ijsrd.com (20)

IoT Enabled Smart Grid
IoT Enabled Smart GridIoT Enabled Smart Grid
IoT Enabled Smart Grid
 
A Survey Report on : Security & Challenges in Internet of Things
A Survey Report on : Security & Challenges in Internet of ThingsA Survey Report on : Security & Challenges in Internet of Things
A Survey Report on : Security & Challenges in Internet of Things
 
IoT for Everyday Life
IoT for Everyday LifeIoT for Everyday Life
IoT for Everyday Life
 
Study on Issues in Managing and Protecting Data of IOT
Study on Issues in Managing and Protecting Data of IOTStudy on Issues in Managing and Protecting Data of IOT
Study on Issues in Managing and Protecting Data of IOT
 
Interactive Technologies for Improving Quality of Education to Build Collabor...
Interactive Technologies for Improving Quality of Education to Build Collabor...Interactive Technologies for Improving Quality of Education to Build Collabor...
Interactive Technologies for Improving Quality of Education to Build Collabor...
 
Internet of Things - Paradigm Shift of Future Internet Application for Specia...
Internet of Things - Paradigm Shift of Future Internet Application for Specia...Internet of Things - Paradigm Shift of Future Internet Application for Specia...
Internet of Things - Paradigm Shift of Future Internet Application for Specia...
 
A Study of the Adverse Effects of IoT on Student's Life
A Study of the Adverse Effects of IoT on Student's LifeA Study of the Adverse Effects of IoT on Student's Life
A Study of the Adverse Effects of IoT on Student's Life
 
Pedagogy for Effective use of ICT in English Language Learning
Pedagogy for Effective use of ICT in English Language LearningPedagogy for Effective use of ICT in English Language Learning
Pedagogy for Effective use of ICT in English Language Learning
 
Virtual Eye - Smart Traffic Navigation System
Virtual Eye - Smart Traffic Navigation SystemVirtual Eye - Smart Traffic Navigation System
Virtual Eye - Smart Traffic Navigation System
 
Ontological Model of Educational Programs in Computer Science (Bachelor and M...
Ontological Model of Educational Programs in Computer Science (Bachelor and M...Ontological Model of Educational Programs in Computer Science (Bachelor and M...
Ontological Model of Educational Programs in Computer Science (Bachelor and M...
 
Understanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart RefrigeratorUnderstanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart Refrigerator
 
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
 
A Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processingA Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processing
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
 
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMAPPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
 
Making model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point TrackingMaking model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point Tracking
 
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
 
Study and Review on Various Current Comparators
Study and Review on Various Current ComparatorsStudy and Review on Various Current Comparators
Study and Review on Various Current Comparators
 
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
 
Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.
 

Kürzlich hochgeladen

PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLManishPatel169454
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICSUNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICSrknatarajan
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spaintimesproduction05
 

Kürzlich hochgeladen (20)

PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
NFPA 5000 2024 standard .
NFPA 5000 2024 standard                                  .NFPA 5000 2024 standard                                  .
NFPA 5000 2024 standard .
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICSUNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spain
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 

Speaker Recognition System using MFCC and Vector Quantization Approach

consists of two main phases.
Authors: Deepak Harjani (Student, Dept. of Electronics and Telecommunication), Mohita Jethwani (Student, Dept. of Computer Science), and Ms. Mani Roja (Associate Professor, Dept. of Electronics and Telecommunication), Thadomal Shahani Engineering College, Mumbai, India.

During the first phase, speaker enrolment, speech samples are collected from the speakers and used to train their models. The collection of enrolled models is also called the speaker database. In the second phase, the identification phase, a test sample from an unknown speaker is compared against the speaker database.

Both phases begin with the same first step, feature extraction, which extracts speaker-dependent characteristics from speech. The main purpose of this step is to reduce the amount of test data while retaining speaker-discriminative information. In the enrolment phase, these features are then modelled and stored in the speaker database. This process is represented in Fig. 1.

Fig. 1: Speaker Enrolment

In the identification step, the extracted features are compared against the models stored in the speaker database. Based on these comparisons, the final decision about the speaker's identity is made. This process is represented in Fig. 2.

Fig. 2: Speaker Authentication

III. PRE-PROCESSING

Speech is recorded by sampling the input, which results in a discrete-time speech signal. Pre-processing is a set of techniques used to make the discrete-time speech signal more amenable to the processing that follows and to enhance feature extraction. The pre-processing techniques include pre-emphasis, framing and windowing. Pre-emphasis is explained in detail below, whereas frame blocking and windowing are explained in later sections.

A. Pre-emphasis

Pre-emphasis is a technique used in speech processing to enhance the high frequencies of the signal. There are two main
factors driving the need for pre-emphasis. Firstly, the speech signal generally contains more speaker-specific information in the higher frequencies than in the lower frequencies. Secondly, pre-emphasis removes some of the glottal effects from the vocal tract parameters. For voiced sounds, which have a steep roll-off in the high-frequency region, the glottal source has an approximately −12 dB/octave slope. However, when the acoustic energy radiates from the lips, this causes a roughly +6 dB/octave boost to the spectrum. As a result, a speech signal recorded with a microphone from a distance has an approximately −6 dB/octave downward slope compared to the true spectrum. By applying pre-emphasis, the spectrum is therefore flattened, consisting of formants of similar heights. The spectrum of unvoiced sounds is already flat, so there is no reason to pre-emphasize them. This allows feature extraction to focus on all aspects of the speech signal.

Pre-emphasis is implemented as a first-order Finite Impulse Response (FIR) filter defined as:

H(z) = 1 − αz⁻¹ (1.1)

Generally, α is chosen to be between 0.9 and 0.95. We have used α = 0.95.

B. Frame Blocking

Frame blocking converts the continuous speech signal into overlapping frames of the desired sample length, over which the signal can be treated as quasi-stationary. In this step the continuous speech signal is blocked into frames of N samples, with adjacent frames separated by M samples, such that M < N. The first frame consists of the first N samples. The second frame begins M samples after the first frame and overlaps it by N − M samples. Similarly, the third frame begins 2M samples after the first frame (or M samples after the second frame) and overlaps it by N − 2M samples. This process continues until all the speech is accounted for within one or more frames.
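The pre-emphasis filter of Equation 1.1 and the overlapping frame-blocking scheme described above can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; N = 256 and M = 100 are used as defaults, following the typical values quoted in the text.

```python
def pre_emphasis(x, alpha=0.95):
    # H(z) = 1 - alpha * z^-1  =>  y[n] = x[n] - alpha * x[n-1].
    # The first sample has no predecessor and is passed through unchanged.
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def frame_blocking(x, N=256, M=100):
    # Frames of N samples, each starting M samples after the previous one,
    # so adjacent frames overlap by N - M samples.
    return [x[i:i + N] for i in range(0, len(x) - N + 1, M)]
```

Note how a slowly varying (low-frequency) signal is strongly attenuated by the filter, while rapid sample-to-sample changes pass through almost unchanged, which is exactly the high-frequency boost the text describes.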
Typical values are N = 256 (equivalent to roughly 30 ms of speech and a convenient size for the fast radix-2 FFT) and M = 100. [1]

C. Windowing

The signal obtained after frame blocking has discontinuities at the beginning and end of each frame. To minimize these discontinuities, windowing is used: the spectral distortion is reduced by using a window to taper the signal to zero at the beginning and at the end of each frame. The windows available for this process include the rectangular window, the triangular window, the Hanning window, the Hamming window, and others. If we define the window as w(n), 0 ≤ n ≤ N − 1, where N is the number of samples in each frame, then the result of windowing is the signal given in Equation 1.2:

y1(n) = x1(n) · w(n) (1.2)

Typically the Hamming window is used for the windowing process, which has the form given in Equation 1.3:

w(n) = 0.54 − 0.46 cos(2πn / (N − 1)), 0 ≤ n ≤ N − 1 (1.3)

IV. FEATURE EXTRACTION

The amount of data generated during speech production is quite large, while the essential characteristics of the speech process change relatively slowly and therefore require less data. Accordingly, feature extraction is a process of reducing data while retaining speaker-discriminative information. [2]

In order to create a speaker profile, the speech signal must be analyzed to produce a representation that can serve as the basis for such a model. In speech analysis this is known as feature extraction. Feature extraction allows speaker-specific characteristics to be derived from the speech signal, which are used to create a speaker model. The speaker model uses a distortion measure to determine which features are similar. This places importance on the extracted features accurately representing the speech signal. The feature extraction phase consists of transforming the speech signal into a set of feature vectors called parameters.
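The Hamming window of Equation 1.3 and the sample-wise multiplication of Equation 1.2 can be sketched as below; again, this is a hedged illustration rather than the paper's own code.

```python
import math

def hamming(N):
    # w(n) = 0.54 - 0.46 * cos(2*pi*n / (N - 1)), 0 <= n <= N - 1   (Eq. 1.3)
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def window_frame(frame):
    # y1(n) = x1(n) * w(n), applied sample by sample   (Eq. 1.2)
    w = hamming(len(frame))
    return [s * wn for s, wn in zip(frame, w)]
```

A Hamming window tapers the frame endpoints to 0.08 rather than exactly zero; that residual pedestal is what distinguishes it from the Hanning window, which reaches zero at both ends.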
[3] The aim of this transformation is to obtain a new representation that is more compact, less redundant, and more suitable for statistical modeling and the calculation of distances. Most speech parameterizations used in speaker recognition systems rely on a cepstral representation of the speech signal. [4] A wide range of possibilities exists for parametrically representing the speech signal for the speaker recognition task, such as Linear Prediction Coding (LPC) and Mel-Frequency Cepstral Coefficients (MFCC), among others. MFCC is perhaps the best known and most popular, and it is used in this project.
V. MEL-FREQUENCY CEPSTRAL COEFFICIENTS
MFCCs are based on the known variation of the human ear's critical bandwidths with frequency: filters spaced linearly at low frequencies and logarithmically at high frequencies have been used to capture the phonetically important characteristics of speech. This is expressed in the mel-frequency scale, which has linear frequency spacing below 1000 Hz and logarithmic spacing above 1000 Hz. [1]
Fig. 3: Block diagram of MFCC processor
A block diagram of the structure of an MFCC processor is given in Figure 3. The speech input is typically recorded at a sampling rate above 10000 Hz. This sampling frequency is chosen to minimize the effects of aliasing in the analog-to-digital conversion. The sampled signals can capture all frequencies up to 5 kHz, which cover most of the energy of sounds generated by humans. As discussed previously, the main purpose of the MFCC processor is to mimic the behaviour of the human ear. In addition, MFCCs have been shown to be less susceptible than the raw speech waveforms to the variations mentioned above.
A. Fast Fourier Transform
So far the computations have been carried out in the time domain. Since MFCC requires samples in the frequency domain, the Fast Fourier Transform is used to convert each frame of N samples from the time domain into the frequency domain. The FFT is a fast algorithm for computing the Discrete Fourier Transform (DFT), which is defined on the set of N samples {x_n} as in Equation 2.1:
X_k = Σ_{n=0}^{N-1} x_n e^(-j2πkn/N),  k = 0, 1, ..., N-1  (2.1)
The spectrum obtained after the fast Fourier transform is used for mel-frequency warping.
B. Mel-Frequency Warping
MFCCs are typically computed using a bank of triangular filters, with the centre frequencies of the filters spaced linearly below 1000 Hz and logarithmically above 1000 Hz. The bandwidth of each filter is determined by the centre frequencies of the two adjacent filters, and depends on the frequency range of the filter bank and the number of filters chosen in the design. For the human auditory system, however, it is estimated that the filters have a bandwidth related to the centre frequency of the filter. Further, it has been shown that there is no evidence of two distinct regions (linear and logarithmic) in the experimentally determined mel frequency scale. Recent studies on the effectiveness of different frequency regions of the speech spectrum for speaker recognition found that a frequency-scale warping method provided better performance than the standard mel-scale filter bank. [4] The frequency-scale warping is implemented using the bilinear transform method. This technique is extremely flexible: a wide range of warping functions can be obtained by suitably fixing a "warping factor" for the given sampling frequency.
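As a concrete illustration of the standard mel-scale filter bank this section contrasts with the bilinear warping, the sketch below builds triangular filters whose centres are equally spaced on the mel scale and applies them to an FFT magnitude spectrum (Eq. 2.1). The parameter choices (20 filters, 256-point FFT, 10 kHz sampling) are assumptions for illustration, not values fixed by the paper:

```python
import numpy as np

def hz_to_mel(f):
    """Mel scale: approximately linear below 1 kHz, logarithmic above."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters=20, nfft=256, fs=10000):
    """Triangular filters with centre frequencies equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(fs / 2), num_filters + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fbank = np.zeros((num_filters, nfft // 2 + 1))
    for i in range(1, num_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - centre)
    return fbank

# One frame's magnitude spectrum (Eq. 2.1) reduced to 20 mel-band energies.
spectrum = np.abs(np.fft.rfft(np.random.randn(256)))
mel_energies = mel_filterbank() @ spectrum
print(mel_energies.shape)  # (20,)
```

Each filter's bandwidth is set by the centres of its two neighbours, exactly as described above.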
We use this framework to carry out an experimental study toward determining the optimal choice of frequency-scale warping for the speaker recognition task.
Fig. 4: Frequency warping using second-order bilinear transform
A nonlinear warping of the frequency scale can be effected by the bilinear transformation, given as the transfer function of a first-order all-pass filter:
D(z) = (z^(-1) - α) / (1 - αz^(-1))  (2.2)
This mapping in the z-plane maps the unit circle onto itself. From (2.2) we can obtain the warped frequency:
ω̃ = arg(D(e^(-jω))) = ω + 2 tan^(-1)(α sin ω / (1 - α cos ω))  (2.3)
More control over the frequency warping can be achieved by using a second-order bilinear transform. [4]
C. Cepstrum
After frequency warping, the log mel spectrum is converted from the frequency domain back into the time domain. The result of this cepstrum operation is the set of Mel Frequency Cepstral Coefficients (MFCC). Since the mel spectrum coefficients are real numbers, so are their logarithms; they can therefore be converted back into the time domain using a simple Discrete Cosine Transform. If we denote the mel power spectrum coefficients resulting from the previous step by S_k, k = 1, 2, ..., K, we can calculate the MFCCs as in Equation 2.4:
ξ_n = Σ_{k=1}^{K} (log S_k) cos[n(k - 1/2)π/K],  n = 1, 2, ..., K  (2.4)
The cepstral representation of the speech spectrum provides a good representation of the local spectral properties of the signal for the given frame analysis. [3]
VI. FEATURE MATCHING AND SPEAKER RECOGNITION
Fig. 5: Conceptual diagram illustrating vector quantization codebook formation
The state of the art in feature matching techniques for speaker recognition includes Dynamic Time Warping (DTW), Hidden Markov Modelling (HMM), and Vector Quantization (VQ). In this project, the VQ approach is used, owing to its ease of implementation and high accuracy. VQ is a process of mapping vectors from a large vector space to a finite number of regions in that space. Each region is called a cluster and can be represented by its centre, called a codeword.
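The cosine transform of Equation 2.4 can be sketched directly; a minimal NumPy illustration, assuming a hypothetical vector of K = 20 mel power coefficients and keeping the first 13 cepstral coefficients (a common but here assumed truncation):

```python
import numpy as np

def mfcc_from_mel_energies(S, num_ceps=13):
    """Compute xi_n = sum_k log(S_k) * cos(n*(k - 1/2)*pi/K), Eq. 2.4."""
    K = len(S)
    k = np.arange(1, K + 1)
    return np.array([np.sum(np.log(S) * np.cos(n * (k - 0.5) * np.pi / K))
                     for n in range(1, num_ceps + 1)])

# Hypothetical mel power spectrum for one frame (K = 20 bands).
S = np.abs(np.random.randn(20)) + 1.0
coeffs = mfcc_from_mel_energies(S)
print(coeffs.shape)  # (13,)
```

A perfectly flat mel spectrum with S_k = 1 gives log S_k = 0 and hence all-zero cepstral coefficients, which matches the intuition that the cepstrum describes deviations of the log spectrum from flatness.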
The collection of all codewords is called a codebook.
Figure 5 shows a conceptual diagram illustrating this recognition process. In the figure, only two speakers and two dimensions of the acoustic space are shown. The training vectors obtained are used to build a speaker-specific VQ codebook for the speaker dictionary. The LBG algorithm is used to cluster a set of L training vectors into a set of M codebook vectors. The algorithm designs the M-vector codebook in stages. [6] It starts by designing a 1-vector codebook, then uses a splitting technique on the codewords to initialize the search for a 2-vector codebook, and continues the splitting process until the desired M-vector codebook is obtained. The algorithm is based on a nearest-neighbour search procedure that assigns each training vector to the cluster associated with the closest codeword. Each codeword is then updated to the centroid of its cluster, and the distortion, the sum of the distances of all training vectors to their closest codewords, is computed to decide whether the procedure has converged. The codebook obtained at convergence is used to decide the result of the speaker recognition. [7]
VII. EXPERIMENTAL RESULTS
The database consists of 20 distinct speakers, including both male and female speakers. It also contains 50 sound files used for training and testing the Speaker Recognition module. New sound files were recorded in real time for testing the Continuous Speech Recognition module in clean and noisy environments, in both multi-speaker and speaker-independent modes. The recognition rate of the trained VQ codebook model is defined as follows:
RR = (N_correct / N_total) × 100%  (3.1)
In Equation 3.1, RR is the recognition rate, N_correct is the number of correctly recognized testing speech samples per digit, and N_total is the total number of testing speech samples per digit. [8]
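The split-and-refine LBG procedure described above can be sketched as follows; a minimal NumPy illustration, with the splitting perturbation eps and convergence tolerance tol chosen here for illustration rather than taken from the paper:

```python
import numpy as np

def lbg(training, M=8, eps=0.01, tol=1e-6):
    """Grow a codebook by splitting: 1 -> 2 -> 4 -> ... -> M codewords."""
    codebook = training.mean(axis=0, keepdims=True)  # 1-vector codebook
    while len(codebook) < M:
        # Split each codeword into a perturbed pair.
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        prev_dist = np.inf
        while True:
            # Nearest-neighbour search: assign each training vector
            # to the cluster of its closest codeword.
            d = np.linalg.norm(training[:, None, :] - codebook[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # Update each codeword to the centroid of its cluster.
            for j in range(len(codebook)):
                if np.any(labels == j):
                    codebook[j] = training[labels == j].mean(axis=0)
            # Distortion: sum of distances of all vectors to their codewords.
            dist = d.min(axis=1).sum()
            if prev_dist - dist < tol * prev_dist:  # converged
                break
            prev_dist = dist
    return codebook

# e.g. 200 training vectors of 13 MFCCs each, clustered into an 8-vector codebook.
vectors = np.random.randn(200, 13)
cb = lbg(vectors, M=8)
print(cb.shape)  # (8, 13)
```

At recognition time, an unknown speaker's vectors would be scored against each enrolled speaker's codebook by total quantization distortion, and the lowest-distortion codebook identifies the speaker.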
Environment | Number of Samples Tested | Number of Samples Recognized Accurately | Recognition Rate (%)
Clean | 20 | 19 | 95
Noisy | 20 | 16 | 80
Table 1: Overall Recognition Rate of the Proposed Speaker Recognition System
Feature Extraction Technique | Feature Recognition Technique | Recognition Rate (%) | Reference
LPC | VQ and HMM | 62% to 96% | [9]
MFCC | VQ | 70% to 85% | [10]
MFCC | VQ | 88.88% | [2]
MFCC | VQ | 57% to 100% | [11]
Table 2: Comparative Performance of Various Speaker Recognition Researches
VIII. CONCLUSION
With performance measured in terms of accuracy and the time taken to compute the feature recognition, it was observed that the Speaker Recognition System performs well in both clean and noisy environments, in both multi-speaker and speaker-independent modes. The entire research process was carried out using MATLAB R13 on an Intel i5 powered machine. The recognition results in the clean environment are notably higher than those in the noisy environment.
ACKNOWLEDGMENT
The authors would like to thank the college authorities for providing the infrastructure required to carry out the experimental and research work. The authors would also like to thank Ms. Mani Roja, Associate Professor in the Electronics and Telecommunication Department of Thadomal Shahani Engineering College, for reviewing the paper and guiding the authors with her valuable feedback.
REFERENCES
[1] Santosh Gaikwad, "Feature Extraction Using Fusion MFCC for Continuous Marathi Speech Recognition".
[2] M. A. M. Abu Shariah, R. N. Ainon, R. Zainuddin, and O. O. Khalifa, "Human Computer Interaction Using Isolated-Words Speech Recognition Technology," IEEE Proceedings of the International Conference on Intelligent and Advanced Systems (ICIAS'07), Kuala Lumpur, Malaysia, pp. 1173-1178, 2007.
[3] Ibrahim Patel and Y. Srinivasa Rao, "Speech Recognition Using Hidden Markov Model with MFCC-Subband Technique".
[4] S. K. Podder, "Segment-based Stochastic Modelings for Speech Recognition", PhD Thesis,
Department of Electrical and Electronic Engineering, Ehime University, Matsuyama 790-77, Japan, 1997.
[5] Pradeep Kumar P and Preeti Rao, "A Study of Frequency-Scale Warping for Speaker Recognition".
[6] Clarence Goh Kok Leon, "Robust Computer Voice Recognition Using Improved MFCC Algorithm", 2009 International Conference on New Trends in Information and Service Science.
[7] Ahmad A. M. Abushariah, "English Digits Speech Recognition System Based on Hidden Markov Models", Conference on Computer and Communication Engineering (ICCCE 2010).
[8] S. K. Podder, "Segment-based Stochastic Modelings for Speech Recognition", PhD Thesis, Department of Electrical and Electronic Engineering, Ehime University, Matsuyama 790-77, Japan, 1997.
[9] S. M. Ahadi, H. Sheikhzadeh, R. L. Brennan, and G. H. Freeman, "An Efficient Front-End for Automatic Speech Recognition", IEEE International Conference on Electronics, Circuits and Systems (ICECS 2003), Sharjah, United Arab Emirates, 2003.
[10] M. R. Hasan, M. Jamil, and M. G. Saifur Rahman, "Speaker Identification Using Mel Frequency Cepstral Coefficients", 3rd International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, pp. 565-568, 2004.
[11] M. Z. Bhotto and M. R. Amin, "Bangali Text Dependent Speaker Identification Using Mel Frequency Cepstrum Coefficient and Vector Quantization", 3rd International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, pp. 569-572, 2004.