INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & MANAGEMENT INFORMATION SYSTEM (IJITMIS)

ISSN 0976 – 6405(Print), ISSN 0976 – 6413(Online)
Volume 4, Issue 3, September - December (2013), pp. 68-84
© IAEME: http://www.iaeme.com/IJITMIS.asp
Journal Impact Factor (2013): 5.2372 (Calculated by GISI)
www.jifactor.com
DESIGN A TEXT-PROMPT SPEAKER RECOGNITION SYSTEM USING LPC-DERIVED FEATURES

Dr. Mustafa Dhiaa Al-Hassani (1), Dr. Abdulkareem A. Kadhim (2)

(1) Computer Science/Mustansiriyah University, Baghdad, Iraq
(2) College of Information Technology/Al-Nahrain University, Baghdad, Iraq

ABSTRACT
Humans are integrated more closely with computers every day, and computers are taking over many services that used to be based on face-to-face contact between humans. This has prompted active development in the field of biometric systems. Biometric information is widely used for both person identification and security applications. This paper is concerned with the use of speaker features for protection against unauthorized access. A speaker recognition system based on LPC-derived features is presented and evaluated on 6304 speech samples. A vocabulary of 46 speech samples is built for 10 speakers, where each authorized person is asked to utter every sample 10 times. Two different modes are considered in identifying individuals according to their speech samples. In the closed-set speaker identification mode, all tested LPC-derived features are found to outperform the raw LPC coefficients, and identification rates of 84% to 97% are achieved. Applying the preprocessing steps (preemphasis, DC offset removal, frame blocking, overlapping, normalization and windowing) to the speech signals improves the representation of speech features, and an identification rate of up to 100% was obtained using weighted Linear Predictive Cepstral Coefficients (LPCC). In the open-set speaker verification mode of the proposed system model, the system randomly selects a pass phrase of 8 samples from its database for each trial in which a speaker is presented to the system. Up to 213 text-prompt trials from 23 different speakers (authorized and unauthorized) are recorded (i.e., 1704 samples) in order to study the system behavior and to determine the optimal threshold at which speakers are verified against the training references of authorized speakers constructed in the first mode; the best obtained speaker verification rate is greater than 99%.
Keywords: Biometric, LPC-derived features, LSF, Speaker Recognition, Speaker Identification, Speaker Verification, Text-prompt.

I. INTRODUCTION
As everyday life becomes more and more computerized, automated security systems are becoming more and more important. Today most personal banking tasks can be performed over the Internet, and soon they will also be performed on mobile devices such as cell phones and PDAs. The key task of an automated security system is to verify that users are in fact who they claim to be [1].
As the level of security breaches and transaction fraud increases, the need for highly secure identification and personal verification technologies is becoming apparent. Biometric-based solutions are able to provide confidential financial transactions and personal data privacy [2]. The need for biometrics can be found in federal, state and local governments, in the military, and in commercial applications [1, 3]. A biometric system is essentially a pattern recognition system that establishes the authenticity of a specific physiological or behavioral characteristic possessed by a user. Such systems are typically based on a single biometric feature of humans, but several hybrid systems also exist [2, 4, 5, 1, 6].
The human voice can serve as a key for any secured object, and it is not easy to lose or forget. This technique can be used to verify the identity claimed by people accessing systems; that is, it enables control of access to various services by voice [3, 7]. Speaker recognition has received the attention of researchers working in the field of signal processing for many years. The technology has developed to the point where it can be used in a number of applications, such as voice dialing, banking over a telephone network, person authentication, remote access to computers, command and control systems, network security and protection, entry and access control systems, data access/information retrieval, and monitoring [8, 5, 9, 10, 11].
II. AIM OF THE WORK
This work aims to build a speaker recognition (identification/verification) system that automatically authenticates a speaker's identity by his/her voice, according to a random text prompt generated by the system, and then grants only authorized persons a privilege or an access right to the facility that needs to be protected from the intrusion of unauthorized persons.
III. THE PROPOSED SPEAKER RECOGNITION SYSTEM MODEL
In this section, several linear prediction based methods (LPC, PARCOR, LAR, ASRC, LPCC, and LSF) are tested for a text-dependent speaker recognition system in a closed-set mode. The open-set speaker verification mode is also investigated, which involves verifying a speaker according to a random text-prompt sentence generated by the system. The block diagram of the proposed speaker recognition system model, shown in Fig. (1), illustrates that the input speech is passed through six preprocessing operations (preemphasis, DC offset removal, frame blocking, overlapping, normalization and windowing) prior to the feature extraction phase. If the match distance is lower than a certain threshold, then the identity claim is verified ("Accepted"); otherwise, the speaker is "Rejected" [1].


Figure (1): Block-Diagram of the proposed Speaker Recognition System Model
3.1. Speech Recording
Any speaker recognition system depends on recorded speech samples as input data. The speech signals used for training and testing are recorded in a quiet (but not soundproof) room via a high-quality built-in microphone and digitized by a Crystal-Intel(r) integrated audio sound card on a DELL Latitude C400 Notebook, with the following recording settings: .wav file format, 11 kHz sampling rate, 2 bytes/sample and a single channel [1].
3.2. Database Construction
In this work, database samples were recorded in two modes of operation:
• Closed-set speaker identification mode
• Open-set speaker verification mode
In order to evaluate the identification/verification performance of the proposed system model, each speaker is asked to utter the vocabulary data sets, shown in Table-1, for a maximum of 10 utterances/sample.
The number of repetitions R (1 ≤ R < 10) can be considered as the training set during an enrollment phase to train the speaker models of authorized persons, and the other (10 − R) repetitions are used for testing during a matching phase, classifying them against the training references in the database. As a result, the total database size of speaker samples for this mode is [1]:

Total DB Size = 10 × No. of Samples × No. of Speakers    (1)

No. of Training References = R × No. of Samples × No. of Speakers    (2)

No. of Test Samples = (10 − R) × No. of Samples × No. of Speakers    (3)
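As a quick check of Eqs. (1)-(3), a sketch in Python with the paper's figures (46 vocabulary samples, 10 speakers, 10 utterances each); the split R = 5 is an arbitrary example, since the paper leaves R as a parameter:

```python
# Figures from the paper: 46 vocabulary samples, 10 speakers,
# 10 utterances per sample; R = 5 is an arbitrary example split.
n_utterances, n_samples, n_speakers = 10, 46, 10
R = 5

total_db = n_utterances * n_samples * n_speakers       # Eq. (1) -> 4600
n_train = R * n_samples * n_speakers                   # Eq. (2)
n_test = (n_utterances - R) * n_samples * n_speakers   # Eq. (3)
```

This reproduces the 4600 closed-set samples reported later in Section 3.2.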


Table-1: The recorded speech samples

Data Sets        Speech Samples
1) Digits        0 ... 9
2) Characters    'A' ... 'Z'
3) Words         Accept, Reject, Open, Close, Help, Computer, Yes, No, Copy, Paste

For practical purposes, these data sets are interesting because the similarities between several samples (especially letters) give rise to challenging confusions in speech recognition.
In the closed-set speaker identification mode, up to 4600 samples were collected from different persons, whereas 1704 samples were recorded in the open-set speaker verification mode.
In the open-set speaker verification mode of the proposed system model, the system randomly selects a pass phrase of 8 samples from its database for each trial in which a speaker is presented to the system. Up to 213 text-prompt trials from different speakers (i.e., authorized and unauthorized) are recorded (i.e., 1704 samples) in order to study the system behavior. In fact, the generated text-prompt sentence, shown in Table-2, is composed of random numbers between 1 and 46, each corresponding to a sample in the vocabulary shown in Table-1. This is performed in order to study the system behavior and to determine the optimal threshold at which speakers are accepted or rejected when compared to the training references of authorized speakers constructed in the first mode [1].
Table-2: Examples of Randomly Text-Prompt Sentences generated by the System

Table-2 illustrates five examples of text-prompt sentences generated by the system, where column Si (i = 1, 2, ..., 8) stands for sample number i; the eight samples in each row compose one sentence [1].


3.3. Preprocessing
The basic idea behind speech preprocessing is to generate a signal with a fine structure as close as possible to that of the original speech signal, while reducing the amount of data and simplifying subsequent analysis [11]. The processing techniques adopted in this system model are applied in the following sequence:

• Preemphasis
Usually the digital speech signal, s[n], is preemphasized first. This is achieved by passing the signal through a high-pass filter. This process emphasizes the high frequencies relative to the low frequencies, compensating for the effect of band-limiting the input signal with a low-pass filter in the recording process. The most commonly used preemphasis filter is given by the following transfer function [12, 13, 10, 14]:

H(z) = 1 − α z^(−1)    (4)

where α typically lies in the range 0.9 ≤ α < 1.0 and controls the slope of the filter, which is simply implemented as a first-order differencer:

s̃[n] = s[n] − α s[n − 1]    (5)

For the proposed system model α is set to 0.95 [1].
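A minimal sketch of Eq. (5) in Python (numpy assumed; α = 0.95 as in the paper; the function name is mine):

```python
import numpy as np

def preemphasize(s, alpha=0.95):
    """Eq. (5): y[n] = s[n] - alpha * s[n-1], with y[0] = s[0]."""
    s = np.asarray(s, dtype=float)
    y = s.copy()
    y[1:] -= alpha * s[:-1]
    return y
```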

• The Removal of DC offset
DC offset occurs when hardware, such as a sound card, adds a DC component to a recorded audio signal. This produces a recorded waveform that is not centered on the baseline. Removing the DC offset is therefore the process of forcing the mean of the input signal to the baseline by adding a constant offset (i.e., subtracting the mean value) to the samples in the sound file. An illustrative example of removing DC offset from a waveform file is shown in Fig. (2) [1].

Figure (2): Removal of DC offset from a Waveform file (a) Exhibits DC offset,
(b) After the removal of DC offset
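This step amounts to subtracting the signal mean; a one-line sketch (numpy assumed):

```python
import numpy as np

def remove_dc(s):
    """Center the waveform on the baseline by subtracting its mean value."""
    s = np.asarray(s, dtype=float)
    return s - s.mean()
```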

• Frame-Blocking
This is the process of blocking or splitting the input speech samples into equal durations of N samples to carry out frame-wise analysis. The selection of the frame length is a crucial parameter for successful spectral analysis, due to the trade-off between time and frequency resolution. The window should be long enough for adequate frequency resolution but, on the other hand, short enough to capture the local spectral properties. Typically a frame length of 10 − 30 milliseconds is used. The signal for the i-th frame is given by [15, 14, 10, 12]:

x_i[n] = x[n + iM],  0 ≤ n ≤ N − 1    (6)

where M is the frame shift in samples. In this work, a frame length N = 256 samples with a duration of 23.2 milliseconds is used [1].

• Overlapping
Usually adjacent frames are overlapped. Each frame is shifted forward by a fixed amount, typically 30 − 50% of the frame length, along the signal. The purpose of the overlapping is to avoid losing information, since each speech sound of the input sequence is then approximately centered at some frame [1, 15, 13, 16].
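Frame blocking with overlap can be sketched as follows (function name is mine; N = 256 and 50% overlap match the values used later in the paper):

```python
import numpy as np

def frame_signal(s, frame_len=256, overlap=0.5):
    """Block the signal into frames of frame_len samples, each shifted by
    frame_len * (1 - overlap) samples (50% overlap by default)."""
    s = np.asarray(s)
    hop = int(frame_len * (1 - overlap))
    n_frames = 1 + max(0, (len(s) - frame_len) // hop)
    return np.stack([s[i * hop: i * hop + frame_len] for i in range(n_frames)])
```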
• Normalization
The frames of speech are normalized to make their power equal to unity. This step is very important since the extracted frames have different intensities due to speaker loudness, speaker distance from the microphone and recording level. The normalization is done by dividing each sample by the square root of the sum of squares of all the samples in the segment, as stated below:

S_norm[n] = S[n] / √( Σ_{k=0}^{N−1} S[k]² )    (7)

where S[n] is the speech sample, N is the number of samples in the segment (256 here), and the subscript norm refers to normalization [1].
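Eq. (7) as a sketch (numpy assumed; the guard against an all-zero frame is my addition):

```python
import numpy as np

def normalize_frame(frame):
    """Eq. (7): scale the frame so its energy (sum of squares) equals 1."""
    frame = np.asarray(frame, dtype=float)
    energy = np.sqrt(np.sum(frame ** 2))
    return frame / energy if energy > 0 else frame
```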

• Windowing
The purpose of windowing is to reduce the effect of spectral leakage (a type of distortion in spectral analysis) that results from the framing process. Windowing involves multiplying a speech signal x(n) by a finite-duration window w(n), which yields a set of speech samples weighted by the shape of the window, as stated by the following equation [1, 15, 13, 17, 12]:

x_w(n) = x(n) · w(n),  0 ≤ n ≤ N − 1    (8)

where N is the size of the window or frame.
There exist many different windowing functions; Table-3 lists the window functions used in our experiments, and their shapes are illustrated in Fig. (3) [1].
Table-3: Rectangular, Hamming and Kaiser Window Functions
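The three windows of Table-3 are available as NumPy built-ins; a sketch of Eq. (8) using them (the Kaiser shape parameter β = 8.0 is my assumption, since the paper does not state it):

```python
import numpy as np

N = 256
frame = np.random.randn(N)    # stand-in for a preprocessed speech frame

rect = np.ones(N)             # rectangular window
hamming = np.hamming(N)       # 0.54 - 0.46*cos(2*pi*n/(N-1))
kaiser = np.kaiser(N, 8.0)    # beta = 8.0 is an assumed shape parameter

windowed = frame * hamming    # Eq. (8): element-wise weighting
```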


Figure (3): Rectangular, Hamming and Kaiser Window Functions of 256 Samples Length [1]
3.4. Feature Extraction
Having acquired the testing or training utterances, it is now the role of the feature
extractor to extract the features from the speech samples. Feature extraction refers to the
process of reducing dimensionality by forming a new “smaller” set of features from the original
feature set of the patterns. This can be done by extracting some numerical measurements from
raw input patterns [8, 1, 15]. Several linear prediction based features are tested, which include
LPC, PARCOR, LAR, ASRC, LPCC, and LSF.
• Linear Predictive Coding (LPC)
Linear prediction (LP) forms an integral part of almost all modern day speech coding
algorithms. The fundamental idea is that a speech sample can be approximated as a linear
combination of past samples. Within a signal frame, the weights used to compute the linear
combination are found by minimizing the mean-squared prediction error; the resultant weights,
or linear prediction coefficients (LPCs), are used to represent the particular frame [18]. The
importance of this method lies in its ability to provide extremely accurate estimates of the
speech parameters, and in its relative speed of computation [20, 19].
The LPC model assumes that each sample s[n] at time n can be approximated by a linear combination of the p previous samples:

s[n] ≈ Σ_{k=1}^{p} a[k] s[n − k]    (9)

where s[n] is an approximation of the present output, s[n − k] are past outputs, p is the prediction order, and {a[k]}, k = 1...p, are the model parameters, called the predictor coefficients, that need to be determined so that the average prediction error (or residual) is as small as possible [10, 19].
The prediction error for the nth sample is given by the difference between the actual sample and its predicted value [1, 13, 20, 10]:

e[n] = s[n] − Σ_{k=1}^{p} a[k] s[n − k]    (10)

Equivalently,

s[n] = Σ_{k=1}^{p} a[k] s[n − k] + e[n]    (11)
When the prediction residual e[n] is small, predictor Eq. (9) approximates s[n] well.
The total squared prediction error is given by

E = Σ_n e[n]² = Σ_n ( s[n] − Σ_{k=1}^{p} a[k] s[n − k] )²    (12)

Minimization of the error is achieved by setting the partial derivatives of E with respect to the model parameters {a[k]} to zero:

∂E / ∂a[k] = 0,  k = 1, ..., p    (13)
By writing out Eq. (13) for k = 1 ... p, the problem of finding the optimal predictor coefficients reduces to solving the so-called Yule-Walker (AR) equations. Depending on the choice of the error minimization interval in Eq. (12), there are two methods for solving the AR equations: the covariance method and the autocorrelation method [13, 10, 19]. The two methods do not differ greatly, but the autocorrelation method is preferred since it is computationally more efficient and always guarantees a stable filter.
In matrix form, the set of linear equations is represented by Ra = v which can be
rewritten as [13, 1, 21]:

         R                           a          v
[ R(0)     R(1)     ...  R(p−1) ] [ a_1 ]   [ R(1) ]
[ R(1)     R(0)     ...  R(p−2) ] [ a_2 ] = [ R(2) ]
[  ...      ...     ...   ...   ] [ ... ]   [ ...  ]
[ R(p−1)   R(p−2)   ...  R(0)   ] [ a_p ]   [ R(p) ]
                                                        (14)

where R is a special type of matrix called a Toeplitz matrix (symmetric, with all elements along each diagonal equal; this facilitates the solution of the Yule-Walker equations for the LP coefficients {a_k} through computationally fast algorithms such as the Levinson-Durbin algorithm), a is the vector of the LPC coefficients and v is the autocorrelation vector. Both the matrix R and the vector v are completely defined by p autocorrelation samples. The autocorrelation sequence of s[n] is defined as [1, 21, 10, 19, 13]:

R[k] = (1/N) Σ_{n=0}^{N−1−k} s[n] s[n + k]    (15)

where N is the number of data points in the segment.
Due to the redundancy in the Yule-Walker (AR) equations, there exists an efficient
algorithm for finding the solution, known as Levinson-Durbin recursion [1, 10, 19, 20, 13].

E(0) = R(0)    (16)

k_i = ( R(i) − Σ_{j=1}^{i−1} a_j^(i−1) R(i − j) ) / E(i − 1),  1 ≤ i ≤ p

a_i^(i) = k_i    (17)

a_j^(i) = a_j^(i−1) − k_i a_{i−j}^(i−1),  1 ≤ j ≤ i − 1    (18)

E(i) = (1 − k_i²) E(i − 1)    (19)

where
k_i : the partial correlation (PARCOR) coefficients.
a_j^(i) : the jth predictor (LPC) coefficient after i iterations.
E(i) : the prediction error after i iterations.
The Levinson-Durbin procedure takes the autocorrelation sequence as its input and produces the coefficients a[k], k = 1 ... p. The time complexity of the procedure is O(p²), as opposed to the standard Gaussian elimination method, whose complexity is O(p³). Equations (16) − (19) are solved recursively for i = 1, 2, ..., p, where p is the order of the LPC analysis, and the final solution is given as [13, 1, 10, 20, 19]:

a_j = a_j^(p),  1 ≤ j ≤ p    (20)
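The recursion of Eqs. (15)-(20) can be sketched in Python as follows (a direct transcription for illustration, not the authors' code; function names are mine):

```python
import numpy as np

def autocorr(s, p):
    """Autocorrelation R[0..p] of a frame, Eq. (15)."""
    s = np.asarray(s, dtype=float)
    N = len(s)
    return np.array([np.dot(s[:N - k], s[k:]) / N for k in range(p + 1)])

def levinson_durbin(R, p):
    """Levinson-Durbin recursion, Eqs. (16)-(20): solves the Toeplitz
    system of Eq. (14) in O(p^2), returning the LPC coefficients a[1..p]
    and the PARCOR (reflection) coefficients k[1..p]."""
    a = np.zeros(p + 1)
    k = np.zeros(p + 1)
    E = R[0]                                             # Eq. (16)
    for i in range(1, p + 1):
        k[i] = (R[i] - np.dot(a[1:i], R[i - 1:0:-1])) / E  # Eq. (17)
        a_prev = a.copy()
        a[i] = k[i]
        for j in range(1, i):
            a[j] = a_prev[j] - k[i] * a_prev[i - j]      # Eq. (18)
        E *= 1.0 - k[i] ** 2                             # Eq. (19)
    return a[1:], k[1:]                                  # Eq. (20)
```

For an exactly first-order signal with normalized autocorrelation R = [1, 0.9, 0.81], the recursion recovers a = [0.9, 0], as expected.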

• Partial Correlation Coefficients (PARCOR)
Several alternative representations can be derived from the LPC coefficients when the autocorrelation method is used. The Levinson-Durbin algorithm produces the quantities {k_i}, i = 1, 2, ..., p (lying in the range −1 ≤ k_i ≤ 1), which are known as the reflection or PARCOR coefficients [13, 1].
• Log Area Ratio (LAR)
A new parameter set, which can be derived from the PARCOR coefficients, is obtained by taking the logarithm of the area ratio, yielding the log area ratios (LARs) {g_i} defined as [19, 20, 22, 10, 1, 13]:

g_i = log( (1 − k_i) / (1 + k_i) ),  1 ≤ i ≤ p    (21)
• Arcsin Reflection Coefficients (ASRC)
An alternative to the log area ratios are the arcsin reflection coefficients, computed simply by taking the inverse sine of the reflection coefficients [10, 1, 13]:

arcsin_i = sin⁻¹(k_i),  1 ≤ i ≤ p    (22)
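Eqs. (21)-(22) are one-line transformations of the PARCOR coefficients; a sketch with illustrative values (the sample k vector is mine):

```python
import numpy as np

# Illustrative PARCOR coefficients (must lie strictly within -1 < k_i < 1)
k = np.array([0.5, -0.3, 0.8])

lar = np.log((1 - k) / (1 + k))    # Eq. (21): log area ratios
asrc = np.arcsin(k)                # Eq. (22): arcsin reflection coefficients
```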

• Linear Predictive Cepstral Coefficients (LPCC)
An important fact is that the cepstrum can also be derived directly from the LPC parameter set. The relationship between the cepstral coefficients c_n and the prediction coefficients a_k is given by the following equations [1, 9, 13]:

c_1 = a_1

c_n = Σ_{k=1}^{n−1} (1 − k/n) · a_k · c_{n−k} + a_n,  1 < n ≤ p    (23)

where p is the prediction order. It is usually said that the cepstrum derived in this way represents a "smoothed" version of the spectrum. As in LPC analysis, increasing the number of coefficients yields more detail [10, 4].
Because the low-order cepstral coefficients are sensitive to the overall spectral slope and the high-order cepstral coefficients are sensitive to noise (and other forms of noise-like variability), it has become a standard technique to weight the cepstral coefficients by a tapered window so as to minimize these sensitivities and improve the performance of these coefficients [19, 14, 13, 1]. To achieve robustness for large values of n, a more general weighting of the following form is considered:
ĉ(n) = c(n) × w(n),  1 ≤ n ≤ p    (24)

where

w(n) = 1 + (p/2) sin(πn/p),  1 ≤ n ≤ p    (25)

This weighting function tapers the computation and de-emphasizes c_n around n = 1 and around n = p [19].
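A sketch of the cepstral recursion and tapered weighting of Eqs. (23)-(25) (function names are mine, not the authors' code):

```python
import numpy as np

def lpc_to_cepstrum(a):
    """Cepstral coefficients c[1..p] from LPC coefficients a[1..p], Eq. (23)."""
    p = len(a)
    c = np.zeros(p + 1)                     # c[0] unused; 1-based indexing
    for n in range(1, p + 1):
        c[n] = a[n - 1] + sum((1 - k / n) * a[k - 1] * c[n - k]
                              for k in range(1, n))
    return c[1:]

def weight_cepstrum(c):
    """Tapered cepstral weighting, Eqs. (24)-(25)."""
    p = len(c)
    n = np.arange(1, p + 1)
    return c * (1 + (p / 2) * np.sin(np.pi * n / p))
```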
• Line Spectral Frequencies (LSFs)
Another representation of the LP parameters of the all-pole spectrum is the set of line spectral frequencies (LSFs), or line spectrum pairs (LSPs) [23, 21]. Originally proposed for the compression of speech and other audio signals, this is the most widely used representation of LPC parameters for quantization and coding, but LSFs have also been applied with good results to speaker recognition [23, 24, 10, 1]. LSFs are the roots of the following polynomials:

P(z) = B(z) + z^−(p+1) B(z^−1)    (26)

Q(z) = B(z) − z^−(p+1) B(z^−1)    (27)

where B(z) = 1/H(z) = 1 − A(z) is the inverse LPC filter. The roots of P(z) and Q(z) are
interleaved and occur in complex-conjugate pairs so that only p/2 roots are retained for each of
P(z) and Q(z) (p roots in total). Also, the root magnitudes are known to be unity and, therefore,
only their angles (frequencies) are needed.
Each root of B(z) corresponds to one root in each of P(z) and Q(z). Therefore, if the frequencies of this pair of roots are close, the original root in B(z) likely represents a formant; otherwise, it represents a wide-bandwidth feature of the spectrum. These correspondences provide an intuitive interpretation of the LSP coefficients [13].
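One way to sketch the LSF computation of Eqs. (26)-(27) is via general polynomial root-finding; this uses numpy's root finder rather than the specialized algorithms used in speech codecs, and assumes an even prediction order p (function name is mine):

```python
import numpy as np

def lsf_from_lpc(a):
    """Line spectral frequencies (radians in (0, pi)) from LPC
    coefficients a[1..p], via the roots of Eqs. (26)-(27); p even."""
    b = np.concatenate(([1.0], -np.asarray(a, dtype=float)))  # B(z) = 1 - A(z)
    P = np.concatenate((b, [0.0])) + np.concatenate(([0.0], b[::-1]))
    Q = np.concatenate((b, [0.0])) - np.concatenate(([0.0], b[::-1]))

    def pos_angles(poly):
        # one root of each conjugate pair; drop the trivial roots at z = +/-1
        w = np.angle(np.roots(poly))
        return w[(w > 1e-6) & (w < np.pi - 1e-6)]

    return np.sort(np.concatenate((pos_angles(P), pos_angles(Q))))
```

For the trivial predictor (all a_k = 0, p = 2) the LSFs come out uniformly spaced at π/3 and 2π/3, as expected for a flat spectrum.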


3.5. Pattern Matching
The resulting test template, which is an N-dimensional feature vector, is compared
against the stored reference templates to find the closest match. The process is to find which
unknown class matches a predefined class or classes. For the speaker identification task, the
unknown speaker is compared to all references in the database. This comparison can be done
through Euclidean (E.D.) or city-block (C.D.) distance measures [1, 25, 26], as shown below:
E.D. = √( Σ_{i=1}^{N} (a_i − b_i)² )    (28)

C.D. = Σ_{i=1}^{N} |a_i − b_i|    (29)
where A and B are two vectors, such that A = [a1 a2 … aN] and B = [b1 b2 … bN].
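The two measures of Eqs. (28)-(29) in Python (assuming Eq. (28) includes the square root; for nearest-neighbor ranking the squared form would give the same ordering):

```python
import numpy as np

def euclidean(a, b):
    """Eq. (28): Euclidean distance between feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def city_block(a, b):
    """Eq. (29): city-block (Manhattan) distance between feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.abs(a - b)))
```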
3.6. Decision Rule
The decision rule selects the pattern that best matches the unknown one. The primary methods for the discrimination process either measure the difference between the two feature vectors or measure their similarity. In our approach the minimum distance classifier, which measures the difference between the two patterns, is used for speaker recognition. This classifier assigns the unknown pattern to the nearest predefined pattern: the bigger the distance between the two vectors, the greater the difference. For verification, on the other hand, the identity of the unknown speaker is verified by considering the best-matched reference in the database, where the distance must be lower than a certain threshold [1, 25, 26].
IV. EXPERIMENTAL RESULTS
Many experiments and test conditions were carried out to measure the performance of the proposed system with respect to different criteria: preemphasis, frame overlapping, LPC order, window type, cepstral weighting and text-prompt speaker verification.
The identification rate is defined as the ratio of correctly identified speakers to the total number of test samples, which corresponds to a nearest-neighbor decision rule.

Identification Rate = ( No. of Correctly Identified Speakers / Total No. of Samples Tested ) × 100%    (30)

4.1. Identification Rate for LP based Coefficients
A fair comparison can be made if all of the LP based coefficient methods (LPC, PARCOR, LAR, ASRC, LPCC and LSF) are measured under identical conditions (order of LP based coefficients P = 15, DC offset removal, no overlap between successive frames, normalization, rectangular window). The classification results are shown in Table-4 and its equivalent chart, Fig. (4).


Table-4: Identification Rate for the LP based Coefficients

Method     Euclidean Distance (E.D.)    City-block Distance (C.D.)
LPC        84.173                       84.217
PARCOR     95.173                       95.260
LAR        94.000                       94.652
ASRC       94.826                       95.608
LPCC       97.087                       97.521
LSF        95.695                       95.782

Figure (4): Identification Rate for the LP based Coefficients
It is clear from Table-4 and its corresponding chart, Fig. (4), that all tested LPC-derived features outperform the raw LPC coefficients, which give only about 84% identification rates.
4.2. Preemphasis of Speech Signals
There is a need to see the effect of preemphasis on the digital speech signals before any further preprocessing steps. This is demonstrated in the classification results of Fig. (5), obtained under the following conditions: preemphasis of speech signals, P = 15, DC offset removal, no overlap between successive frames, normalization, rectangular window, and the City-block distance measure.

Figure (5): Effect of Preemphasis of Speech Samples on Identification Rates
Figure (5) clearly indicates substantial improvements in identification rates, to the range of 93% to 98%, over all LPC-based systems after applying the preemphasis step to the speech signal.

4.3. LPC Predictor Order (P)
The order of the linear prediction analysis (P) is a compromise between spectral accuracy and computational complexity (time/memory). Building on the previous tests, further improvements in identification rates, shown in Table-5 and Fig. (6), can be achieved when the LPC predictor order is varied (P = 15, 30, 45) with successive frames overlapped by 50% of the frame size.
Table-5: Identification Rates for different LPC Predictor Order (P = 15, 30, 45)

Method     P = 15    P = 30    P = 45
LPC        92.695    96.347    97.565
PARCOR     97.652    99.347    99.869
LAR        95.478    98.782    99.347
ASRC       97.434    99.173    99.695
LPCC       99.130    99.956    99.956
LSF        98.130    99.695    99.826

Figure (6): Effect of LPC Predictor Order (P =15, 30, 45) on Identification Rates
It is clearly seen from the results of Table-5 and Fig. (6) that increasing the predictor order P, together with the overlap between successive frames, has a positive influence on most identification rates. Therefore, the predictor order P is taken to be 45 in the subsequent experiments.
4.4. Windowing Function
After determining the appropriate LPC predictor order, the system behavior for different window types must be studied. Table-6 serves this purpose, under the following conditions: Rectangular, Hamming and Kaiser window types, successive frames overlapped by 50% of the frame size, and LPCC cepstral weighting.
From this experiment, it is clear that the speaker identification rates are further improved by adopting other window types such as the Kaiser window. The latter gives the best accuracy when compared to the other two window types used (rectangular and Hamming).


Table-6: Identification Rates for LP based Coefficients with different Window Types

Method     Rectangular    Hamming    Kaiser
LPC        97.5652        94.9130    97.6087
PARCOR     99.8696        99.4783    99.7391
LAR        99.3478        99.4783    99.6087
ASRC       99.6957        99.4783    99.6522
LPCC       99.9565        99.9565    100
LSF        99.8261        99.9130    99.9130
4.5. Text-Prompt Speaker Verification
A further test is needed to verify a speaker's identity from a random text prompt generated by the system. This test relies on the best results obtained from the previous experiments, which clearly showed that LPCC outperforms the other LP based coefficients. Therefore, it is selected as the feature extraction method for the speaker verification mode. The advantage of text-prompting is that a possible intruder cannot know beforehand what the phrase will be, because the prompted text is changed on each trial. Furthermore, our system takes an additional precaution on the recording time by forcing the user to utter the pass phrase within a short interval (up to 15 seconds), which makes it more difficult for an intruder to use a device or software that synthesizes the user's voice. It is worth noting that the system automatically splits the sentence back into its constituent samples; a pattern matching process is then performed only against the corresponding sample features in the database (with regard to the system security threshold).
A total of 213 speaker trials (8 samples each) from 23 persons (authorized and unauthorized) are considered for the verification test, in order to obtain the optimum threshold at the Crossover Error Rate (CER). The CER is defined as the point where the False Rejection Rate (FRR) and False Acceptance Rate (FAR) curves meet in verifying a user's identity. Different threshold values were considered in the verification test, as shown in Table-7.
Table-7: Text-Prompt Speaker Verification Rates for different Thresholds using City-block distance

Threshold (θ)    Successful Decision    FAR       FRR
15.05            65.2582                0.0000    34.7418
15.40            71.8310                0.0000    28.1690
15.75            83.5681                0.0000    16.4319
16.10            92.9578                0.0000    7.0422
16.45            97.1831                0.0000    2.8169
16.80            99.5305                0.0000    0.4695
17.15            99.5305                0.0000    0.4695
17.50            97.1831                2.8169    0.0000
17.85            96.2442                3.7558    0.0000
18.20            95.3052                4.6948    0.0000
18.55            95.3052                4.6948    0.0000
The successful decision rate in Table-7 corresponds to the rate of accepting registered persons and rejecting non-registered ones over all trials. The variation of FAR and FRR with

different threshold values is also shown in Fig. (7). The CER occurs at a threshold of approximately 17.15 (the most suitable security threshold), giving a 99.53% successful decision rate.
Figure (7): FAR and FRR Performance Curve for different threshold levels using cityblock distance
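The FAR/FRR sweep behind Table-7 and Fig. (7) can be reproduced from per-trial match distances. The following Python sketch is illustrative only; the arrays of genuine and impostor distances are hypothetical stand-ins for the 213 trials:

```python
import numpy as np

def far_frr(genuine_dist, impostor_dist, theta):
    # FAR: percentage of impostor trials wrongly accepted (distance <= theta)
    # FRR: percentage of genuine trials wrongly rejected (distance > theta)
    genuine_dist = np.asarray(genuine_dist, dtype=float)
    impostor_dist = np.asarray(impostor_dist, dtype=float)
    far = 100.0 * np.mean(impostor_dist <= theta)
    frr = 100.0 * np.mean(genuine_dist > theta)
    return far, frr

def crossover_threshold(genuine_dist, impostor_dist, thetas):
    # The CER point is where the FAR and FRR curves meet; among the
    # candidate thresholds, pick the one minimising |FAR - FRR|.
    diffs = [abs(f - r) for f, r in
             (far_frr(genuine_dist, impostor_dist, t) for t in thetas)]
    return thetas[int(np.argmin(diffs))]
```

The successful-decision rate at a given threshold is then 100 − FAR − FRR, matching the column layout of Table-7.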
V. CONCLUSION
A speaker recognition system for 6304 speech samples is presented that relies on LPC-derived features, and acceptable results have been obtained.
In the closed-set speaker identification mode, all tested LPC-derived features are found to
outperform the raw LPC coefficients, with identification rates of 84% to 97%. An
improvement in identification rates with LPC-based systems is obtained in the range of 97% to
99% by applying the preprocessing steps (preemphasis, remove DC offset, frame blocking,
overlap successive frames to 50% of frame size, normalization and windowing) to the speech
signal and increasing the predictor order (P). According to the speaker identification tests
performed, one can deduce that LPCC gives the best results among the LPC-based coefficients
considered. The accuracy can be further improved by weighting the cepstral coefficients,
yielding identification rates close to 100%.
The open-set speaker verification mode is also presented for 213 trials (text-prompt
sentences randomly generated by the system) from 23 persons (1704 samples). The obtained
verification rates, greater than 99%, indicate that the proposed system model is quite
suitable.
VI. REFERENCES
[1] Mustafa D. Al-Hassani, “Identification Techniques using Speech Signals and
Fingerprints”, Ph.D. Thesis, Department of Computer Science, Al-Nahrain University,
Baghdad, Iraq, September 2006.
[2] Tiwalade O. Majekodunmi, Francis E. Idachaba, “A Review of the Fingerprint, Speaker
Recognition, Face Recognition and Iris Recognition Based Biometric Identification
Technologies”, Proceedings of the World Congress on Engineering Vol. II WCE,
London, U.K, 2011.
[3] M. Eriksson, “Biometrics Fingerprint based identity verification”, M. Sc. Thesis,
Department of Computer Science, UMEÅ University, August 2001.
[4] Yuan Yujin, Zhao Peihua, Zhou Qun, “Research of Speaker Recognition Based on
Combination of LPCC and MFCC”, Electronic Information Engineering, Training and
Experimental Center, Handan College, China, 2010.
[5] Anil K. Jain and Arun Ross, "Introduction to Biometrics”, Springer Science+Business
Media, LLC, USA, 2008.
[6] S. Gunnam, “Fingerprint Recognition and Analysis System”, A mini-thesis presented to
Dr. David P. Beach, Dept. of Electronics and Computer Technology, Indiana State
University, Terre Haute, in partial fulfillment of the requirements for ECT 680, April
2004.
[7] E. Hjelms, “Biometric Systems: A Face Recognition Approach”, Department of
Informatics, University of Oslo, Oslo, Norway, 2000.
[8] Valentin Andrei, Constantin Paleologu, Corneliu Burileanu, “Implementation of a
Real-Time Text-Dependent Speaker Identification System”, University “Politehnica” of
Bucharest, Romania, 2011.
[9] E. Karpov, “Real-Time Speaker Identification”, M. Sc. Thesis, Department of Computer
Science, University of Joensuu, Finland, January 2003.
[10] T. Kinnunen, “Spectral Features for Automatic Text-Independent Speaker Recognition”,
Ph. D. Thesis, Department of Computer Science, University of Joensuu, Finland,
December 2003.
[11] T. Chen, “The Past, Present, and Future of Speech Processing”, IEEE Signal Processing
Magazine, No.5, May 1998.
[12] Biswajit Kar, Sandeep Bhatia and P. K. Dutta, “Audio-Visual Biometric Based Speaker
Identification”, International Conference on Computational Intelligence and Multimedia
Applications, India, 2007.
[13] Antonio M. Peinado, Jos´e C. Segura, “Speech Recognition Over Digital Channels:
Robustness and Standards”, John Wiley & Sons Ltd, University of Granada, Spain,
2006.
[14] B. R. Wildermoth, “Text-Independent Speaker Recognition using Source Based
Features”, M. Sc. Thesis, Griffith University, Australia, January 2001.
[15] Ch. Srinivasa Kumar, P. Mallikarjuna Rao, “Design of an Automatic Speaker Recognition
System Using MFCC, Vector Quantization and LBG Algorithm”, International Journal on
Computer Science and Engineering (IJCSE), Vol. 3, No. 8, August 2011.
[16] Ciira wa Maina and John MacLaren Walsh, “Log Spectra Enhancement Using Speaker
Dependent Priors for Speaker Verification”, Drexel University, Department of Electrical
and Computer Engineering, Philadelphia, PA 19104, 2011.
[17] Ning WANG, P. C. CHING, and Tan LEE, “Robust Speaker Verification Using Phase
Information of Speech”, Department of Electronic Engineering, The Chinese University
of Hong Kong, 2010.
[18] Wai C. Chu, “Speech Coding Algorithms: Foundation and Evolution of Standardized
Coders”, John Wiley & Sons, Inc., California, USA, 2003.
[19] L. Rabiner, B.-H. Juang, “Fundamentals of Speech Recognition”, Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, 1993.
[20] Yasir. A.-M. Taleb, “Statistical and Wavelet Approaches for Speaker Identification”, M.
Sc. Thesis, Department of Computer Engineering, Al-Nahrain University, Iraq, June
2003.
[21] N. Batri, “Robust Spectral Parameter Coding in Speech Processing”, M. Sc. Thesis,
Department of Electrical Engineering, McGill University, Montreal, Canada, May 1998.
[22] J. P. Campbell, “Speaker Recognition: A Tutorial”, IEEE Proceedings, Vol. 85, No. 9,
1997.
[23] A. K. Khandani and F. Lahouti, “Intra-frame and Inter-frame Coding of Speech: LSF
Parameters Using a Trellis Structure”, Department of Electrical and Computer
Engineering, University of Waterloo, Ontario, Canada, June 2000.
[24] J. Rothweiler, “A Root Finding Algorithm for Line Spectral Frequencies”, Proceedings of
the IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP-99), March 15-19, U.S.A., 1999.
[25] S. E. Umbaugh, “Computer Vision and Image Processing”, Prentice-Hall, Inc., U.S.A.,
1998.
[26] R. C. Gonzalez, Richard E. Woods, “Digital Image Processing”, Second Edition,
Prentice-Hall Inc., New Jersey, U.S.A., 2002.
[27] Lokesh S. Khedekar and Dr. A. S. Alvi, “Advanced Smart Credential Cum Unique
Identification and Recognition System (ASCUIRS)”, International Journal of Computer
Engineering & Technology (IJCET), Volume 4, Issue 1, 2013, pp. 97-104, ISSN Print:
0976-6367, ISSN Online: 0976-6375.
[28] Pallavi P. Ingale and Dr. S.L. Nalbalwar, “Novel Approach to Text Independent Speaker
Identification”, International Journal of Electronics and Communication Engineering &
Technology (IJECET), Volume 3, Issue 2, 2012, pp. 87 - 93, ISSN Print: 0976- 6464,
ISSN Online: 0976 –6472.
[29] Vijay M.Mane, GauravV. Chalkikar and Milind E. Rane, “Multiscale Iris Recognition
System”, International Journal of Electronics and Communication Engineering &
Technology (IJECET), Volume 3, Issue 1, 2012, pp. 317 - 324, ISSN Print: 0976- 6464,
ISSN Online: 0976 –6472.
[30] Dr. Mustafa Dhiaa Al-Hassani, Dr. Abdulkareem A. Kadhim and Dr. Venus W. Samawi,
“Fingerprint Identification Technique Based on Wavelet-Bands Selection Features
(WBSF)”, International Journal of Computer Engineering & Technology (IJCET),
Volume 4, Issue 3, 2013, pp. 308 - 323, ISSN Print: 0976 – 6367, ISSN Online:
0976 – 6375.
[31] Viplav Gautam, Saurabh Sharma, Swapnil Gautam and Gaurav Sharma, “Identification
and Verification of Speaker using Mel Frequency Cepstral Coefficient”, International
Journal of Electronics and Communication Engineering & Technology (IJECET),
Volume 3, Issue 2, 2012, pp. 413 - 423, ISSN Print: 0976- 6464, ISSN Online: 0976 –
6472.
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 

Kürzlich hochgeladen (20)

2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 

50320130403005 2

International Journal of Information Technology & Management Information System (IJITMIS)
ISSN 0976 – 6405 (Print), ISSN 0976 – 6413 (Online)
Volume 4, Issue 3, September - December (2013), pp. 68-84
© IAEME: http://www.iaeme.com/IJITMIS.asp
Journal Impact Factor (2013): 5.2372 (Calculated by GISI), www.jifactor.com

DESIGN A TEXT-PROMPT SPEAKER RECOGNITION SYSTEM USING LPC-DERIVED FEATURES

Dr. Mustafa Dhiaa Al-Hassani (1), Dr. Abdulkareem A. Kadhim (2)
(1) Computer Science/Mustansiriyah University, Baghdad, Iraq
(2) College of Information Technology/Al-Nahrain University, Baghdad, Iraq

ABSTRACT

Humans are integrated closer to computers every day, and computers are taking over many services that used to be based on face-to-face contact between humans. This has prompted active development in the field of biometric systems, and biometric information is now widely used for both person identification and security applications. This paper is concerned with the use of speaker features for protection against unauthorized access. A speaker recognition system based on LPC-derived features is presented and evaluated on 6304 speech samples. A vocabulary of 46 speech samples is built for 10 speakers, where each authorized person is asked to utter every sample 10 times. Two different modes are considered in identifying individuals according to their speech samples. In the closed-set speaker identification mode, all tested LPC-derived features are found to outperform the raw LPC coefficients, and identification rates of 84% to 97% are achieved.
Applying preprocessing steps to the speech signals (preemphasis, DC-offset removal, frame blocking, overlapping, normalization and windowing) improves the representation of speech features, and up to 100% identification rate is obtained using weighted Linear Predictive Cepstral Coefficients (LPCC). In the open-set speaker verification mode of the proposed system model, the system randomly selects a pass phrase of 8 samples from its database for each trial in which a speaker is presented to the system. Up to 213 text-prompt trials from 23 different speakers (authorized and unauthorized) are recorded (i.e., 1704 samples) in order to study the system behavior and to generate the optimal threshold by which speakers are verified against the training references of authorized speakers constructed in the first mode; the best obtained speaker verification rate is greater than 99%.

Keywords: Biometric, LPC-derived features, LSF, Speaker Recognition, Speaker Identification, Speaker Verification, Text-prompt.
I. INTRODUCTION

As everyday life is getting more and more computerized, automated security systems are becoming more and more important. Today most personal banking tasks can be performed over the Internet, and soon they will also be performed on mobile devices such as cell phones and PDAs. The key task of an automated security system is to verify that users are in fact who they claim to be [1]. Since the level of security breaches and transaction fraud is increasing, the need for highly secure identification and personal verification technologies is becoming apparent. Biometric-based solutions are able to provide confidential financial transactions and personal data privacy [2]. The need for biometrics can be found in federal, state and local governments, in the military, and in commercial applications [1, 3].

A biometric system is essentially a pattern recognition system that establishes the authenticity of a specific physiological or behavioral characteristic possessed by a user. Such systems are typically based on a single biometric feature of humans, but several hybrid systems also exist [2, 4, 5, 1, 6]. The human voice can serve as a key for any secured object, and it is not easy to lose or forget it. This technique can be used to verify the identity claimed by people accessing systems; that is, it enables control of access to various services by voice [3, 7]. Speaker recognition has for many years received the attention of researchers working in the field of signal processing.
This technology has been developed to the point where it can be used in a number of applications, such as voice dialing, banking over a telephone network, person authentication, remote access to computers, command and control systems, network security and protection, entry and access control systems, data access/information retrieval, monitoring, etc. [8, 5, 9, 10, 11].

II. AIM OF THE WORK

This work aims to build a speaker recognition (identification/verification) system that automatically authenticates a speaker's identity by his/her voice, according to a random text prompt generated by the system, and then gives only authorized persons a privilege or an access right to the facility that needs to be protected from the intrusion of unauthorized persons.

III. THE PROPOSED SPEAKER RECOGNITION SYSTEM MODEL

In this section, several linear prediction based methods (LPC, PARCOR, LAR, ASRC, LPCC, and LSF) are tested for a text-dependent speaker recognition system in a closed-set mode. The open-set speaker verification mode is also investigated, which involves verifying a speaker according to a random text-prompt sentence generated by the system. The block diagram of the proposed speaker recognition system model, shown in Fig. (1), illustrates that the input speech is passed through six preprocessing operations (preemphasis, DC-offset removal, frame blocking, overlapping, normalization and windowing) prior to the feature extraction phase. If the match score is lower than a certain threshold, then the identity claim is verified ("Accepted"); otherwise, the speaker is "Rejected" [1].
Figure (1): Block Diagram of the proposed Speaker Recognition System Model

3.1. Speech Recording

Any speaker recognition system depends on recorded speech samples as input data. The speech signals used for training and testing are recorded in a quiet (but not soundproof) room via a high-quality built-in microphone and digitized by a sound card of type Crystal - Intel(r) integrated audio on a DELL Latitude C400 Notebook, with the following recording features: .wav file format, 11 kHz sampling rate, 2 bytes/sample and a single channel [1].

3.2. Database Construction

In this work, database samples were recorded in two modes of operation:
• Closed-set speaker identification mode
• Open-set speaker verification mode

In order to evaluate the identification/verification performance of the proposed system model, each speaker is asked to utter the vocabulary data sets, shown in Table-1, for a maximum of 10 utterances per sample. A number of repetitions R (1 ≤ R < 10) can be taken as the training set during an enrollment phase to train the speaker models of authorized persons, and the other (10 - R) repetitions are used for testing during a matching phase, where they are classified against the training references in the database. As a result, the database sizes for this mode are [1]:

Total DB Size = 10 × No. of Samples × No. of Speakers        (1)

No. of Training References = R × No. of Samples × No. of Speakers        (2)

No. of Test Samples = (10 - R) × No. of Samples × No. of Speakers        (3)
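As a quick check, Eqs. (1)-(3) can be evaluated in a short Python sketch (the function name and the choice R = 5 are illustrative, not part of the paper; with the paper's 46 samples and 10 speakers this reproduces the 4600-sample total reported below):

```python
# Database sizing for the closed-set mode, following Eqs. (1)-(3).
# n_samples = 46 and n_speakers = 10 match the paper's vocabulary and
# speaker count; r_train is a free choice with 1 <= r_train < 10.

def db_sizes(n_samples, n_speakers, repetitions_total=10, r_train=5):
    """Return (total, training, testing) sample counts."""
    total = repetitions_total * n_samples * n_speakers                # Eq. (1)
    training = r_train * n_samples * n_speakers                       # Eq. (2)
    testing = (repetitions_total - r_train) * n_samples * n_speakers  # Eq. (3)
    return total, training, testing

total, train, test = db_sizes(n_samples=46, n_speakers=10, r_train=5)
print(total, train, test)  # 4600 2300 2300
```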
Table-1: The recorded speech samples

Data Sets        Speech Samples
1) Digits        0 ... 9
2) Characters    'A' ... 'Z'
3) Words         Accept, Reject, Open, Close, Help, Computer, Yes, No, Copy, Paste

For practical purposes, these data sets are very interesting because the similarities between several samples (especially letters) lead to important problems in speech recognition. In the closed-set speaker identification mode, up to 4600 samples were collected from different persons, whereas 1704 samples were recorded in the open-set speaker verification mode.

In the open-set speaker verification mode of the proposed system model, the system randomly selects a pass phrase of 8 samples from its database for each trial in which a speaker is presented to the system. Up to 213 text-prompt trials from different speakers (i.e., authorized and unauthorized) are recorded (i.e., 1704 samples) in order to study the system behavior. In fact, the generated text-prompt sentence, shown in Table-2, is a sequence of random numbers between 1 and 46 which correspond to the samples in the vocabulary of Table-1. This is performed in order to study the system behavior and to generate the optimal threshold by which speakers are verified (accepted or not) when compared to the training references of authorized speakers constructed in the first mode [1].

Table-2: Examples of Random Text-Prompt Sentences generated by the System

Table-2 illustrates five examples of text-prompt sentences generated by the system, where column Si (i = 1, 2, ..., 8) stands for sample number i; the samples in each row compose one sentence [1].
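The text-prompt generation described above can be sketched as follows (a minimal illustration; whether draws may repeat within one sentence is our assumption, and the variable and function names are ours):

```python
import random

# Sketch of the Section 3.2 text-prompt generator: each trial is a pass
# phrase of 8 sample indices drawn from the 46-entry vocabulary of Table-1.

VOCABULARY = ([str(d) for d in range(10)]                      # digits 0..9
              + [chr(c) for c in range(ord('A'), ord('Z') + 1)]  # 'A'..'Z'
              + ["Accept", "Reject", "Open", "Close", "Help",
                 "Computer", "Yes", "No", "Copy", "Paste"])    # 46 entries

def text_prompt(length=8, rng=random):
    """Return `length` sample indices in 1..46 and the prompted words."""
    indices = [rng.randint(1, len(VOCABULARY)) for _ in range(length)]
    return indices, [VOCABULARY[i - 1] for i in indices]

indices, words = text_prompt()
print(indices, words)
```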
3.3. Preprocessing

The basic idea behind speech preprocessing is to generate a signal with a fine structure as close as possible to that of the original speech signal. This provides data reduction and eases the analysis task [11]. A number of processing techniques adopted in this system model are applied in the following sequence:

• Preemphasis

Usually the digital speech signal, s[n], is preemphasized first. This is achieved by passing the signal through a high-pass filter. The process emphasizes the high frequencies relative to the low frequencies, hence compensating for the effect of band-limiting the input signal with a low-pass filter in the recording process. The most commonly used preemphasis filter is given by the following transfer function [12, 13, 10, 14]:

$H(z) = 1 - \alpha z^{-1}$        (4)

where $\alpha$ typically lies in the range $0.9 \le \alpha < 1.0$ and controls the slope of the filter, which is simply implemented as a first-order differencer:

$\tilde{s}[n] = s[n] - \alpha\, s[n-1]$        (5)

For the proposed system model $\alpha$ is set to 0.95 [1].

• The Removal of DC Offset

DC offset occurs when hardware, such as a sound card, adds DC current to a recorded audio signal. This current produces a recorded waveform that is not centered on the baseline. Removing the DC offset is therefore the process of forcing the input signal mean to the baseline, by adding a constant value to the samples in the sound file. An illustrative example of removing the DC offset from a waveform file is shown in Fig. (2) [1].

Figure (2): Removal of DC offset from a Waveform file. (a) Exhibits DC offset, (b) After the removal of DC offset

• Frame Blocking

Frame blocking is the process of splitting the input speech samples into equal-duration blocks of N samples to carry out frame-wise analysis.
The selection of the frame length is a crucial parameter for successful spectral analysis, due to the trade-off between time and frequency resolution. The window should be long enough for adequate frequency resolution, but on the other hand it should be short enough to capture the local spectral properties. Typically a frame length of 10 - 30 milliseconds is used. The signal for the i-th frame is given by [15, 14, 10, 12]:
$s_i[n] = s[M \cdot i + n], \quad n = 0, 1, \dots, N-1$        (6)

where M is the frame shift in samples. In this work, a frame length N = 256 samples with a duration of 23.2 milliseconds is used [1].

• Overlapping

Usually adjacent frames are overlapped. The frame is shifted forward along the signal by a fixed amount, typically 30 - 50% of the frame length. The purpose of the overlapping is to avoid losing information, since each speech sound of the input sequence is then approximately centered at some frame [1, 15, 13, 16].

• Normalization

The frames of speech are normalized to make their power equal to unity. This step is very important since the extracted frames have different intensities due to the speaker's loudness, the speaker's distance from the microphone and the recording level. The normalization is done by dividing each sample by the square root of the sum of squares of all the samples in the segment:

$s_{norm}[n] = \dfrac{s[n]}{\sqrt{\sum_{k=0}^{N-1} s^2[k]}}$        (7)

where s[n] is the speech sample, N is the number of samples in the segment (256 here), and the subscript norm refers to normalization [1].

• Windowing

The purpose of windowing is to reduce the effect of spectral leakage (a type of distortion in spectral analysis) that results from the framing process. Windowing involves multiplying a speech signal x(n) by a finite-duration window w(n), which yields a set of speech samples weighted by the shape of the window, as stated by the following equation [1, 15, 13, 17, 12]:

$x_w(n) = x(n)\, w(n), \quad 0 \le n \le N-1$        (8)

where N is the size of the window or frame. Many different windowing functions exist; Table-3 lists the window functions used in our experiments, and their shapes are illustrated in Fig. (3) [1].

Table-3: Rectangular, Hamming and Kaiser Window Functions
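The six-step preprocessing chain of Section 3.3 can be sketched as follows (a minimal illustration, not the paper's implementation; alpha = 0.95 and a 50% frame shift follow the paper, while the demo signal and the small frame size in the last lines are ours, chosen for brevity):

```python
import math

# Sketch of the preprocessing chain: DC-offset removal, preemphasis,
# frame blocking with overlap, per-frame normalization and windowing,
# using plain Python lists in place of the sampled waveform.

def remove_dc(s):
    m = sum(s) / len(s)                       # force the mean to the baseline
    return [x - m for x in s]

def preemphasize(s, alpha=0.95):
    # Eq. (5): s~[n] = s[n] - alpha * s[n-1]
    return [s[0]] + [s[n] - alpha * s[n - 1] for n in range(1, len(s))]

def frames(s, n=256, shift=128):
    # Eq. (6): overlapping frames of length n, shifted by M = shift samples
    return [s[i:i + n] for i in range(0, len(s) - n + 1, shift)]

def normalize(frame):
    # Eq. (7): divide by the root of the sum of squares -> unit power
    root = math.sqrt(sum(x * x for x in frame))
    return [x / root for x in frame]

def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def window(frame):
    # Eq. (8): pointwise product of the frame with the window shape
    return [x * w for x, w in zip(frame, hamming(len(frame)))]

signal = preemphasize(remove_dc([0.5 * math.sin(0.3 * t) + 0.2
                                 for t in range(64)]))
processed = [window(normalize(f)) for f in frames(signal, n=16, shift=8)]
```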
Figure (3): Rectangular, Hamming and Kaiser Window Functions of 256 Samples Length [1]

3.4. Feature Extraction

Having acquired the testing or training utterances, it is now the role of the feature extractor to extract features from the speech samples. Feature extraction refers to the process of reducing dimensionality by forming a new, smaller set of features from the original feature set of the patterns. This can be done by extracting numerical measurements from the raw input patterns [8, 1, 15]. Several linear prediction based features are tested, namely LPC, PARCOR, LAR, ASRC, LPCC, and LSF.

• Linear Predictive Coding (LPC)

Linear prediction (LP) forms an integral part of almost all modern speech coding algorithms. The fundamental idea is that a speech sample can be approximated as a linear combination of past samples. Within a signal frame, the weights used to compute the linear combination are found by minimizing the mean-squared prediction error; the resultant weights, or linear prediction coefficients (LPCs), are used to represent the particular frame [18]. The importance of this method lies in its ability to provide extremely accurate estimates of the speech parameters, and in its relative speed of computation [20, 19]. The LPC model assumes that each sample s[n] at time n can be approximated by a linear sum of the p previous samples:

$s[n] \approx \sum_{k=1}^{p} a[k]\, s[n-k]$        (9)

where s[n-k] are past outputs, p is the prediction order, and {a[k]}, k = 1 ... p, are the model parameters, called the predictor coefficients, which need to be determined so that the average prediction error (or residual) is as small as possible [10, 19].
The prediction error for the n-th sample is given by the difference between the actual sample and its predicted value [1, 13, 20, 10]:

$e[n] = s[n] - \sum_{k=1}^{p} a[k]\, s[n-k]$        (10)
Equivalently,

$s[n] = \sum_{k=1}^{p} a[k]\, s[n-k] + e[n]$        (11)

When the prediction residual e[n] is small, the predictor of Eq. (9) approximates s[n] well. The total squared prediction error is given by

$E = \sum_n e^2[n] = \sum_n \left( s[n] - \sum_{k=1}^{p} a[k]\, s[n-k] \right)^2$        (12)

Minimization of the error is achieved by setting the partial derivatives of E with respect to the model parameters {a[k]} to zero:

$\dfrac{\partial E}{\partial a[k]} = 0, \quad k = 1, \dots, p$        (13)

By writing out Eq. (13) for k = 1 ... p, the problem of finding the optimal predictor coefficients reduces to solving the so-called Yule-Walker (AR) equations. Depending on the choice of the error minimization interval in Eq. (12), there are two methods for solving the AR equations: the covariance method and the autocorrelation method [13, 10, 19]. The two methods do not differ greatly, but the autocorrelation method is preferred since it is computationally more efficient and always guarantees a stable filter. In matrix form, the set of linear equations is represented by Ra = v, which can be rewritten as [13, 1, 21]:

$\begin{bmatrix} R(0) & R(1) & \cdots & R(p-1) \\ R(1) & R(0) & \cdots & R(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ R(p-1) & R(p-2) & \cdots & R(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R(1) \\ R(2) \\ \vdots \\ R(p) \end{bmatrix}$        (14)

where R is a special type of matrix called a Toeplitz matrix (symmetric, with all diagonal elements equal; this facilitates the solution of the Yule-Walker equations for the LP coefficients {a_k} through computationally fast algorithms such as the Levinson-Durbin algorithm), a is the vector of LPC coefficients and v is the autocorrelation vector. Both the matrix R and the vector v are completely defined by the p + 1 autocorrelation samples R(0), ..., R(p).
The autocorrelation sequence of s[n] is defined as [1, 21, 10, 19, 13]:

$R[k] = \dfrac{1}{N} \sum_{n=0}^{N-1-k} s[n]\, s[n+k]$        (15)

where N is the number of data points in the segment. Due to the redundancy in the Yule-Walker (AR) equations, there exists an efficient algorithm for finding the solution, known as the Levinson-Durbin recursion [1, 10, 19, 20, 13].
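Eq. (15) can be sketched directly (the function name is ours; max_lag corresponds to the LP order p, which is 15 in the paper's experiments):

```python
# Sketch of the autocorrelation computation of Eq. (15) for one frame.

def autocorr(frame, max_lag):
    n_pts = len(frame)
    # R[k] = (1/N) * sum_{n=0}^{N-1-k} s[n] * s[n+k]
    return [sum(frame[n] * frame[n + k] for n in range(n_pts - k)) / n_pts
            for k in range(max_lag + 1)]

r = autocorr([1.0, 2.0, 3.0], max_lag=2)   # [14/3, 8/3, 1.0]
```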
$E^{(0)} = R(0)$        (16)

$k_i = \dfrac{R(i) - \sum_{j=1}^{i-1} a_j^{(i-1)} R(i-j)}{E^{(i-1)}}, \quad 1 \le i \le p$        (17)

$a_i^{(i)} = k_i, \qquad a_j^{(i)} = a_j^{(i-1)} - k_i\, a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1$        (18)

$E^{(i)} = (1 - k_i^2)\, E^{(i-1)}$        (19)

where k_i are the partial correlation (PARCOR) coefficients, a_j^(i) is the j-th predictor (LPC) coefficient after i iterations, and E^(i) is the prediction error after i iterations.

The Levinson-Durbin procedure takes the autocorrelation sequence as its input and produces the coefficients a[k], k = 1 ... p. The time complexity of the procedure is O(p^2), as opposed to the standard Gaussian elimination method whose complexity is O(p^3). Equations (16)-(19) are solved recursively for i = 1, 2, ..., p, where p is the order of the LPC analysis, and the final solution is given as [13, 1, 10, 20, 19]:

$a_j = a_j^{(p)}, \quad 1 \le j \le p$        (20)

• Partial Correlation Coefficients (PARCOR)

Several alternative representations can be derived from the LPC coefficients when the autocorrelation method is used. The Levinson-Durbin algorithm produces the quantities {k_i}, i = 1, 2, ..., p (which lie in the range -1 ≤ k_i ≤ 1), known as the reflection or PARCOR coefficients [13, 1].

• Log Area Ratio (LAR)

A further parameter set can be derived from the PARCOR coefficients by taking the logarithm of the area ratio, yielding the log area ratios (LARs) {g_i} defined as [19, 20, 22, 10, 1, 13]:

$g_i = \log\!\left( \dfrac{1 - k_i}{1 + k_i} \right), \quad 1 \le i \le p$        (21)

• Arcsine Reflection Coefficients (ASRC)

An alternative to the log area ratios are the arcsine reflection coefficients, computed simply by taking the inverse sine of the reflection coefficients [10, 1, 13]:

$\arcsin_i = \sin^{-1}(k_i), \quad 1 \le i \le p$        (22)
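A minimal Python sketch of the recursion of Eqs. (16)-(20), together with the LAR (Eq. 21) and arcsine (Eq. 22) parameters derived from the PARCOR coefficients (the function names are ours, not the paper's):

```python
import math

# Levinson-Durbin recursion: solve the Yule-Walker equations of Eq. (14)
# for order p, given the autocorrelation sequence r[0..p].

def levinson_durbin(r, p):
    """Return (LPC coefficients a_1..a_p, PARCOR coefficients k_1..k_p)."""
    a = [0.0] * (p + 1)          # a[j] holds the j-th coefficient; a[0] unused
    parcor = []
    err = r[0]                   # E(0) = R(0), Eq. (16)
    for i in range(1, p + 1):
        # Eq. (17): k_i = (R(i) - sum_j a_j R(i-j)) / E(i-1)
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        parcor.append(k)
        a_new = a[:]
        a_new[i] = k             # Eq. (18), first part
        for j in range(1, i):    # Eq. (18), update of earlier coefficients
            a_new[j] = a[j] - k * a[i - j]
        a = a_new
        err *= (1.0 - k * k)     # Eq. (19)
    return a[1:], parcor         # Eq. (20): final coefficients

def lar(parcor):
    """Log area ratios, Eq. (21)."""
    return [math.log((1 - k) / (1 + k)) for k in parcor]

def asrc(parcor):
    """Arcsine reflection coefficients, Eq. (22)."""
    return [math.asin(k) for k in parcor]
```

For an AR(1)-like autocorrelation such as r = [1.0, 0.5, 0.25], the order-2 solution collapses to a single nonzero coefficient, which is a convenient sanity check.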
• Linear Predictive Cepstral Coefficients (LPCC)

An important fact is that the cepstrum can also be derived directly from the LPC parameter set. The relationship between the cepstral coefficients c_n and the prediction coefficients a_k is given by the following recursion [1, 9, 13]:

$c_1 = a_1, \qquad c_n = \sum_{k=1}^{n-1} \left(1 - \dfrac{k}{n}\right) a_k\, c_{n-k} + a_n, \quad 1 < n \le p$        (23)

where p is the prediction order. It is usually said that the cepstrum derived in such a way represents a "smoothed" version of the spectrum. As in LPC analysis, increasing the number of coefficients yields more detail [10, 4]. Because the low-order cepstral coefficients are sensitive to the overall spectral slope and the high-order cepstral coefficients are sensitive to noise (and other forms of noise-like variability), it has become a standard technique to weight the cepstral coefficients by a tapered window so as to minimize these sensitivities and improve the performance of these coefficients [19, 14, 13, 1]. To achieve robustness for large values of n, a more general weighting is considered:

$\hat{c}(n) = c(n) \times w(n), \quad 1 \le n \le p$        (24)

where

$w(n) = 1 + \dfrac{p}{2} \sin\!\left( \dfrac{\pi n}{p} \right), \quad 1 \le n \le p$        (25)

This weighting function truncates the computation and deemphasizes c_n around n = 1 and around n = p [19].

• Line Spectral Frequencies (LSFs)

Another representation of the LP parameters of the all-pole spectrum is the set of line spectral frequencies (LSFs), or line spectrum pairs (LSPs) [23, 21]. Proposed for the compression of speech and other audio signals, this is the most widely used representation of LPC parameters for quantization and coding, but LSFs have also been applied with good results to speaker recognition [23, 24, 10, 1].
LSFs are the roots of the following polynomials:

$P(z) = B(z) + z^{-(p+1)} B(z^{-1})$        (26)

$Q(z) = B(z) - z^{-(p+1)} B(z^{-1})$        (27)

where B(z) = 1/H(z) = 1 - A(z) is the inverse LPC filter. The roots of P(z) and Q(z) are interleaved and occur in complex-conjugate pairs, so that only p/2 roots are retained for each of P(z) and Q(z) (p roots in total). The root magnitudes are known to be unity, and therefore only their angles (frequencies) are needed. Each root of B(z) corresponds to one root in each of P(z) and Q(z). Therefore, if the frequencies of such a pair of roots are close, the original root in B(z) likely represents a formant; otherwise, it represents a wide-bandwidth feature of the spectrum. These correspondences provide an intuitive interpretation of the LSP coefficients [13].
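Both the cepstral recursion with its weighting (Eqs. 23-25) and the LSF extraction via the roots of P(z) and Q(z) (Eqs. 26-27) can be sketched as follows (illustrative code, not the paper's implementation; NumPy is assumed for the polynomial root-finding, and the function names are ours):

```python
import math
import numpy as np

def lpcc(a):
    """Cepstral coefficients c_1..c_p from LPC coefficients a_1..a_p (Eq. 23)."""
    p = len(a)
    c = []
    for n in range(1, p + 1):
        acc = a[n - 1]                       # a_n term
        for k in range(1, n):                # sum_{k=1}^{n-1} (1 - k/n) a_k c_{n-k}
            acc += (1.0 - k / n) * a[k - 1] * c[n - k - 1]
        c.append(acc)
    return c

def weighted_lpcc(c):
    """Apply the tapered cepstral window of Eqs. (24)-(25)."""
    p = len(c)
    return [cn * (1.0 + (p / 2.0) * math.sin(math.pi * n / p))
            for n, cn in enumerate(c, start=1)]

def lsf(a):
    """Line spectral frequencies (radians) from LPC coefficients (Eqs. 26-27)."""
    b = np.concatenate(([1.0], -np.asarray(a, dtype=float)))   # B(z) = 1 - A(z)
    b_pad = np.concatenate((b, [0.0]))                         # degree p + 1
    p_poly = b_pad + b_pad[::-1]        # P(z) = B(z) + z^-(p+1) B(z^-1)
    q_poly = b_pad - b_pad[::-1]        # Q(z) = B(z) - z^-(p+1) B(z^-1)
    roots = np.concatenate((np.roots(p_poly), np.roots(q_poly)))
    # keep one angle per conjugate pair, dropping the trivial roots at 0 and pi
    return sorted(float(w) for w in np.angle(roots)
                  if 1e-6 < w < math.pi - 1e-6)
```

For the degenerate case a = [0, 0] (B(z) = 1), the LSFs reduce to the roots of z^3 = -1 and z^3 = 1 inside the upper unit semicircle, i.e. pi/3 and 2*pi/3, which makes a convenient check of the interleaving property.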
3.5. Pattern Matching

The resulting test template, which is an N-dimensional feature vector, is compared against the stored reference templates to find the closest match. The task is to determine which predefined class (or classes) the unknown pattern matches. For the speaker identification task, the unknown speaker is compared to all references in the database. This comparison can be done through the Euclidean (E.D.) or city-block (C.D.) distance measures [1, 25, 26], as shown below:

$E.D. = \sqrt{\sum_{i=1}^{N} (a_i - b_i)^2}$        (28)

$C.D. = \sum_{i=1}^{N} |a_i - b_i|$        (29)

where A and B are two vectors, such that A = [a_1 a_2 ... a_N] and B = [b_1 b_2 ... b_N].

3.6. Decision Rule

The decision rule selects the pattern that best matches the unknown one. The primary methods for the discrimination process either measure the difference between the two feature vectors or measure their similarity. In our approach, a minimum-distance classifier, which measures the difference between the two patterns, is used for speaker recognition. This classifier assigns the unknown pattern to the nearest predefined pattern: the bigger the distance between the two vectors, the greater the difference. For verification, on the other hand, the identity of the unknown speaker is accepted by considering the best-matched reference in the database only when its distance is lower than a certain threshold [1, 25, 26].

IV. EXPERIMENTAL RESULTS

Many experiments and test conditions were carried out to measure the performance of the proposed system with respect to different criteria: preemphasis, frame overlapping, LPC order, window type, cepstral weighting and text-prompt speaker verification.
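The two distance measures of Eqs. (28)-(29) and the minimum-distance decision rule can be sketched as follows (illustrative; the speaker labels, reference vectors and threshold below are ours, not data from the paper):

```python
import math

# Sketch of Euclidean and city-block distances plus the nearest-template
# decision rule used for identification, and a thresholded check for
# verification.

def euclidean(a, b):
    # Eq. (28)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    # Eq. (29)
    return sum(abs(x - y) for x, y in zip(a, b))

def identify(test_vec, references, dist=city_block):
    """Closed-set identification: label of the nearest stored template."""
    return min(references, key=lambda label: dist(test_vec, references[label]))

def verify(test_vec, reference, threshold, dist=city_block):
    """Open-set verification: accept only if the distance is under threshold."""
    return dist(test_vec, reference) < threshold

refs = {"speaker-A": [0.1, 0.9], "speaker-B": [0.8, 0.2]}
print(identify([0.7, 0.3], refs))  # speaker-B
```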
The identification rate is defined as the ratio of correctly identified speakers to the total number of test samples, which corresponds to a nearest-neighbor decision rule:

Identification Rate = (No. of Correctly Identified Speakers / Total No. of Samples Tested) × 100%    (30)

4.1. Identification Rate for LP based Coefficients
A more appropriate comparison can be made if all the LP based coefficient methods (LPC, PARCOR, LAR, ASRC, LPCC and LSF) are measured under identical conditions (LP order P = 15, DC offset removed, no overlap between successive frames, normalization, rectangular window). The classification results are shown in Table-4 and its equivalent chart, Fig. (4).
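Eq. (30) amounts to a simple accuracy count over the nearest-neighbor decisions; a one-function sketch:

```python
def identification_rate(predicted, actual):
    """Eq. (30): percentage of test samples whose identified speaker
    matches the true speaker (nearest-neighbor decisions)."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```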
Table-4: Identification Rate for the LP based Coefficients

             Euclidean Distance (E.D.)    City-block Distance (C.D.)
    LPC      84.173                       84.217
    PARCOR   95.173                       95.260
    LAR      94.000                       94.652
    ASRC     94.826                       95.608
    LPCC     97.087                       97.521
    LSF      95.695                       95.782

Figure (4): Identification Rate for the LP based Coefficients

It is clear from Table-4 and its corresponding chart, Fig. (4), that all tested LPC-derived features outperform the raw LPC coefficients, which give about 84% identification rates.

4.2. Preemphasis of Speech Signals
There is a need to see the effect of preemphasis on the digital speech signals before any further preprocessing steps. This is demonstrated in the classification results of Fig. (5) under the following conditions: preemphasis of speech signals, P = 15, DC offset removed, no overlap between successive frames, normalization, rectangular window, and the city-block distance measure.

Figure (5): Effect of Preemphasis of Speech Samples on Identification Rates

Figure (5) clearly indicates the improvement in identification rates over all LPC-based systems, to the range of 93% to 98%, after applying the preemphasis step to the speech signal.
4.3. LPC Predictor Order (P)
The order of the linear prediction analysis (P) is a compromise between spectral accuracy and computational complexity (time/memory). Building on the previous tests, further improvements in identification rates, shown in Table-5 and Fig. (6), can be achieved when the LPC predictor order is studied for different values (P = 15, 30, 45) with successive frames overlapped by 50% of the frame size.

Table-5: Identification Rates for different LPC Predictor Orders (P = 15, 30, 45)

             P = 15     P = 30     P = 45
    LPC      92.695     96.347     97.565
    PARCOR   97.652     99.347     99.869
    LAR      95.478     98.782     99.347
    ASRC     97.434     99.173     99.695
    LPCC     99.130     99.956     99.956
    LSF      98.130     99.695     99.826

Figure (6): Effect of LPC Predictor Order (P = 15, 30, 45) on Identification Rates

It is clearly seen from the results of Table-5 and Fig. (6) that increasing the predictor order P, combined with the overlap between successive frames, has a positive influence on most identification rates. Therefore, the predictor order P is taken to be 45 for the subsequent experimental tests.

4.4. Windowing Function
After determining the appropriate LPC predictor order, the system behavior for different window types must be studied. Table-6 is considered for this purpose under the following conditions: rectangular, Hamming and Kaiser window types; successive frames overlapped by 50% of the frame size; LPCC cepstral weighting. This experiment clearly indicates that the speaker identification rates are improved further by adopting a new window type such as the Kaiser window. The latter gives the best accuracy when compared to the other two window types used (rectangular and Hamming).
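The three window types compared here can be applied to a frame as sketched below. The Kaiser shape parameter `beta` is a hypothetical choice for illustration; the paper does not report the value it used:

```python
import numpy as np

def window_frame(frame, kind="kaiser", beta=8.0):
    """Apply an analysis window to one speech frame before LPC analysis.
    `beta` controls the Kaiser window's side-lobe/main-lobe trade-off
    (an assumed value here, not taken from the paper)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    if kind == "rectangular":
        w = np.ones(n)                   # no tapering
    elif kind == "hamming":
        w = np.hamming(n)                # fixed raised-cosine taper
    elif kind == "kaiser":
        w = np.kaiser(n, beta)           # adjustable taper via beta
    else:
        raise ValueError(f"unknown window type: {kind}")
    return frame * w
```

The rectangular window leaves the frame untouched, while Hamming and Kaiser taper the frame edges, reducing the spectral leakage that degrades the LPC spectral estimate.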
Table-6: Identification Rates for LP based Coefficients with different Window Types

             Rectangular    Hamming     Kaiser
    LPC      97.5652        94.9130     97.6087
    PARCOR   99.8696        99.4783     99.7391
    LAR      99.3478        99.4783     99.6087
    ASRC     99.6957        99.4783     99.6522
    LPCC     99.9565        99.9565     100
    LSF      99.8261        99.9130     99.9130

4.5. Text-Prompt Speaker Verification
Another test is needed for the sake of verifying a speaker's identity from a random text prompt generated by the system. Based on the best results obtained from the previous experiments, LPCC clearly exhibits superior results compared to the other LP based coefficients; therefore, it is selected as the feature extraction method for the speaker verification mode. The advantage of text-prompting is that a possible intruder cannot know beforehand what the phrase will be, because the prompt text changes on each trial. Furthermore, our system takes an additional precaution on recording time by forcing the user to utter the pass phrase within a short interval (up to 15 seconds), which makes it more difficult for an intruder to use a device or software that synthesizes the user's voice. It is worth noting that the system automatically splits the sentence back into its constituent samples; a pattern matching process is then performed only against the corresponding sample features in the database (with regard to the system security threshold). A total of 213 speaker trials (each 8 samples long) from 23 persons (authorized and unauthorized) are considered in the verification test to obtain the optimum threshold at the Crossover Error Rate (CER). This is defined as the point where the False Rejection Rate (FRR) and the False Acceptance Rate (FAR) curves meet in verifying the user's identity.
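The FAR/FRR trade-off and the CER threshold search can be sketched as below. This is an illustrative implementation over matching-distance scores, not the authors' code; function names and the grid of candidate thresholds are assumptions:

```python
import numpy as np

def far_frr(genuine_dists, impostor_dists, threshold):
    """A claim is accepted when its matching distance falls below the
    threshold. FAR = % of impostor trials wrongly accepted;
    FRR = % of genuine trials wrongly rejected."""
    far = 100.0 * np.mean(np.asarray(impostor_dists) < threshold)
    frr = 100.0 * np.mean(np.asarray(genuine_dists) >= threshold)
    return far, frr

def crossover_threshold(genuine_dists, impostor_dists, thresholds):
    """Sweep candidate thresholds and return the one where FAR and FRR
    are closest, i.e. the Crossover Error Rate operating point."""
    best_t, best_gap = None, np.inf
    for t in thresholds:
        far, frr = far_frr(genuine_dists, impostor_dists, t)
        if abs(far - frr) < best_gap:
            best_t, best_gap = t, abs(far - frr)
    return best_t
```

Raising the threshold drives FRR toward zero while FAR grows, and vice versa; the crossover point balances the two error types, which is how the security threshold in the next table is selected.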
Different threshold values were considered in the verification test, as shown in Table-7.

Table-7: Text-Prompt Speaker Verification Rates for different Thresholds using City-block distance

    Threshold (θ)   Successful Decision   FAR      FRR
    15.05           65.2582               0.0000   34.7418
    15.40           71.8310               0.0000   28.1690
    15.75           83.5681               0.0000   16.4319
    16.10           92.9578               0.0000    7.0422
    16.45           97.1831               0.0000    2.8169
    16.80           99.5305               0.0000    0.4695
    17.15           99.5305               0.0000    0.4695
    17.50           97.1831               2.8169    0.0000
    17.85           96.2442               3.7558    0.0000
    18.20           95.3052               4.6948    0.0000
    18.55           95.3052               4.6948    0.0000

The successful decision in Table-7 corresponds to the rate of accepting registered persons and rejecting non-registered ones over all trials. The variation of FAR and FRR with
different threshold values is also shown in Fig. (7), where the obtained CER threshold is approximately 17.15 (the most suitable security threshold), giving a 99.53% successful decision rate.

Figure (7): FAR and FRR Performance Curves for different threshold levels using city-block distance

V. CONCLUSION
A speaker recognition system for 6304 speech samples that relies on LPC-derived features is presented, and acceptable results have been obtained. In the closed-set speaker identification, it is found that all tested LPC-derived features outperform the raw LPC coefficients, with 84% to 97% identification rates achieved. An improvement in identification rates with the LPC-based systems, to the range of 97% to 99%, is obtained by applying the preprocessing steps (preemphasis, DC offset removal, frame blocking, overlapping successive frames by 50% of the frame size, normalization and windowing) to the speech signal and increasing the predictor order (P). According to the speaker identification tests performed, one can deduce that LPCC exhibits superior results compared to the other LPC based coefficients. The accuracy can be further improved by weighting the cepstral coefficients, yielding identification rates close to 100%. The open-set speaker verification mode is also presented for 213 trials (random text-prompt sentences generated by the system) from 23 persons (1704 samples). The obtained verification rates, greater than 99%, using the proposed system model are considered quite suitable.

VI. REFERENCES
[1] Mustafa D. Al-Hassani, "Identification Techniques using Speech Signals and Fingerprints", Ph.D. Thesis, Department of Computer Science, Al-Nahrain University, Baghdad, Iraq, September 2006.
[2] Tiwalade O. Majekodunmi, Francis E.
Idachaba, "A Review of the Fingerprint, Speaker Recognition, Face Recognition and Iris Recognition Based Biometric Identification Technologies", Proceedings of the World Congress on Engineering, Vol. II, WCE, London, U.K., 2011.
[3] M. Eriksson, "Biometrics: Fingerprint Based Identity Verification", M.Sc. Thesis, Department of Computer Science, Umeå University, August 2001.
[4] Yuan Yujin, Zhao Peihua, Zhou Qun, "Research of Speaker Recognition Based on Combination of LPCC and MFCC", Electronic Information Engineering, Training and Experimental Center, Handan College, China, 2010.
[5] Anil K. Jain and Arun Ross, "Introduction to Biometrics", Springer Science+Business Media, LLC, USA, 2008.
[6] S. Gunnam, "Fingerprint Recognition and Analysis System", mini-thesis, Department of Electronics and Computer Technology, Indiana State University, Terre Haute, in partial fulfillment of the requirements for ECT 680, April 2004.
[7] E. Hjelms, "Biometric Systems: A Face Recognition Approach", Department of Informatics, University of Oslo, Oslo, Norway, 2000.
[8] Valentin Andrei, Constantin Paleologu, Corneliu Burileanu, "Implementation of a Real-Time Text Dependent Speaker Identification System", University "Politehnica" of Bucharest, Romania, 2011.
[9] E. Karpov, "Real-Time Speaker Identification", M.Sc. Thesis, Department of Computer Science, University of Joensuu, Finland, January 2003.
[10] T. Kinnunen, "Spectral Features for Automatic Text-Independent Speaker Recognition", Ph.D. Thesis, Department of Computer Science, University of Joensuu, Finland, December 2003.
[11] T. Chen, "The Past, Present, and Future of Speech Processing", IEEE Signal Processing Magazine, No. 5, May 1998.
[12] Biswajit Kar, Sandeep Bhatia and P. K. Dutta, "Audio-Visual Biometric Based Speaker Identification", International Conference on Computational Intelligence and Multimedia Applications, India, 2007.
[13] Antonio M. Peinado, José C. Segura, "Speech Recognition Over Digital Channels: Robustness and Standards", John Wiley & Sons Ltd, University of Granada, Spain, 2006.
[14] B. R.
Wildermoth, "Text-Independent Speaker Recognition using Source Based Features", M.Sc. Thesis, Griffith University, Australia, January 2001.
[15] Ch. Srinivasa Kumar, P. Mallikarjuna Rao, "Design of an Automatic Speaker Recognition System Using MFCC, Vector Quantization and LBG Algorithm", International Journal on Computer Science and Engineering (IJCSE), Vol. 3, No. 8, August 2011.
[16] Ciira wa Maina and John MacLaren Walsh, "Log Spectra Enhancement Using Speaker Dependent Priors for Speaker Verification", Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA 19104, 2011.
[17] Ning Wang, P. C. Ching, and Tan Lee, "Robust Speaker Verification Using Phase Information of Speech", Department of Electronic Engineering, The Chinese University of Hong Kong, 2010.
[18] Wai C. Chu, "Speech Coding Algorithms: Foundation and Evolution of Standardized Coders", John Wiley & Sons, Inc., California, USA, 2003.
[19] L. Rabiner, B.-H. Juang, "Fundamentals of Speech Recognition", Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1993.
[20] Yasir A.-M. Taleb, "Statistical and Wavelet Approaches for Speaker Identification", M.Sc. Thesis, Department of Computer Engineering, Al-Nahrain University, Iraq, June 2003.
[21] N. Batri, "Robust Spectral Parameter Coding in Speech Processing", M.Sc. Thesis, Department of Electrical Engineering, McGill University, Montreal, Canada, May 1998.
[22] J. P. Campbell, "Speaker Recognition: A Tutorial", Proceedings of the IEEE, Vol. 85, No. 9, 1997.
[23] A. K. Khandani and F. Lahouti, "Intra-frame and Inter-frame Coding of Speech LSF Parameters Using a Trellis Structure", Department of Electrical and Computer Engineering, University of Waterloo, Ontario, Canada, June 2000.
[24] J. Rothweiler, "A Root Finding Algorithm for Line Spectral Frequencies", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-99), March 15-19, U.S.A., 1999.
[25] S. E. Umbaugh, "Computer Vision and Image Processing", Prentice-Hall, Inc., U.S.A., 1998.
[26] R. C. Gonzalez, Richard E. Woods, "Digital Image Processing", Second Edition, Prentice-Hall, Inc., New Jersey, U.S.A., 2002.
[27] Lokesh S. Khedekar and Dr. A. S. Alvi, "Advanced Smart Credential Cum Unique Identification and Recognition System (ASCUIRS)", International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 1, 2013, pp. 97-104, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[28] Pallavi P. Ingale and Dr. S. L. Nalbalwar, "Novel Approach to Text Independent Speaker Identification", International Journal of Electronics and Communication Engineering & Technology (IJECET), Volume 3, Issue 2, 2012, pp. 87-93, ISSN Print: 0976-6464, ISSN Online: 0976-6472.
[29] Vijay M. Mane, Gaurav V. Chalkikar and Milind E. Rane, "Multiscale Iris Recognition System", International Journal of Electronics and Communication Engineering & Technology (IJECET), Volume 3, Issue 1, 2012, pp. 317-324, ISSN Print: 0976-6464, ISSN Online: 0976-6472.
[30] Dr. Mustafa Dhiaa Al-Hassani, Dr. Abdulkareem A. Kadhim and Dr. Venus W.
Samawi, "Fingerprint Identification Technique Based on Wavelet-Bands Selection Features (WBSF)", International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 3, 2013, pp. 308-323, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[31] Viplav Gautam, Saurabh Sharma, Swapnil Gautam and Gaurav Sharma, "Identification and Verification of Speaker using Mel Frequency Cepstral Coefficient", International Journal of Electronics and Communication Engineering & Technology (IJECET), Volume 3, Issue 2, 2012, pp. 413-423, ISSN Print: 0976-6464, ISSN Online: 0976-6472.