SlideShare ist ein Scribd-Unternehmen logo
1 von 5
Downloaden Sie, um offline zu lesen
IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p- ISSN: 2278-8727Volume 11, Issue 6 (May. - Jun. 2013), PP 46-50
www.iosrjournals.org
www.iosrjournals.org 46 | Page
Emotion Recognition Based On Audio Speech
Showkat Ahmad Dar1
, Zahid Khaki2
1
Department of Computer Science and Engineering, Islamic University of Science and Technology, Awantipora
2
Department of Electronics and Communication Engineering, Islamic University of Science and Technology
Jammu and Kashmir 192221, India
Abstract: Emotion recognition aims at automatically identifying the emotional or physical state of a human
being from his or her voice. The emotional and physical states of a speaker are known as emotional aspects of
speech and are included in the so called paralinguistic aspects. Although the emotional state does not alter the
linguistic content, it is an important factor in human communication, because it provides feedback information
in many applications as making a machine to recognize emotions from speech is not a new idea.
This paper presents automatic text independent speaker emotion recognition system using the pattern
classification methods such as the support vector mechanics (SVM) .Acoustic features are derived from the
speech signal at the segmental level. The segmental features are the features extracted from short frames (10-30
ms) of the speech. Acoustic features are derived from the speech signal at the segmental level. Acoustic features
are represented by Mel frequency cepstral coefficients .A 39 dimensional MFCC for each frame is used as
acoustic feature vector. The DFT based cepstral coeffiecients are computed by computing IDFT (inverse DFT)
of the log magnitude short time spectrum of speech signal. Mel wraped cepstrum is obtained by inserting an
intermediate step of transforming the frequencies before computing the IDFT. The Mel scale is based on human
perception of frequency of sound. SVMs are used to construct the optimal separating hyper plane for speech
features .SVMs are used to build the models for each speaker and to compare with the test speaker’s feature
vectors.
Keywords: Inverse discrete Fourier transforms(IDFT), linear prediction coefficients(LPC), linear prediction
cepstral coefficients(LPCC), Support vector mechanics(SVM), Artificial Neural Networks(ANN), Hidden
Markov Model(HMM), Gaussian Mixture Model(GMM).
I. Introduction
Emotion form a significant part of human interaction, and providing computers with the ability to
recognize and make use of such of such non-verbal information could open the way to new and exciting
paradigms in human- computer interaction. The human face is a key element of non-verbal communication, with
our facial emotion acting as a type of social semaphore.
Without the ability to express and recognize emotions , social interaction is less fulfilling and
stimulating , with either or both parties often unable to fully understand the meaning of the other.
This paper deals with the text-independent Speaker emotion recognition using pattern classification
techniques such as SVM (support vector machine).There are two major models for emotion recognition:
Discriminative model and generative model. Discriminative model consists of Artificial Neural Networks
(ANN) and Support Vector Machines (SVM), etc. Generative model consists of Hidden Markov Model (HMM)
and Gaussian Mixture Model (GMM), etc.Each of them can construct speaker models for speaker recognition
task.
II. Speaker Verification And Identification
The speaker recognition system can be operated in either identification mode or verification mode. In
speaker identification, the goal is to identify the speaker of utterances from a given population where as speaker
verification involves validating the identity claim by a person. Speaker recognition systems can be classified
into text dependent and text independent systems. Text dependent systems require the recitation of a
predominant text, where as text independent systems accept speech utterances of unrestricted text.
1. Speaker Emotion Verification
1.1 Speaker identity exits in the physiological and behavioral characteristics of a speaker. The
physiolocal characteristics correspond to the characteristics of the vocal tract system and that of the voice source
.The behavioral characteristics are due to the manner in which speaker have learnt to use their speech production
apparatus.
1.2 Speaker verification use the system or source features for capturing speaker specific information.
Some of the system features used for speaker verification task are formats, linear prediction coefficients (LPC)
Emotion Recognition Based On Audio Speech
www.iosrjournals.org 47 | Page
and linear prediction cepstral coefficients (LPCC). Source features such as Pitch, Intonation, and the linear
prediction residual signal information for speaker verification.
III. Linear Prediction Coefficients
The theory of liner prediction (LP) is closely linked to modeling of the vocal tract system, and relies up
on the fact that a particular speech sample may be predicted by a linear weight sum of the previous samples. The
number of previous samples used for prediction is known as the order of prediction. The weights applied to each
of the previous speech sample are known as linear prediction coefficients (LPC).They is calculated so as to
minimize the prediction error.
A given speech sample at time n, s (n), can be approximated as a linear combination of the past p speech
samples, such that
S (n) ≈ a1s (n − 1) + a2s (n − 2) + a3s (n − 3) + · · · + aps(n − p)
3.1 Preprocessing
To extract the features from the speech signal, the signal must be preprocessed and divided into
successive windows or analysis frames .so the following steps are performed before extracting the features.
Preemphasis: The higher frequencies of the speech signal are generally weak. As a result there may not be high
frequency energy present to extract features at the upper end of the frequency range.
3.2 Preemphasis
Preemphasis is used to boost the energy of the high frequency signals. The output of the preemphasis, ˆs (n) is
related the input s (n) by the difference equation
ˆs (n) = s (n) −αs (n − 1)
The typical value for α is 0.95
3.3 Frame blocking:
Speech analysis usually assumes that the signal properties change relatively slowly with time. This
allows examination of a short time window of speech to extract parameters presumed to remain fixed for the
duration of the window. Thus to model dynamic parameters, we must divide the signal into successive windows
or analysis frames, so that the parameters can be calculated often enough to follow the relevant changes. The
preemphasized speech signal, ˆs (n) is blocked into frames of N samples (frame size), with adjacent frames
being separated by M samples (frame shift). If we denote the ltframe of speech by xl (n), and there are L frames
within the entire speech signal then
xl (n) = ˆs(Ml + n), 0 ≤ n ≤ N − 1, 0 ≤ l ≤ N – 1
3.4 Windowing: The next step in the processing is to window each individual Frame so as to minimize the
signal discontinuities at the beginning and end of the frame. The window must be selected to taper the signal to
zero at the beginning and end of each frame. If we define the window as w (n), 0 ≤ n ≤ N-1, then the result of
windowing the signal is
˜xl(n) = xl(n)w(n), 0 ≤ n ≤ N − 1
IV. Linear Prediction Cepstral Coefficients
In many applications, Euclidean distance is used as a measure of similarity or dissimilarity between
feature vectors. The sharp peak of the LP spectrum may produce large errors in a similarity test, even for a slight
shift in the position of peaks .Hence the linear prediction coefficients are converted into cepstral coefficients
using a recursive relation.Cepstral coefficients represents the long magnitude spectrum,and the first few model
the smooth envelop of log spectrum .These coefficients can be obtained either from linear prediction
coefficients or from inverse discrete fourire transform (IDFT).
Emotion Recognition Based On Audio Speech
www.iosrjournals.org 48 | Page
IDFT (log (|DFT(x)|))
If x is LPC, the cepstral coefficients are known as Linear Prediction Cepstral
Coefficients (LPCC).
V. Acoustic Feature Extraction
MFCC are widely used spectral features for speaker recognition , computation of the MFCC defers
from the basic procedures described earlier, where the long magnitude spectrum is replaced with logarithm of
Mel scale warped spectrum, prior to inverse Fourier transform operation .Hence the MFCC can be represented
by the gross characteristics of the vocal tract system .i.e. IDFT (log|DFT(x)|))
5.1 Process visualizaton
Figure 5.1 will serve as the audio signal intended for the analysis in this case. The next step is to frame the audio
sample into portions of a predetermined size Figure 5.2 shows a frame belonging to a digital audio signal
Figure 5.2
Next a window function is applied to the frame, in this case a Hamming window was used on the individual
frames to smooth out the frame edges and reduce spectral distortion.
Emotion Recognition Based On Audio Speech
www.iosrjournals.org 49 | Page
Then the spectral coefficients are processed with Mel-scale filter to convert these to the Mel scale. The filter
bank used may be viewed in the Fig 5.3 the logarithms of these Mel spectral coefficients are then transformed to
the frequency domain with the inverse DFT.
Fig 5.3 Audio Frame
Fig 5.4 Audio window
VI. Conclusion
This paper allows for asynchrony in the audio and vocabulary states, while preserving the natural
dependency of the audio signals. The audio sequences are treated separately and are no need for the problem.
The advantage of the audio reorganization is confirmed by the experimental results. This model can be
improved applied to a variety of human/machine system. In future work, we will improve current model to
increase the efficiency, besides this, the research on recognition of emotion intensity will be performed through
the analysis of audio feature s, which is different from the approach in. Also, we notice that some new
dimensional reduction and the pattern classification methods like tensor based analysis proposed recently; we
will carry out study on its application in emotion recognition field.
Emotion Recognition Based On Audio Speech
www.iosrjournals.org 50 | Page
Reference
[1]. Pongtep Angkititrakul and John H. L. Hansen, “Discrimination in–Set/out-of-set Speaker Recognition Systems” IEEE
TRANSACTIONS ON AUDIO , SPEECH , AND LANGUAGE PROCESSING, Vol.15, no.2, Feb.2007.
[2]. Ran D. Zilca , Brain Kingsbury, Jiri Navrati, Ganesh N. Ramasamy “PSEUDO PITCH SYNCHRONOUS ANALYSIS OF SPEECH
WITH APPLICATION TO SPEAKER RECOGNITION”, IEEE TRANSACTION ON AUDIO , SPEECH, AND LANGUAGE
PROCESSING Vol.14, no. 2, Mar.2006.
[3]. “yildirim, S. Bulut, M., Lee , C.M., kazemzadeh, A., Busso, C., Deng, Z., Lee, S., Narayanan, S., 2004. An Acoustic Study Of
Emotions Expressed In Speech. In: Proc. Internat. Conf. On Spoken Language Processing (ICSLP ’
04), Korea, Vol.
[4]. K. Sri Rama Murty and B. Yengnanarayana “ Combining Evidence from Residual Phase and MFCC Features for Speaker Emotion
Recognition” IEEE signals processing letters, Vol.13,no. , jan.2006
[5]. Volunteers in Technical Assistance (VITA).
[6]. Guillermo Garcia, Sung-kyo Jung, ”A Statistical Approach To Performance Evaluation Of Speaker Emotion Recognition
Systems” TECH/SSTP Lab., France Telecom R&D 22307 Lannion, France. W.B. Mikhel and Pravinkumar Premakanthan, “An
Improved Speaker Emotion Identification Technique Employing Multiple Representations of LPC”, in proceedings of IEEE ,
march 2000
[7]. Sachin S. Kajarekar “Four Weighing And Fusion : A Cepstral –SVM System For Speaker Emotion Recognition VOL. 23, NO. 3,
AUGUST 2008. .

Weitere ähnliche Inhalte

Was ist angesagt?

Emotion based music player
Emotion based music playerEmotion based music player
Emotion based music playerNizam Muhammed
 
Introduction to emotion detection
Introduction to emotion detectionIntroduction to emotion detection
Introduction to emotion detectionTyler Schnoebelen
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural networkSumeet Kakani
 
Speech recognition system seminar
Speech recognition system seminarSpeech recognition system seminar
Speech recognition system seminarDiptimaya Sarangi
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentationhimanshubhatti
 
Single image haze removal
Single image haze removalSingle image haze removal
Single image haze removalMohsinGhazi2
 
Hand gesture recognition
Hand gesture recognitionHand gesture recognition
Hand gesture recognitionBhawana Singh
 
Speech emotion recognition from audio converted
Speech emotion recognition from audio convertedSpeech emotion recognition from audio converted
Speech emotion recognition from audio converteddataalcott
 
Emotion detection using cnn.pptx
Emotion detection using cnn.pptxEmotion detection using cnn.pptx
Emotion detection using cnn.pptxRADO7900
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AISaurav Shrestha
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognitionananth
 
Brain Tumour Detection.pptx
Brain Tumour Detection.pptxBrain Tumour Detection.pptx
Brain Tumour Detection.pptxRevolverRaja2
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognitionCharu Joshi
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversionankit_saluja
 
Synops emotion recognize
Synops emotion recognizeSynops emotion recognize
Synops emotion recognizeAvdhesh Gupta
 
Facial Expression Recognition System using Deep Convolutional Neural Networks.
Facial Expression Recognition  System using Deep Convolutional Neural Networks.Facial Expression Recognition  System using Deep Convolutional Neural Networks.
Facial Expression Recognition System using Deep Convolutional Neural Networks.Sandeep Wakchaure
 
digital image processing
digital image processingdigital image processing
digital image processingAbinaya B
 

Was ist angesagt? (20)

Emotion based music player
Emotion based music playerEmotion based music player
Emotion based music player
 
Introduction to emotion detection
Introduction to emotion detectionIntroduction to emotion detection
Introduction to emotion detection
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural network
 
Speech recognition system seminar
Speech recognition system seminarSpeech recognition system seminar
Speech recognition system seminar
 
EMOTION DETECTION USING AI
EMOTION DETECTION USING AIEMOTION DETECTION USING AI
EMOTION DETECTION USING AI
 
Human Emotion Recognition
Human Emotion RecognitionHuman Emotion Recognition
Human Emotion Recognition
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentation
 
Single image haze removal
Single image haze removalSingle image haze removal
Single image haze removal
 
Hand gesture recognition
Hand gesture recognitionHand gesture recognition
Hand gesture recognition
 
Speech emotion recognition from audio converted
Speech emotion recognition from audio convertedSpeech emotion recognition from audio converted
Speech emotion recognition from audio converted
 
Emotion detection using cnn.pptx
Emotion detection using cnn.pptxEmotion detection using cnn.pptx
Emotion detection using cnn.pptx
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AI
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognition
 
Brain Tumour Detection.pptx
Brain Tumour Detection.pptxBrain Tumour Detection.pptx
Brain Tumour Detection.pptx
 
Noise Models
Noise ModelsNoise Models
Noise Models
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognition
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 
Synops emotion recognize
Synops emotion recognizeSynops emotion recognize
Synops emotion recognize
 
Facial Expression Recognition System using Deep Convolutional Neural Networks.
Facial Expression Recognition  System using Deep Convolutional Neural Networks.Facial Expression Recognition  System using Deep Convolutional Neural Networks.
Facial Expression Recognition System using Deep Convolutional Neural Networks.
 
digital image processing
digital image processingdigital image processing
digital image processing
 

Andere mochten auch

A hybrid English to Malayalam machine translator for Malayalam content creati...
A hybrid English to Malayalam machine translator for Malayalam content creati...A hybrid English to Malayalam machine translator for Malayalam content creati...
A hybrid English to Malayalam machine translator for Malayalam content creati...IOSR Journals
 
A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...
A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...
A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...IOSR Journals
 
Application of SVM Technique for Three Phase Three Leg Ac/Ac Converter Topology
Application of SVM Technique for Three Phase Three Leg Ac/Ac Converter TopologyApplication of SVM Technique for Three Phase Three Leg Ac/Ac Converter Topology
Application of SVM Technique for Three Phase Three Leg Ac/Ac Converter TopologyIOSR Journals
 
Environmental health Effect and Air Pollution from cigarette smokers in Cross...
Environmental health Effect and Air Pollution from cigarette smokers in Cross...Environmental health Effect and Air Pollution from cigarette smokers in Cross...
Environmental health Effect and Air Pollution from cigarette smokers in Cross...IOSR Journals
 
A Novel Method to Improve the Control Performance of Multilevel Inverter by M...
A Novel Method to Improve the Control Performance of Multilevel Inverter by M...A Novel Method to Improve the Control Performance of Multilevel Inverter by M...
A Novel Method to Improve the Control Performance of Multilevel Inverter by M...IOSR Journals
 
Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...
Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...
Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...IOSR Journals
 
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...IOSR Journals
 
Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...
Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...
Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...IOSR Journals
 
A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...IOSR Journals
 
A Combined Approach of Software Metrics and Software Fault Analysis to Estima...
A Combined Approach of Software Metrics and Software Fault Analysis to Estima...A Combined Approach of Software Metrics and Software Fault Analysis to Estima...
A Combined Approach of Software Metrics and Software Fault Analysis to Estima...IOSR Journals
 
Propose Data Mining AR-GA Model to Advance Crime analysis
Propose Data Mining AR-GA Model to Advance Crime analysisPropose Data Mining AR-GA Model to Advance Crime analysis
Propose Data Mining AR-GA Model to Advance Crime analysisIOSR Journals
 
Mobility Contrast Effect on Environment in Manet as Mobility Model
Mobility Contrast Effect on Environment in Manet as Mobility ModelMobility Contrast Effect on Environment in Manet as Mobility Model
Mobility Contrast Effect on Environment in Manet as Mobility ModelIOSR Journals
 
Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...
Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...
Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...IOSR Journals
 
A Soft-Switching DC/DC Converter with High Voltage Gain
A Soft-Switching DC/DC Converter with High Voltage GainA Soft-Switching DC/DC Converter with High Voltage Gain
A Soft-Switching DC/DC Converter with High Voltage GainIOSR Journals
 
A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...
A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...
A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...IOSR Journals
 
Auto Finding and Resolving Distributed Firewall Policy
Auto Finding and Resolving Distributed Firewall PolicyAuto Finding and Resolving Distributed Firewall Policy
Auto Finding and Resolving Distributed Firewall PolicyIOSR Journals
 
Generation and Implementation of Barker and Nested Binary codes
Generation and Implementation of Barker and Nested Binary codesGeneration and Implementation of Barker and Nested Binary codes
Generation and Implementation of Barker and Nested Binary codesIOSR Journals
 
Open Source Software Survivability Analysis Using Communication Pattern Valid...
Open Source Software Survivability Analysis Using Communication Pattern Valid...Open Source Software Survivability Analysis Using Communication Pattern Valid...
Open Source Software Survivability Analysis Using Communication Pattern Valid...IOSR Journals
 

Andere mochten auch (20)

A hybrid English to Malayalam machine translator for Malayalam content creati...
A hybrid English to Malayalam machine translator for Malayalam content creati...A hybrid English to Malayalam machine translator for Malayalam content creati...
A hybrid English to Malayalam machine translator for Malayalam content creati...
 
K0956568
K0956568K0956568
K0956568
 
E0432733
E0432733E0432733
E0432733
 
A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...
A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...
A Study of Various Graphical Passwords Authentication Schemes Using Ai Hans P...
 
Application of SVM Technique for Three Phase Three Leg Ac/Ac Converter Topology
Application of SVM Technique for Three Phase Three Leg Ac/Ac Converter TopologyApplication of SVM Technique for Three Phase Three Leg Ac/Ac Converter Topology
Application of SVM Technique for Three Phase Three Leg Ac/Ac Converter Topology
 
Environmental health Effect and Air Pollution from cigarette smokers in Cross...
Environmental health Effect and Air Pollution from cigarette smokers in Cross...Environmental health Effect and Air Pollution from cigarette smokers in Cross...
Environmental health Effect and Air Pollution from cigarette smokers in Cross...
 
A Novel Method to Improve the Control Performance of Multilevel Inverter by M...
A Novel Method to Improve the Control Performance of Multilevel Inverter by M...A Novel Method to Improve the Control Performance of Multilevel Inverter by M...
A Novel Method to Improve the Control Performance of Multilevel Inverter by M...
 
Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...
Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...
Experimental Studies on Bioregeneration of Activated Carbon Contaminated With...
 
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
Impulsion of Mining Paradigm with Density Based Clustering of Multi Dimension...
 
Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...
Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...
Improved Fuzzy Control Strategy for Power Quality in Distributed Generation’s...
 
A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...
 
A Combined Approach of Software Metrics and Software Fault Analysis to Estima...
A Combined Approach of Software Metrics and Software Fault Analysis to Estima...A Combined Approach of Software Metrics and Software Fault Analysis to Estima...
A Combined Approach of Software Metrics and Software Fault Analysis to Estima...
 
Propose Data Mining AR-GA Model to Advance Crime analysis
Propose Data Mining AR-GA Model to Advance Crime analysisPropose Data Mining AR-GA Model to Advance Crime analysis
Propose Data Mining AR-GA Model to Advance Crime analysis
 
Mobility Contrast Effect on Environment in Manet as Mobility Model
Mobility Contrast Effect on Environment in Manet as Mobility ModelMobility Contrast Effect on Environment in Manet as Mobility Model
Mobility Contrast Effect on Environment in Manet as Mobility Model
 
Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...
Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...
Some Aspects of Stress Distribution and Effect of Voids Having Different Gase...
 
A Soft-Switching DC/DC Converter with High Voltage Gain
A Soft-Switching DC/DC Converter with High Voltage GainA Soft-Switching DC/DC Converter with High Voltage Gain
A Soft-Switching DC/DC Converter with High Voltage Gain
 
A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...
A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...
A New Single-Stage Multilevel Type Full-Bridge Converter Applied to closed lo...
 
Auto Finding and Resolving Distributed Firewall Policy
Auto Finding and Resolving Distributed Firewall PolicyAuto Finding and Resolving Distributed Firewall Policy
Auto Finding and Resolving Distributed Firewall Policy
 
Generation and Implementation of Barker and Nested Binary codes
Generation and Implementation of Barker and Nested Binary codesGeneration and Implementation of Barker and Nested Binary codes
Generation and Implementation of Barker and Nested Binary codes
 
Open Source Software Survivability Analysis Using Communication Pattern Valid...
Open Source Software Survivability Analysis Using Communication Pattern Valid...Open Source Software Survivability Analysis Using Communication Pattern Valid...
Open Source Software Survivability Analysis Using Communication Pattern Valid...
 

Ähnlich wie Emotion Recognition Based On Audio Speech

Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnAutomatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnijcsa
 
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...cscpconf
 
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...Cemal Ardil
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueCSCJournals
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...IDES Editor
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...AIRCC Publishing Corporation
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...ijcsit
 
Malayalam Isolated Digit Recognition using HMM and PLP cepstral coefficient
Malayalam Isolated Digit Recognition using HMM and PLP cepstral coefficientMalayalam Isolated Digit Recognition using HMM and PLP cepstral coefficient
Malayalam Isolated Digit Recognition using HMM and PLP cepstral coefficientijait
 
Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features  Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features ijsc
 
Speaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home ApplicationsSpeaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home ApplicationsRoger Gomes
 
Broad phoneme classification using signal based features
Broad phoneme classification using signal based featuresBroad phoneme classification using signal based features
Broad phoneme classification using signal based featuresijsc
 
Isolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantizationIsolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantizationeSAT Journals
 
Isolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantizationIsolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantizationeSAT Publishing House
 
Voice biometric recognition
Voice biometric recognitionVoice biometric recognition
Voice biometric recognitionphyuhsan
 
Human Emotion Recognition From Speech
Human Emotion Recognition From SpeechHuman Emotion Recognition From Speech
Human Emotion Recognition From SpeechIJERA Editor
 
Audio/Speech Signal Analysis for Depression
Audio/Speech Signal Analysis for DepressionAudio/Speech Signal Analysis for Depression
Audio/Speech Signal Analysis for Depressionijsrd.com
 

Ähnlich wie Emotion Recognition Based On Audio Speech (20)

D111823
D111823D111823
D111823
 
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnAutomatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
 
A017410108
A017410108A017410108
A017410108
 
A017410108
A017410108A017410108
A017410108
 
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
 
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
Malayalam Isolated Digit Recognition using HMM and PLP cepstral coefficient
Malayalam Isolated Digit Recognition using HMM and PLP cepstral coefficientMalayalam Isolated Digit Recognition using HMM and PLP cepstral coefficient
Malayalam Isolated Digit Recognition using HMM and PLP cepstral coefficient
 
Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features  Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features
 
Speaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home ApplicationsSpeaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home Applications
 
Broad phoneme classification using signal based features
Broad phoneme classification using signal based featuresBroad phoneme classification using signal based features
Broad phoneme classification using signal based features
 
Ijetcas14 426
Ijetcas14 426Ijetcas14 426
Ijetcas14 426
 
Isolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantizationIsolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantization
 
Isolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantizationIsolated word recognition using lpc & vector quantization
Isolated word recognition using lpc & vector quantization
 
Voice biometric recognition
Voice biometric recognitionVoice biometric recognition
Voice biometric recognition
 
Human Emotion Recognition From Speech
Human Emotion Recognition From SpeechHuman Emotion Recognition From Speech
Human Emotion Recognition From Speech
 
Audio/Speech Signal Analysis for Depression
Audio/Speech Signal Analysis for DepressionAudio/Speech Signal Analysis for Depression
Audio/Speech Signal Analysis for Depression
 

Mehr von IOSR Journals (20)

A011140104
A011140104A011140104
A011140104
 
M0111397100
M0111397100M0111397100
M0111397100
 
L011138596
L011138596L011138596
L011138596
 
K011138084
K011138084K011138084
K011138084
 
J011137479
J011137479J011137479
J011137479
 
I011136673
I011136673I011136673
I011136673
 
G011134454
G011134454G011134454
G011134454
 
H011135565
H011135565H011135565
H011135565
 
F011134043
F011134043F011134043
F011134043
 
E011133639
E011133639E011133639
E011133639
 
D011132635
D011132635D011132635
D011132635
 
C011131925
C011131925C011131925
C011131925
 
B011130918
B011130918B011130918
B011130918
 
A011130108
A011130108A011130108
A011130108
 
I011125160
I011125160I011125160
I011125160
 
H011124050
H011124050H011124050
H011124050
 
G011123539
G011123539G011123539
G011123539
 
F011123134
F011123134F011123134
F011123134
 
E011122530
E011122530E011122530
E011122530
 
D011121524
D011121524D011121524
D011121524
 

Kürzlich hochgeladen

Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stageAbc194748
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...HenryBriggs2
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesRAJNEESHKUMAR341697
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086anil_gaur
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfsmsksolar
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersMairaAshraf6
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsvanyagupta248
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksMagic Marks
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Servicemeghakumariji156
 

Kürzlich hochgeladen (20)

Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic Marks
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 

Emotion Recognition Based On Audio Speech

  • 1. IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p- ISSN: 2278-8727Volume 11, Issue 6 (May. - Jun. 2013), PP 46-50 www.iosrjournals.org www.iosrjournals.org 46 | Page Emotion Recognition Based On Audio Speech Showkat Ahmad Dar1 , Zahid Khaki2 1 Department of Computer Science and Engineering, Islamic University of Science and Technology, Awantipora 2 Department of Electronics and Communication Engineering, Islamic University of Science and Technology Jammu and Kashmir 192221, India Abstract: Emotion recognition aims at automatically identifying the emotional or physical state of a human being from his or her voice. The emotional and physical states of a speaker are known as emotional aspects of speech and are included in the so called paralinguistic aspects. Although the emotional state does not alter the linguistic content, it is an important factor in human communication, because it provides feedback information in many applications as making a machine to recognize emotions from speech is not a new idea. This paper presents automatic text independent speaker emotion recognition system using the pattern classification methods such as the support vector mechanics (SVM) .Acoustic features are derived from the speech signal at the segmental level. The segmental features are the features extracted from short frames (10-30 ms) of the speech. Acoustic features are derived from the speech signal at the segmental level. Acoustic features are represented by Mel frequency cepstral coefficients .A 39 dimensional MFCC for each frame is used as acoustic feature vector. The DFT based cepstral coeffiecients are computed by computing IDFT (inverse DFT) of the log magnitude short time spectrum of speech signal. Mel wraped cepstrum is obtained by inserting an intermediate step of transforming the frequencies before computing the IDFT. The Mel scale is based on human perception of frequency of sound. SVMs are used to construct the optimal separating hyper plane for speech features .SVMs are used to build the models for each speaker and to compare with the test speaker’s feature vectors. Keywords: Inverse discrete Fourier transforms(IDFT), linear prediction coefficients(LPC), linear prediction cepstral coefficients(LPCC), Support vector mechanics(SVM), Artificial Neural Networks(ANN), Hidden Markov Model(HMM), Gaussian Mixture Model(GMM). I. Introduction Emotion form a significant part of human interaction, and providing computers with the ability to recognize and make use of such of such non-verbal information could open the way to new and exciting paradigms in human- computer interaction. The human face is a key element of non-verbal communication, with our facial emotion acting as a type of social semaphore. Without the ability to express and recognize emotions , social interaction is less fulfilling and stimulating , with either or both parties often unable to fully understand the meaning of the other. This paper deals with the text-independent Speaker emotion recognition using pattern classification techniques such as SVM (support vector machine).There are two major models for emotion recognition: Discriminative model and generative model. Discriminative model consists of Artificial Neural Networks (ANN) and Support Vector Machines (SVM), etc. Generative model consists of Hidden Markov Model (HMM) and Gaussian Mixture Model (GMM), etc.Each of them can construct speaker models for speaker recognition task. II. Speaker Verification And Identification The speaker recognition system can be operated in either identification mode or verification mode. In speaker identification, the goal is to identify the speaker of utterances from a given population where as speaker verification involves validating the identity claim by a person. Speaker recognition systems can be classified into text dependent and text independent systems. Text dependent systems require the recitation of a predominant text, where as text independent systems accept speech utterances of unrestricted text. 1. Speaker Emotion Verification 1.1 Speaker identity exits in the physiological and behavioral characteristics of a speaker. The physiolocal characteristics correspond to the characteristics of the vocal tract system and that of the voice source .The behavioral characteristics are due to the manner in which speaker have learnt to use their speech production apparatus. 1.2 Speaker verification use the system or source features for capturing speaker specific information. Some of the system features used for speaker verification task are formats, linear prediction coefficients (LPC)
  • 2. Emotion Recognition Based On Audio Speech www.iosrjournals.org 47 | Page and linear prediction cepstral coefficients (LPCC). Source features such as Pitch, Intonation, and the linear prediction residual signal information for speaker verification. III. Linear Prediction Coefficients The theory of liner prediction (LP) is closely linked to modeling of the vocal tract system, and relies up on the fact that a particular speech sample may be predicted by a linear weight sum of the previous samples. The number of previous samples used for prediction is known as the order of prediction. The weights applied to each of the previous speech sample are known as linear prediction coefficients (LPC).They is calculated so as to minimize the prediction error. A given speech sample at time n, s (n), can be approximated as a linear combination of the past p speech samples, such that S (n) ≈ a1s (n − 1) + a2s (n − 2) + a3s (n − 3) + · · · + aps(n − p) 3.1 Preprocessing To extract the features from the speech signal, the signal must be preprocessed and divided into successive windows or analysis frames .so the following steps are performed before extracting the features. Preemphasis: The higher frequencies of the speech signal are generally weak. As a result there may not be high frequency energy present to extract features at the upper end of the frequency range. 3.2 Preemphasis Preemphasis is used to boost the energy of the high frequency signals. The output of the preemphasis, ˆs (n) is related the input s (n) by the difference equation ˆs (n) = s (n) −αs (n − 1) The typical value for α is 0.95 3.3 Frame blocking: Speech analysis usually assumes that the signal properties change relatively slowly with time. This allows examination of a short time window of speech to extract parameters presumed to remain fixed for the duration of the window. Thus to model dynamic parameters, we must divide the signal into successive windows or analysis frames, so that the parameters can be calculated often enough to follow the relevant changes. The preemphasized speech signal, ˆs (n) is blocked into frames of N samples (frame size), with adjacent frames being separated by M samples (frame shift). If we denote the ltframe of speech by xl (n), and there are L frames within the entire speech signal then xl (n) = ˆs(Ml + n), 0 ≤ n ≤ N − 1, 0 ≤ l ≤ N – 1 3.4 Windowing: The next step in the processing is to window each individual Frame so as to minimize the signal discontinuities at the beginning and end of the frame. The window must be selected to taper the signal to zero at the beginning and end of each frame. If we define the window as w (n), 0 ≤ n ≤ N-1, then the result of windowing the signal is ˜xl(n) = xl(n)w(n), 0 ≤ n ≤ N − 1 IV. Linear Prediction Cepstral Coefficients In many applications, Euclidean distance is used as a measure of similarity or dissimilarity between feature vectors. The sharp peak of the LP spectrum may produce large errors in a similarity test, even for a slight shift in the position of peaks .Hence the linear prediction coefficients are converted into cepstral coefficients using a recursive relation.Cepstral coefficients represents the long magnitude spectrum,and the first few model the smooth envelop of log spectrum .These coefficients can be obtained either from linear prediction coefficients or from inverse discrete fourire transform (IDFT).
  • 3. Emotion Recognition Based On Audio Speech www.iosrjournals.org 48 | Page IDFT (log (|DFT(x)|)) If x is LPC, the cepstral coefficients are known as Linear Prediction Cepstral Coefficients (LPCC). V. Acoustic Feature Extraction MFCC are widely used spectral features for speaker recognition , computation of the MFCC defers from the basic procedures described earlier, where the long magnitude spectrum is replaced with logarithm of Mel scale warped spectrum, prior to inverse Fourier transform operation .Hence the MFCC can be represented by the gross characteristics of the vocal tract system .i.e. IDFT (log|DFT(x)|)) 5.1 Process visualizaton Figure 5.1 will serve as the audio signal intended for the analysis in this case. The next step is to frame the audio sample into portions of a predetermined size Figure 5.2 shows a frame belonging to a digital audio signal Figure 5.2 Next a window function is applied to the frame, in this case a Hamming window was used on the individual frames to smooth out the frame edges and reduce spectral distortion.
  • 4. Emotion Recognition Based On Audio Speech www.iosrjournals.org 49 | Page Then the spectral coefficients are processed with Mel-scale filter to convert these to the Mel scale. The filter bank used may be viewed in the Fig 5.3 the logarithms of these Mel spectral coefficients are then transformed to the frequency domain with the inverse DFT. Fig 5.3 Audio Frame Fig 5.4 Audio window VI. Conclusion This paper allows for asynchrony in the audio and vocabulary states, while preserving the natural dependency of the audio signals. The audio sequences are treated separately and are no need for the problem. The advantage of the audio reorganization is confirmed by the experimental results. This model can be improved applied to a variety of human/machine system. In future work, we will improve current model to increase the efficiency, besides this, the research on recognition of emotion intensity will be performed through the analysis of audio feature s, which is different from the approach in. Also, we notice that some new dimensional reduction and the pattern classification methods like tensor based analysis proposed recently; we will carry out study on its application in emotion recognition field.
  • 5. Emotion Recognition Based On Audio Speech www.iosrjournals.org 50 | Page Reference [1]. Pongtep Angkititrakul and John H. L. Hansen, “Discrimination in–Set/out-of-set Speaker Recognition Systems” IEEE TRANSACTIONS ON AUDIO , SPEECH , AND LANGUAGE PROCESSING, Vol.15, no.2, Feb.2007. [2]. Ran D. Zilca , Brain Kingsbury, Jiri Navrati, Ganesh N. Ramasamy “PSEUDO PITCH SYNCHRONOUS ANALYSIS OF SPEECH WITH APPLICATION TO SPEAKER RECOGNITION”, IEEE TRANSACTION ON AUDIO , SPEECH, AND LANGUAGE PROCESSING Vol.14, no. 2, Mar.2006. [3]. “yildirim, S. Bulut, M., Lee , C.M., kazemzadeh, A., Busso, C., Deng, Z., Lee, S., Narayanan, S., 2004. An Acoustic Study Of Emotions Expressed In Speech. In: Proc. Internat. Conf. On Spoken Language Processing (ICSLP ’ 04), Korea, Vol. [4]. K. Sri Rama Murty and B. Yengnanarayana “ Combining Evidence from Residual Phase and MFCC Features for Speaker Emotion Recognition” IEEE signals processing letters, Vol.13,no. , jan.2006 [5]. Volunteers in Technical Assistance (VITA). [6]. Guillermo Garcia, Sung-kyo Jung, ”A Statistical Approach To Performance Evaluation Of Speaker Emotion Recognition Systems” TECH/SSTP Lab., France Telecom R&D 22307 Lannion, France. W.B. Mikhel and Pravinkumar Premakanthan, “An Improved Speaker Emotion Identification Technique Employing Multiple Representations of LPC”, in proceedings of IEEE , march 2000 [7]. Sachin S. Kajarekar “Four Weighing And Fusion : A Cepstral –SVM System For Speaker Emotion Recognition VOL. 23, NO. 3, AUGUST 2008. .