A study of EMG based Speech Recognition

Mr. T. JAYASANKAR
ASSISTANT PROFESSOR
DEPARTMENT OF ECE
ANNA UNIVERSITY, BIT CAMPUS,
TIRUCHIRAPALLI.
A STUDY ON
ELECTROMYOGRAPHY
BASED SPEECH
RECOGNITION
K. SRIRAM, D. VETRIVEL, S. VENGATESH
U.G. SCHOLARS, DEPARTMENT OF ECE,
ANNA UNIVERSITY,
BIT CAMPUS, TIRUCHIRAPALLI.

ABSTRACT
We present our recent study on EMG-based speech
recognition using a Silent Speech Interface (SSI).
Electromyography (EMG) detects the electric potential
generated by muscle cells when these cells are
electrically activated. This paper helps in choosing
among the available techniques by comparing their
relative merits and demerits, and describes the
techniques developed for each stage of speech
recognition. It concludes with a brief explanation of
matching techniques and of the factors used to analyze
system performance.

PHONETICS
• Phonetics is the precise study of human speech
sounds.
• An appropriate knowledge of phonetics enables a
person to acquire a correct knowledge of
pronunciation and describes how sounds are made.
BRANCHES:
• Articulatory phonetics
• Acoustic phonetics
• Auditory phonetics

WHY WE GO FOR SSI…
• Even the best speech recognition systems sometimes make
errors. If there is noise or some other sound in the room
(e.g. the television or a kettle boiling), the number of errors
will increase.
• Speech Recognition works best if the microphone is close to
the user. More distant microphones (e.g. on a table or wall)
will tend to increase the number of errors.
• Confidential and private communication in public places is
difficult due to the clearly audible speech.

• Silent Speech Interface is an electronic device
that supports speech communication to take
place without the necessity of emitting an
audible acoustic signal by a human being.
• As such it is a type of electronic lip reading.

ELECTROMYOGRAPHY
• The Silent Speech Interface uses electromyography,
monitoring the tiny muscular movements that occur
when we speak.
• The monitored signals are converted into electrical
pulses that can then be turned into speech, without
a sound being uttered.
• Electromyography monitors these tiny muscular
movements and the pulses they generate. The
transducers involved convert the pulses into electric
signals.

• Electromyography sensors attached to the face
record the electric signals produced by the facial
muscles, and the system compares them with
pre-recorded signal patterns of spoken words.
• When there is a match, the sound is recognized by
the system, which then performs any required task.

STAGES IN SPEECH RECOGNITION
ACQUIRING THE EMG SIGNAL
PRE PROCESSING
SIGNAL PROCESSING
FEATURE EXTRACTION

ACQUIRING EMG SIGNAL
• Silver/silver chloride (Ag/AgCl) surface electrodes.
• Three-channel electromyogram recording system:
– M. digastricus,
– M. zygomaticus major,
– M. orbicularis
• Six-channel electromyogram recording system:
– the levator anguli oris,
– the zygomaticus major,
– the platysma,
– the depressor anguli oris,
– the anterior belly of the digastric, and
– the tongue.

SIX CHANNEL EMG DATA ACQUISITION SYSTEM
EMG SIGNAL → HIGH PASS FILTER (60 Hz) → SAMPLER (600 Hz) → FILTERED EMG SIGNALS

FOUR CHANNEL EMG DATA ACQUISITION SYSTEM
EMG SIGNAL → NOTCH FILTER (60 Hz) → SAMPLER (2 kHz) → FILTERED EMG SIGNALS
Electrode positions:
• right and left areas of the throat near the chin cleft
• 0.5 to 1 centimeter from the left and right sides of the
larynx.

PREPROCESSING
FILTERS:
• Low-pass filter (LPF)
• High-pass filter (HPF)
• Band-pass filter (BPF)
• Notch filters
– remove power line interference
WAVELET TRANSFORM:
• The wavelet transform is not only a very promising technique
for time-frequency analysis but also a noise reduction
method.
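The notch-filter step listed above can be sketched as follows. This is a minimal illustration, not the filter used in the study: it assumes a second-order (biquad) notch in the widely used audio-EQ cookbook form, takes the 60 Hz notch frequency and 2 kHz sampling rate from the four-channel acquisition system, and uses a hypothetical quality factor of 30 and synthetic sine waves in place of real EMG data.

```python
import math

def notch_coefficients(f0, fs, q=30.0):
    """Biquad notch coefficients, normalised so that a[0] == 1 (q is an assumed value)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def apply_biquad(b, a, x):
    """Filter the sample list x with a direct-form I biquad."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

FS = 2000.0   # sampling rate of the four-channel system
N = 4000      # two seconds of synthetic data

# Synthetic stand-ins: 60 Hz power line hum plus a weaker 150 Hz "signal".
hum = [math.sin(2 * math.pi * 60 * n / FS) for n in range(N)]
emg_like = [0.5 * math.sin(2 * math.pi * 150 * n / FS) for n in range(N)]
mixed = [h + e for h, e in zip(hum, emg_like)]

b, a = notch_coefficients(60.0, FS)
filtered = apply_biquad(b, a, mixed)
```

Once the filter has settled, the tail of `filtered` should carry roughly the energy of `emg_like` alone, because the 60 Hz component is suppressed while the 150 Hz component passes almost unchanged.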

ACCURATE RECOGNITION
• Principal Component Analysis (PCA)
– Uses an orthogonal transformation to convert correlated
variables into uncorrelated variables.
• Linear Predictive Coding (LPC)
– LPC is a tool used mostly in audio signal processing to represent
the spectral envelope of a digital speech signal in compressed form.
– The current speech sample can be closely approximated as a linear
combination of past samples.
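The LPC idea above, predicting the current sample from a linear combination of past samples, can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is one standard way to estimate the predictor coefficients and is an assumption here, since the slides do not say which estimation method was used. The test signal is a synthetic second-order recursion with hypothetical coefficients 1.2 and -0.81.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sample list x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def lpc(x, order):
    """Return a[1..order] such that x[n] is approximated by sum_k a[k] * x[n - k]."""
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy, shrinks at every step
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]

# Synthetic signal that obeys x[n] = 1.2 * x[n-1] - 0.81 * x[n-2] exactly.
signal = [1.0, 1.2]
for _ in range(398):
    signal.append(1.2 * signal[-1] - 0.81 * signal[-2])

coeffs = lpc(signal, 2)
```

For this decaying signal the estimated `coeffs` should come out very close to the generating values 1.2 and -0.81, so two numbers summarise the whole waveform. That is the compression the slide refers to.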

SIGNAL PROCESSING
• FILTERING
– 50/60 Hz notch filter
• NORMALIZATION
– The EMG signal is very sensitive to changes in electrode
position and to temperature.
– Hence, to compare amplitudes across recordings, it is very important to apply a
normalization process to each recording in order to compensate for these changes.
• SIGNAL INTERPRETATION
» Short Time Fourier Transform (STFT)
» Wavelet Transform (WT)
» Wavelet Packet Transform (WPT)
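The per-recording normalization described above can be illustrated with a z-score (zero mean, unit standard deviation). The slides do not name a specific normalization scheme, so z-scoring is an assumption, and the sample values below are made up for illustration.

```python
import statistics

def zscore(recording):
    """Scale one recording to zero mean and unit standard deviation."""
    mean = statistics.mean(recording)
    std = statistics.pstdev(recording)
    if std == 0:
        return [0.0] * len(recording)  # flat recording: nothing to scale
    return [(v - mean) / std for v in recording]

# The same hypothetical muscle activity recorded in two sessions:
# session_b has a different gain and offset, e.g. after the electrodes moved.
session_a = [0.1, 0.4, -0.2, 0.3, -0.6]
session_b = [2.0 * v + 0.5 for v in session_a]

norm_a = zscore(session_a)
norm_b = zscore(session_b)
```

After normalization the two sessions coincide sample by sample, so amplitude differences caused only by gain or offset no longer affect the comparison.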

FEATURE EXTRACTION
• Feature extraction reduces the dimensionality of the
data.
• It is the process of isolating the most useful
components of the data for further study while
discarding the less useful aspects.
• It reduces the number of variables that must be
examined, thereby saving time and resources.

FEATURE EXTRACTION
Mel Frequency Cepstral Coefficients (MFCC)
SPEECH SIGNAL → FAST FOURIER TRANSFORM → (spectrum) → MEL SCALE FILTERING → (Mel frequency spectrum) → LOG → DISCRETE COSINE TRANSFORM → (cepstral coefficients) → DERIVATIVES → FEATURE VECTOR
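The MFCC chain above (spectrum, mel-scale filtering, log, discrete cosine transform) can be sketched for a single frame. Several choices here are assumptions rather than details from the slides: the common mel formula 2595·log10(1 + f/700), triangular filters, 10 filters, 5 coefficients, a naive DFT in place of an FFT, no windowing, and a random 64-sample frame as input. The derivative step is omitted.

```python
import cmath
import math
import random

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (an FFT gives the same values faster)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum

def mel_filterbank(num_filters, n_fft, fs):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    top = hz_to_mel(fs / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / fs) for m in mels]
    bank = []
    for i in range(1, num_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):
            filt[k] = (k - left) / (centre - left)      # rising edge
        for k in range(centre, right):
            filt[k] = (right - k) / (right - centre)    # falling edge
        bank.append(filt)
    return bank

def mfcc(frame, fs, num_filters=10, num_coeffs=5):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), fs)
    # Log of each filter's energy; the floor avoids log(0).
    log_energies = [
        math.log(max(sum(w * p for w, p in zip(filt, spectrum)), 1e-10))
        for filt in bank
    ]
    # DCT-II of the log energies gives the cepstral coefficients.
    return [
        sum(e * math.cos(math.pi * j * (i + 0.5) / num_filters)
            for i, e in enumerate(log_energies))
        for j in range(num_coeffs)
    ]

FS = 2000.0
rng = random.Random(0)
frame = [rng.uniform(-1.0, 1.0) for _ in range(64)]  # stand-in for one signal frame
coeffs = mfcc(frame, FS)
```

One consequence of the log-then-DCT design is that a change in overall gain shifts only the first coefficient, while the remaining coefficients describe the spectral shape.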

PATTERN RECOGNITION
APPROACH
• 2 steps:
– Pattern Training
– Pattern Comparison
• Goal: to determine the identity of unknown speech according
to how well the patterns match

METHODS IN PATTERN
COMPARISON APPROACH
• Template Based Approach
• Stochastic Approach (HMM)
– Probabilistic Models
– Uncertainty and Incompleteness

TEMPLATE-BASED ASR
• Originally only worked for isolated words
• For each word we want to recognize, we store a template or
example based on actual data
• Patterns stored as dictionary of words
• Each test utterance is checked against the templates to find
the best match
• Uses the Dynamic Time Warping (DTW) algorithm

DYNAMIC TIME WARPING
• Dynamic Time Warping finds an optimal match
between two sequences of feature vectors, allowing
sections of either sequence to be stretched or
compressed.
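A minimal sketch of the Dynamic Time Warping match described above, using the classic dynamic-programming recurrence with Euclidean distance between feature vectors. The one-dimensional feature vectors, the word labels, and the helper name `recognise` are hypothetical and chosen only for illustration.

```python
import math

def dtw_distance(a, b):
    """Cost of the best alignment between two sequences of feature vectors."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognise(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))

# Hypothetical templates, one per word, stored as a dictionary.
templates = {
    "yes": [(0.0,), (1.0,), (2.0,), (1.0,), (0.0,)],
    "no": [(2.0,), (2.0,), (0.0,), (0.0,), (2.0,)],
}

# The "yes" pattern spoken more slowly: the same shape, stretched in time.
slow_yes = [(0.0,), (0.0,), (1.0,), (1.0,), (2.0,), (2.0,), (1.0,), (0.0,)]
best = recognise(slow_yes, templates)
```

The stretched utterance aligns with the "yes" template at zero cost even though the two sequences differ in length, which is the property that makes DTW suitable for template matching.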

STOCHASTIC APPROACH
• Models a system that changes over time in an uncertain manner. It
entails the use of probabilistic models to deal with uncertain or
incomplete information.
HIDDEN MARKOV MODEL
• A hidden Markov model (HMM) is a statistical model in which the
system being modeled is assumed to be a Markov process (a memoryless
process: given the present state, its future is independent of its
past) with hidden states.
• To understand the use of HMMs for speech modeling, consider the
word "again" with two possible pronunciations:
(1) Again: AX G EH N
(2) Again: AX G EY N

HIDDEN MARKOV MODEL
HMM FOR WORD AGAIN
BEGIN → AX → G → (EH or EY) → N → END
Emission probabilities: P{X/AX}, P{X/G}, P{X/EH}, P{X/EY}, P{X/N}
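The word model for "again" can be turned into a small worked example using the forward algorithm, which sums the probability of an observation sequence over all state paths. All numbers are hypothetical: the two pronunciations are given equal transition probability 0.5, each state emits its own phone label with probability 0.8 and any other label with probability 0.05, and self-loops (which a practical model would need to absorb variable durations) are left out to keep the arithmetic checkable by hand.

```python
STATES = ["AX", "G", "EH", "EY", "N"]
START = {"AX": 1.0}            # BEGIN always enters AX
TRANS = {
    "AX": {"G": 1.0},
    "G": {"EH": 0.5, "EY": 0.5},   # the two pronunciations branch here
    "EH": {"N": 1.0},
    "EY": {"N": 1.0},
    "N": {},                        # N leads to END
}
FINAL = "N"

def emission(state, symbol):
    """P{X/state}: assumed 0.8 for the matching label, 0.05 for each other label."""
    return 0.8 if symbol == state else 0.05

def forward_likelihood(observations):
    """Probability that the word model produces the observations and ends in N."""
    alpha = {s: START.get(s, 0.0) * emission(s, observations[0]) for s in STATES}
    for symbol in observations[1:]:
        alpha = {
            s: emission(s, symbol)
            * sum(alpha[p] * TRANS[p].get(s, 0.0) for p in STATES)
            for s in STATES
        }
    return alpha[FINAL]

p_eh = forward_likelihood(["AX", "G", "EH", "N"])
p_ey = forward_likelihood(["AX", "G", "EY", "N"])
p_other = forward_likelihood(["N", "G", "AX", "AX"])
```

Under these assumed numbers both pronunciations score 0.2176 (0.2048 from the matching path plus 0.0128 from the other branch), while the unrelated sequence scores 0.0001, so the model accepts either pronunciation and rejects the mismatch.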

MATCHING TECHNIQUE
WHOLE-WORD MATCHING:
» Compares the incoming digital-audio signal against a prerecorded
template.
» Requires a large amount of storage space.
SUB-WORD MATCHING:
» Looks for sub-words (usually phonemes) and then performs
further pattern recognition.
» Requires much less storage.

EMG SPEECH RECOGNIZER
• Session Dependent (SD)
• Session Independent (SI)
• Multi Sessions (MS)
• Session Adaptive Systems (SAS)

SILENT SOUND TECHNOLOGY
….Silence is the best answer for all
the situations …even your mobile
understands!

APPLICATIONS
The technology opens up a host of
applications, such as those mentioned below:
• In space there is no medium for sound
to travel through, so this technology can
be put to good use by astronauts.
• Helping people who have lost their
voice due to illness or accident.
• We can make silent calls even if we are
standing in a crowded place.

• Telling a trusted friend your PIN over
the phone without anyone eavesdropping,
assuming no lip-readers are around.
• Silent sound techniques are applied in the military
for communicating secret or confidential matters
to others.
• Since the electrical signals are universal, they
can be translated into any language. Native
speakers can have their speech translated before
it is sent to the other side. The languages
currently supported are German, English and
French.

RESTRICTIONS
• Translation works for the majority of languages, but in
languages such as Chinese different tones carry
different meanings while the facial movements stay the
same. Hence this technology is difficult to apply in
such situations.
• From a security point of view, recognising who you
are talking to gets complicated.
• It cannot differentiate between people or convey
emotions. This means you will always feel
you are talking to a robot.
• The device presently needs nine leads to be
attached to the face, which makes it quite
impractical to use.

FUTURE PROSPECTS
• Silent sound technology opens the way to a bright future for
speech recognition technology.
• Instead of hanging all around your face, the
electrodes will be incorporated into systems.
• It may have features like lip reading based on image
recognition and processing rather than
electromyography.
• Electric signals are universal, so they can be translated
into any language. Nowadays only translation
between English, French and German is available.
