Gestures and Lip Shape Integration for Cued Speech Recognition
1. Gestures and Lip Shape Integration for Cued Speech Recognition
Seminar By: Mohammed Musfir, ECE-B, 08104131
Seminar Coordinator: Mr. Rino P. C., Assistant Professor, ECE
Seminar Guide: Mr. Edet Bijoy K., Assistant Professor, ECE
5. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Overview of Presentation
Objective
Introduction
ASR Techniques
Lip Reading – AVSR
Cued Speech
Integrated Recognition
Conclusion
02/12/2011 5
6. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Objective
Developments in ASR technique
AVSR Accessibility solution
Lip Detection
Cued Speech detection
Integration of both
7. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
INTRODUCTION
8. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Briefing ASR
First successful system in 1970
Consists of two systems
ASR – Transcribes the speech
SU – Understands the transcription
Knowledge intensive
9. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
ASR TECHNIQUES
10. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
ASR Industry
Industry pioneers – Nuance, NTT Labs, AT&T Labs
MIT and GPL – VoxForge, Gvoice
Desktop dictation – 1990
Types of ASR
DVI – Word or phrase spotting
LVCSR (Large Vocabulary Continuous Speech Recognition) – Several thousand words
11. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Techniques
Sequence of sounds
ASR involves
Acquisition - Recording
Feature Extraction – Spectral analysis
Pattern matching and decoding
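The acquisition and feature-extraction steps above can be sketched roughly as follows. This is only an illustration, not the system described in the slides: the 8 kHz sample rate, the 64-sample frame, the Hamming window and the function names are all assumptions, and a synthetic tone stands in for a recording.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split the recorded samples into overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Hamming-window one frame and return |DFT| for bins 0..N/2.

    A naive O(N^2) DFT is used to keep the sketch dependency-free.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


# Acquisition stand-in: a 1 kHz tone sampled at 8 kHz (assumed values).
sample_rate = 8000
frame_len = 64
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

# Feature extraction: one spectral vector per frame.
frames = frame_signal(signal, frame_len=frame_len, hop=32)
spectrum = magnitude_spectrum(frames[0])

# The strongest bin should sit at the tone frequency.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * sample_rate / frame_len
```

Each frame's spectrum would then be passed on to the pattern-matching and decoding stage.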
12. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Techniques
13. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Approaches
Template Based
Knowledge Based
Statistical
Learning based
Artificial Intelligence
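As an illustration of the template-based approach listed above, the sketch below matches an utterance against stored word templates with dynamic time warping (DTW). DTW is a common choice for template matching, but the slides do not name it. The one-dimensional features and the word templates are made-up values.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognise(utterance, templates):
    """Return the template word with the smallest DTW distance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical templates (one feature value per frame).
templates = {
    "yes": [1.0, 3.0, 4.0, 3.0, 1.0],
    "no": [4.0, 2.0, 1.0, 2.0, 4.0],
}

# A slower rendition of "yes": same shape, some frames repeated.
utterance = [1.0, 1.0, 3.0, 4.0, 4.0, 3.0, 1.0]
best = recognise(utterance, templates)
```

Because DTW absorbs differences in speaking rate, the stretched utterance still aligns with its template at zero cost.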
14. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
LIP READING
15. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Lip Reading – AVSR
Front-end lip detection
16. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Localisation and Tracking
ROI determination – Sobel Edge Filtering
Kalman Filter – Tracking
Principal Component Analysis – Feature coefficients
Audio feature - MFCC
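The tracking step above can be illustrated with a scalar Kalman filter that smooths one coordinate of the lip ROI centre across video frames. This is a minimal sketch under a random-walk motion model. The noise variances and the measured positions are assumed values, and a real tracker would likely also model velocity and both image axes.

```python
def kalman_track(measurements, process_var=1e-2, meas_var=4.0):
    """Smooth a sequence of noisy positions with a scalar Kalman filter.

    process_var and meas_var are assumed noise variances (pixels squared).
    """
    estimate, error = float(measurements[0]), 1.0
    track = [estimate]
    for z in measurements[1:]:
        # Predict: position assumed unchanged, uncertainty grows.
        error += process_var
        # Update: blend the prediction and the measurement by the Kalman gain.
        gain = error / (error + meas_var)
        estimate += gain * (z - estimate)
        error *= (1.0 - gain)
        track.append(estimate)
    return track


# Hypothetical x-coordinates of the detected lip ROI centre, one per frame.
noisy_x = [100, 103, 98, 101, 105, 99, 102, 100]
smooth_x = kalman_track(noisy_x)
```

With a measurement variance much larger than the process variance, the filter trusts its prediction more than any single detection, so the tracked position jitters less than the raw detections.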
17. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
CUED SPEECH
18. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Overview of Cued Speech
19. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
INTEGRATION
20. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Steps
Lip feature extraction
Audio Synchronization with the Image
Multistream HMM Fusion – State-Synchronous Decision
Automatic Image Processing to record the cues
Lip Width, Aperture, Area, Upper Pinch and Lower Pinch
Modeling – 8 lip parameters and 10 hand parameters
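One way to picture state-synchronous multistream fusion: for each HMM state, the lip stream and the hand stream are scored separately and their log-likelihoods are combined with stream weights. The sketch below assumes diagonal Gaussian emissions, a single weight `lip_weight` (the hand stream gets the remainder), and toy two-dimensional streams instead of the 8 lip and 10 hand parameters. The state names and all numbers are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def fused_log_likelihood(lip_obs, hand_obs, state, lip_weight=0.5):
    """Weighted sum of per-stream log-likelihoods for one HMM state."""
    hand_weight = 1.0 - lip_weight
    return (lip_weight * log_gaussian(lip_obs, *state["lip"])
            + hand_weight * log_gaussian(hand_obs, *state["hand"]))


# Two hypothetical states; each stream is stored as (means, variances).
states = {
    "a": {"lip": ([1.0, 0.8], [0.1, 0.1]), "hand": ([0.0, 0.0], [0.2, 0.2])},
    "i": {"lip": ([1.4, 0.2], [0.1, 0.1]), "hand": ([1.0, 1.0], [0.2, 0.2])},
}

# One synchronized observation: lip features and hand features of the same frame.
lip_obs, hand_obs = [1.05, 0.75], [0.1, -0.1]
best_state = max(
    states,
    key=lambda s: fused_log_likelihood(lip_obs, hand_obs, states[s]),
)
```

In a full recognizer this fused score would replace the single-stream emission probability inside Viterbi decoding, and the stream weight would be tuned on held-out data.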
22. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
Conclusion
Cued Speech Recognition – 80% accuracy
Outperforms ASR in a normal environment
Visual mode – Education of the hearing impaired
Phoneme recognition successful
Potential for another product beyond Siri
24. GESTURE AND LIP SHAPE INTEGRATION FOR CUED SPEECH RECOGNITION
THANK YOU