Convolutional Neural Networks for Speech Controlled Prosthetic Hands
Date: 09/26/2019
2019 First International Conference on Transdisciplinary AI (TransAI)
Mohsen Jafarzadeh
Department of Electrical and Computer Engineering
The University of Texas at Dallas
Richardson, TX, USA
Mohsen.Jafarzadeh@utdallas.edu
Yonas Tadesse
Department of Mechanical Engineering
The University of Texas at Dallas
Richardson, TX, USA
Yonas.Tadesse@utdallas.edu
Content
•Introduction
•Proposed Method
•Results
•Discussion
•Conclusion
Introduction
•~94,000 upper limb amputees in Europe
• S. Micera, J. Carpaneto, and S. Raspopovic, “Control of Hand Prostheses Using Peripheral
Information,” IEEE Reviews in Biomedical Engineering, vol. 3, pp. 48–68, 2010.
•~41,000 upper limb amputees in the United States
• K. Ziegler-Graham, E. J. MacKenzie, P. L. Ephraim, T. G. Travison, and R. Brookmeyer,
“Estimating the prevalence of limb loss in the United States: 2005 to 2050,” Arch Phys Med
Rehabil, vol. 89, no. 3, pp. 422–429, Mar. 2008.
•About 40 million amputees in the world
• M. Marino et al., “Access to prosthetic devices in developing countries: Pathways and
challenges,” in Proc. IEEE Annu. Global Humanitarian Technol. Conf., 2015, pp. 45–51.
Ways to command a prosthetic hand
• Push-buttons
• Joystick
• Keyboard
• Vision
• Electroencephalography (EEG)
• Electroneurography (ENG)
• Electromyography (EMG)
• Speech
Speech commanded prosthetic hands
1. Automatic speech recognition (ASR) System
• maps speech to text
2. Look-up table
• maps text to command
3. Low-level controller & driver
• maps commands and sensor data to electrical voltages
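The second stage of this pipeline, the look-up table, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the table contents, the finger ordering, and the name `text_to_command` are all assumptions made for the example, and only a subset of the recognized words is shown.

```python
from typing import Dict, List, Optional

# Finger order (assumed): thumb, index, middle, ring, little.
# 0.0 = fully open, 1.0 = fully closed. All values are illustrative.
COMMAND_TABLE: Dict[str, List[float]] = {
    "zero": [1.0, 1.0, 1.0, 1.0, 1.0],  # closed fist
    "one": [1.0, 0.0, 1.0, 1.0, 1.0],   # index finger extended
    "two": [1.0, 0.0, 0.0, 1.0, 1.0],   # index and middle fingers extended
    "five": [0.0, 0.0, 0.0, 0.0, 0.0],  # open hand
}


def text_to_command(word: str) -> Optional[List[float]]:
    """Map recognized text to a hand command; None means 'ignore this word'."""
    return COMMAND_TABLE.get(word)


for word in ["five", "zero", "unknown"]:
    print(word, "->", text_to_command(word))
```

Returning `None` for words that are not in the table lets the low-level controller hold its current posture when the recognizer outputs something that is not a command.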
Automatic Speech Recognition (ASR) Systems
•Traditional ASR systems have 4 subsystems
• Preprocessing
• Feature extraction
• Language model
• Classifier
• Combination of Gaussian mixture model and the hidden Markov model (GMM-HMM)
• Combination of artificial neural networks and hidden Markov model (ANN-HMM)
•Recent ASR systems are end-to-end
Traditional speech commanded prosthetic hands
Related Works
•CMU Sphinx used to control a surgical robot
• K. Zinchenko, C.-Y. Wu, and K.-T. Song, “A Study on Speech Recognition Control for a Surgical Robot,” IEEE
Transactions on Industrial Informatics, vol. 13, no. 2, pp. 607–615, 2017.
•Control of a hand exoskeleton
• Combination of discrete wavelet transforms and hidden Markov models
• S. Guo, Z. Wang, J. Guo, Q. Fu, and N. Li, “Design of the Speech Control System for a Upper Limb
Rehabilitation Robot Based on Wavelet De-noising,” 2018 IEEE International Conference on Mechatronics
and Automation (ICMA), 2018.
•Control of a robotic hand
• A multi-layer perceptron
• 13 speech features (five time-domain + eight frequency-domain)
• R. Ismail, M. Ariyanto, W. Caesarendra, I. Haryanto, H. K. Dewoto, and Paryanto, “Speech control of robotic
hand augmented with 3D animation using neural network,” 2016 IEEE EMBS Conference on Biomedical
Engineering and Sciences (IECBES), 2016.
Deep learning for speech
•Relies on GPGPUs and large datasets
•Networks are very deep
•Too slow for embedded devices
• T. Tan, Y. Qian, H. Hu, Y. Zhou, W. Ding, and K. Yu, “Adaptive very deep convolutional residual network for noise robust speech recognition,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 8, pp. 1393–1405, 2018.
Embedded GPGPU
| Company | Google | NVIDIA | NVIDIA | NVIDIA |
|---|---|---|---|---|
| Model | Coral | Jetson Nano | Jetson TX2 | AGX Xavier |
| GPU | Vivante GC7000 Lite 16 core | Maxwell 128 core | Pascal 256 core | Volta 512 core |
| TPU | Google Edge | - | - | - |
| CPU | 4 core Cortex-A53 | 4 core Cortex-A57 | 4 core Cortex-A57 + 2 core Denver | 8 core Carmel |
| RAM | 1 GB | 4 GB | 8 GB | 16 GB |
| Storage | 8 GB | 16 GB | 32 GB | 32 GB |
| GFLOPS | 32 | 236 | 559 | 1300 |
| GPIO | 8 | 5 | 8 | 4 |
| USB | 1 x USB 3.0 + 1 x USB C | 4 x USB 3.0 | 1 x USB 3.0 + 1 x USB 2.0 | 2 x USB C [3.1] |
| UART | 2 | 1 | 1 | 1 |
| I2C | 2 | 2 | 4 | 2 |
| SPI | 1 with 2 CS | 2 with 2 CS | 1 with 2 CS | 1 with 2 CS |
| CAN | 0 | 0 | 1 | 1 |
| I2S | 1 | 1 | 2 | 1 |
| Size (mm) | 88 x 60 x 24 | 100 x 80 x 29 | 170 x 170 x 51 | 105 x 105 x 85 |
| Weight | 227 g | 244 g | 1.5 kg | 630 g |
| Price ($) | 150 | 100 | 400 | 700 |
Related Works
•Reducing neural network size
•by setting some weights of the network to zero
• This creates a sparse network
• Works very well on FPGAs
•by pruning neurons or even layers
•Useful but not sufficient
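As a rough illustration of the first approach (zeroing weights to obtain a sparse network), the sketch below prunes a weight matrix by magnitude. The magnitude criterion, the function name `prune_by_magnitude`, and the random example matrix are assumptions for this example; the slides do not say which pruning rule the cited works use.

```python
import numpy as np


def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `weights` with the smallest-magnitude fraction set to zero."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(round(sparsity * weights.size))  # number of weights to zero out
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value is the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


rng = np.random.default_rng(0)
w = rng.normal(size=(64, 192))          # stand-in for a dense layer's weights
w_sparse = prune_by_magnitude(w, 0.5)   # zero out half of the weights
print("zero fraction:", np.mean(w_sparse == 0.0))
```

Pruning of this kind reduces the number of non-zero weights but leaves the tensor shapes unchanged, so any speed-up depends on hardware or libraries that exploit sparsity.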
Contribution
•Control of prosthetic hands with speech input
•Using a convolutional neural network (CNN)
•Maps 2D features of speech input to text
•Without hidden Markov model
•Minimize the size of the CNN
•Real-time on an embedded GPGPU
Proposed Method
| Layer | Type | Number of filters | Filter size | Stride | Activation | Output shape | Number of parameters |
|---|---|---|---|---|---|---|---|
| 0 | Input (log of spectrogram) | - | - | - | - | 129 x 71 x 1 | 0 |
| 1 | Convolution 2D | 8 | 10 x 7 | 1 | ReLU | 120 x 65 x 8 | 568 |
| 2 | Pooling 2D | - | 7 x 5 | 1 | Max | 17 x 13 x 8 | 0 |
| 3 | Batch normalization | - | - | - | - | 17 x 13 x 8 | 32 |
| 4 | Convolution 2D | 32 | 7 x 5 | 1 | ReLU | 11 x 9 x 32 | 8992 |
| 5 | Pooling 2D | - | 5 x 3 | 1 | Max | 2 x 3 x 32 | 0 |
| 6 | Batch normalization | - | - | - | - | 2 x 3 x 32 | 128 |
| 7 | Flatten | - | - | - | - | 192 | 0 |
| 8 | Dense | - | - | - | ReLU | 64 | 12352 |
| 9 | Dropout | - | - | - | - | 64 | 0 |
| 10 | Dense | - | - | - | SoftMax | 9 | 585 |
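The output shapes and parameter counts in the layer table can be checked with simple arithmetic. The sketch below is a hand calculation, not the authors' code, and rests on two assumptions: the convolutions use no padding, and the pooling windows do not overlap (the pooling step equals the window size, which is the Keras default when no stride is given). The table lists a stride of 1 for the pooling layers, but the listed output shapes only follow from non-overlapping windows. Batch normalization is counted with four values per channel, as Keras reports it.

```python
def conv2d(shape, filters, kernel):
    """Unpadded convolution with stride 1: returns (output shape, parameter count)."""
    h, w, c = shape
    kh, kw = kernel
    out = (h - kh + 1, w - kw + 1, filters)
    params = kh * kw * c * filters + filters  # weights + biases
    return out, params


def max_pool(shape, pool):
    """Non-overlapping max pooling (assumed step equal to the window size)."""
    h, w, c = shape
    return (h // pool[0], w // pool[1], c), 0


def batch_norm(shape):
    """Scale, shift, moving mean and moving variance per channel."""
    return shape, 4 * shape[-1]


def dense(n_in, n_out):
    """Fully connected layer: weights + biases."""
    return n_out, n_in * n_out + n_out


total = 0
shape = (129, 71, 1)                      # log-spectrogram input
shape, p = conv2d(shape, 8, (10, 7));  total += p   # (120, 65, 8), 568
shape, p = max_pool(shape, (7, 5));    total += p   # (17, 13, 8)
shape, p = batch_norm(shape);          total += p   # 32
shape, p = conv2d(shape, 32, (7, 5));  total += p   # (11, 9, 32), 8992
shape, p = max_pool(shape, (5, 3));    total += p   # (2, 3, 32)
shape, p = batch_norm(shape);          total += p   # 128
flat = shape[0] * shape[1] * shape[2]               # 192
units, p = dense(flat, 64);            total += p   # 12352
units, p = dense(units, 9);            total += p   # 585

print("flattened size:", flat)
print("total parameters:", total)
```

Under these assumptions every per-layer count matches the table, and the whole network has 22,657 parameters.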
Data set
•Google Speech Commands data set
•Open-source - Creative Commons BY 4.0 license
•35 words
•105,829 utterances
•Each utterance is one second long or shorter
•WAV format files
•16 kHz sample rate with linear 16-bit single-channel PCM values
•Several minute-long recordings of various kinds of background noise
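To make the link between these one-second, 16 kHz clips and the 129 x 71 network input concrete, here is a minimal NumPy sketch of a log-spectrogram. The frame length (256 samples), hop (224 samples), Hann window, zero-padding of short clips, and the function names are assumptions chosen because they reproduce the 129 x 71 shape; the slides do not state the actual spectrogram settings. A synthetic tone stands in for a real utterance.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz, as stated for the data set


def pad_to_one_second(signal: np.ndarray, n: int = SAMPLE_RATE) -> np.ndarray:
    """Zero-pad (or truncate) a clip to exactly n samples."""
    padded = np.pad(signal, (0, max(0, n - len(signal))))
    return padded[:n]


def log_spectrogram(signal, frame_len=256, hop=224, eps=1e-10):
    """Log power spectrogram with shape (frame_len // 2 + 1, n_frames)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power.T + eps)  # eps avoids log(0) in silent frames


# Synthetic 0.75 s tone standing in for an utterance shorter than one second.
t = np.arange(12_000) / SAMPLE_RATE
clip = np.sin(2 * np.pi * 440.0 * t)
spec = log_spectrogram(pad_to_one_second(clip))
print(spec.shape)
```

With these settings a one-second clip yields 129 frequency bins and 71 frames, so the array only needs a trailing channel axis before it is fed to a 2D convolution.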
Results
• We used 8 classes (words): "zero", "one", "two", "three", "four", "five", "on", and "off".
• We grouped the rest of the words into a single class, "unknown".
• Adam optimizer
• Keras (TensorFlow back-end)
• Logarithm of spectrogram
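The class setup above (eight command words plus one "unknown" class) can be written as a small labeling helper that produces one-hot targets. The class ordering and the names `label_index` and `one_hot` are assumptions for illustration; "bed" is used only as an example of a data set word outside the eight commands.

```python
TARGET_WORDS = ["zero", "one", "two", "three", "four", "five", "on", "off"]
CLASSES = TARGET_WORDS + ["unknown"]  # 8 command words + 1 catch-all class


def label_index(word: str) -> int:
    """Index of the word's class; every non-command word maps to 'unknown'."""
    if word in TARGET_WORDS:
        return CLASSES.index(word)
    return CLASSES.index("unknown")


def one_hot(word: str) -> list:
    """One-hot target vector of length len(CLASSES)."""
    vec = [0.0] * len(CLASSES)
    vec[label_index(word)] = 1.0
    return vec


print(one_hot("three"))
print(one_hot("bed"))
```

The vector length of 9 matches the size of the final SoftMax layer.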
Discussion
• One-hot encoding
• Output vector dimension = number of words + 1 (the extra entry is the "unknown" class)
• Increasing the depth of the network has little effect on overall accuracy
• If the number of words increases significantly, the user should increase the number of filters
• Real time: ~2 ms on the NVIDIA Jetson TX2 developer kit (embedded GPGPU)
• Options for increasing speed
• NVIDIA AGX Xavier
• Using C++ and TensorRT
• WAV2LETTER++
Future works
• Investigate a CNN that distinguishes the owner’s voice from other speakers
• The proposed CNN is robust to accent, speed, noise, etc.
• By combining these two CNNs
• Unconditional and Conditional Teacher-Student training
• Instead of one-hot-coding
• Teacher (bigger network) & student (smaller network)
• Comparing different types of 2D features
• Current experiment: logarithm of speech spectrogram
• Future experiment: power-normalized cepstral coefficients (PNCC)
• CNNs with raw speech input
Conclusion
• Speech control of a prosthetic hand using a convolutional neural network (CNN)
• Without hidden Markov model (HMM)
• Proposed CNN maps 2D features to text (classes, one-hot encoding)
• Look-up table maps text to the trajectory (command) for the hand’s low-level controller and driver
• Real-time performance (~2 ms) on the NVIDIA Jetson TX2 developer kit (an embedded GPGPU)
• Accuracy of 91% in a noisy environment
• Speed can be increased with NVIDIA AGX Xavier, using either C++ with TensorRT or WAV2LETTER++
• Future work: a CNN that distinguishes the owner’s voice from other speakers, unconditional and conditional teacher-student training, comparing different types of 2D features such as PNCC, and investigating CNNs with raw speech input