Project: Micro-speech Recognition
Phase 2: Deploy to a Microcontroller
Command Recognizer: recognizes what people said (“Yes” / “No”).
Training: .wav data -> FFT -> FFT features -> trained model (command recognizer)
Inference: get .wav data -> FFT -> FFT features -> command recognizer model -> “Yes”
https://bit.ly/2XBdE4q
Overall flow of this project
audio_provider: ADC -> PCM
feature_provider: FFT and pre-processing -> audio spectrum (PopulateFeatureData), copied into the input tensor
Interpreter: Invoke() runs the CNN model; the softmax output tensor holds silence, unknown, yes, no
RecognizeCommands::ProcessLatestResults -> RespondToCommand
The audio features themselves are a two-
dimensional array, made up of horizontal slices
representing the frequencies at one point in time,
stacked on top of each other to form a spectrogram
showing how those frequencies changed over time.
How to get audio features?
Fourier Transform on sound
Frequencies in sound
The magnitude spectrum of the signal
A magnitude spectrogram is a
visualization of the frequencies
in sound over time, and can be
useful as a feature for neural
network recognition on noise or
speech.
Examine the spectrogram "audio images"
The audio spectrum represents the audio features
You can see how the 30-ms
sample window is moved
forward by 20 ms each time
until it has covered the full
one-second sample.
Feature buffer (1 second): 49 slices x 40 frequency buckets
We combine the results of running the FFT on 49 consecutive 30-ms slices of audio, and pass this into the model.
Each FFT row represents a 30-ms sample of audio split into 40 frequency buckets.
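Number of slices in the one-second sample:
$$\text{slices} = \operatorname{int}\left(\frac{\text{length} - \text{window\_size}}{\text{stride}}\right) + 1 = \operatorname{int}\left(\frac{1000 - 30}{20}\right) + 1 = 49$$
The 49 slices cover $30 + 48 \times 20 = 990$ ms.
Each slice is produced by running an FFT across a 30-ms section of the audio sample data.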
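The slice count can be checked with a few lines of C++. This is a minimal sketch of the arithmetic only; the function names `NumSlices` and `CoveredMs` are illustrative and do not come from the example's source.

```cpp
#include <cassert>

// Number of analysis windows that fit in `length_ms` of audio when a window
// of `window_ms` is advanced by `stride_ms` each step. Integer division
// truncates, which matches the int(...) in the formula.
int NumSlices(int length_ms, int window_ms, int stride_ms) {
  assert(stride_ms > 0 && length_ms >= window_ms);
  return (length_ms - window_ms) / stride_ms + 1;
}

// Time covered by `slices` windows: the first window plus one stride for
// every additional window.
int CoveredMs(int slices, int window_ms, int stride_ms) {
  assert(slices >= 1);
  return window_ms + (slices - 1) * stride_ms;
}
```

With the values from the slide, `NumSlices(1000, 30, 20)` gives 49 and `CoveredMs(49, 30, 20)` gives 990.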
Audio Recognition Model (CNN Model)
1 second of audio = a 40x49-pixel image
CNN model output: silence, unknown, yes, no
Our Model
Input: shape (1, 49, 40, 1), type int8; the 1-second audio spectrogram (49x40)
Input size: (1 x 49 x 40) x 1 byte = 1960 bytes
Output: shape (1, 4), type int8 (-128 to 127)
Output index: 0 = silence, 1 = unknown, 2 = yes, 3 = no
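As a rough illustration of these shapes, the sketch below computes the input size and picks the highest-scoring category from a (1, 4) int8 output. The names `kInputBytes`, `kLabels` and `TopCategory` are hypothetical, and the label order (silence, unknown, yes, no) is assumed from the order in which the slides list the outputs.

```cpp
#include <cassert>
#include <cstdint>

// Tensor dimensions as given on the slide.
constexpr int kSliceCount = 49;  // rows: time slices
constexpr int kSliceSize = 40;   // columns: frequency buckets
constexpr int kInputBytes =
    kSliceCount * kSliceSize * static_cast<int>(sizeof(int8_t));
constexpr int kCategoryCount = 4;

// Assumed label order: 0 silence, 1 unknown, 2 yes, 3 no.
const char* const kLabels[kCategoryCount] = {"silence", "unknown", "yes", "no"};

// Returns the index of the highest int8 score in the output tensor.
int TopCategory(const int8_t* scores) {
  assert(scores != nullptr);
  int best = 0;
  for (int i = 1; i < kCategoryCount; ++i) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}
```

A single top score is noisy on streaming audio, which is why the example averages results over time before reporting a command.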
Project File Structure
tensorflow/lite/micro/examples/micro_speech
main_functions.cc: main program for the TensorFlow Lite framework
recognize_commands.cc: post-processes the inference results
micro_features/model.cc: the TFLite model
XXXX_test.cc: files whose names end in _test.cc are test programs that can be run on the development host
arduino, sparkfun_edge, zephyr_riscv, ..: directories holding hardware-specific files; if TARGET=XXX is specified at build time, the files in that directory replace the original files
├── sparkfun_edge
|   ├── command_responder.cc
|   └── audio_provider.cc (GetAudioSamples())
├── micro_features (GenerateMicroFeatures())
Project Flow (program flow)
sparkfun_edge/audio_provider.cc: ADC -> PCM, read with GetAudioSamples()
GenerateMicroFeatures(): performs the FFT and returns the audio frequency information (the audio spectrum).
feature_provider.cc, FeatureProvider::PopulateFeatureData: fills the 1-second window that becomes the model input
kFeatureSliceCount = 49, kFeatureSliceSize = 40, kFeatureElementCount = 49 x 40
The Feature Provider
main_functions.cc, feature_provider.cc
The feature provider converts raw audio, obtained from the audio provider, into spectrograms that can be fed into our model. It is called during the main loop.
FeatureProvider::PopulateFeatureData(): fills the feature data with information from audio inputs, and returns how many feature slices were updated.
PopulateFeatureData() (feature_provider.cc)
Each inference uses one second of audio, but the FFT does not have to be recomputed for the whole window every time. It is computed only for the new audio slices, which saves computation and time.
For each new slice in the 1-second window, it first requests audio for that slice from the audio provider using GetAudioSamples(), and then calls GenerateMicroFeatures() to perform the FFT and return the audio frequency information.
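The incremental update described above can be sketched as follows. This is a simplified illustration and not the code in feature_provider.cc: `UpdateFeatures` and `FakeGenerateSlice` are hypothetical names, and the FFT step is replaced by a stub that writes a marker value so the data movement is easy to follow.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr int kSliceCount = 49;  // slices in the one-second window
constexpr int kSliceSize = 40;   // frequency buckets per slice
constexpr int kStrideMs = 20;    // a new slice starts every 20 ms

// Hypothetical stand-in for the FFT step: fills one slice with a marker.
// A real implementation would call GetAudioSamples() and
// GenerateMicroFeatures() here.
void FakeGenerateSlice(int step, int8_t* slice) {
  std::memset(slice, step % 128, kSliceSize);
}

// Updates `features` (kSliceCount x kSliceSize, oldest slice first) for the
// audio between last_ms and now_ms. Returns how many slices were recomputed.
int UpdateFeatures(int last_ms, int now_ms, bool first_run, int8_t* features) {
  assert(now_ms >= last_ms && features != nullptr);
  const int last_step = last_ms / kStrideMs;
  const int now_step = now_ms / kStrideMs;
  int needed = first_run ? kSliceCount : now_step - last_step;
  if (needed > kSliceCount) needed = kSliceCount;
  if (needed == 0) return 0;
  const int kept = kSliceCount - needed;
  // Slide the slices that are still valid toward the start of the buffer.
  std::memmove(features, features + needed * kSliceSize, kept * kSliceSize);
  // Compute only the slices that are new.
  for (int i = 0; i < needed; ++i) {
    const int step = now_step - needed + 1 + i;
    FakeGenerateSlice(step, features + (kept + i) * kSliceSize);
  }
  return needed;
}
```

After the first call fills all 49 slices, a call 40 ms later recomputes only 2 slices and reuses the other 47.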
feature_provider.cc: audio_samples (audio_samples_size: 512) -> FFT -> feature_data_ (1-second window)
micro_features/micro_model_settings.h
The Audio Provider (sparkfun_edge/audio_provider.cc)
GetAudioSamples() is expected to return an array of 14-bit pulse-code modulated (PCM) audio data.
audio_samples (size: 512) is passed to the FFT; a new slice starts every 20 ms (20 ms, 40 ms, 60 ms, 80 ms, 100 ms, ...).
Digital audio format: 14-bit PCM (Pulse-Code Modulation)
kAudioSampleFrequency = 16 kHz
Audio sample rate = 16000 samples/second = 16 samples per 1 ms
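The conversion from milliseconds to samples follows directly from the 16 kHz rate. A minimal sketch, where `MsToSamples` is an illustrative helper rather than a function from the example:

```cpp
#include <cassert>

constexpr int kAudioSampleFrequency = 16000;  // samples per second (slide)
constexpr int kMaxAudioSampleSize = 512;      // sample buffer size (slide)

// Converts a duration in milliseconds to a sample count at 16 kHz.
int MsToSamples(int duration_ms) {
  assert(duration_ms >= 0);
  return duration_ms * (kAudioSampleFrequency / 1000);
}
```

A 30-ms window is `MsToSamples(30)` = 480 samples, which fits in the 512-sample buffer.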
Generating the Sample Rate for the ADC
Trigger frequency: am_hal_ctimer_period_set(3, AM_HAL_CTIMER_TIMERA, 750, 0);
12 MHz / 750 = 16 kHz (sampling rate)
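The period value 750 is just the clock frequency divided by the desired trigger rate. The helper below only reproduces that arithmetic; `TimerPeriod` is a hypothetical name and it makes no claim about how the Ambiq HAL interprets the value internally.

```cpp
#include <cassert>
#include <cstdint>

// Number of clock ticks between triggers for a timer clocked at `clock_hz`
// that should fire at `rate_hz`. Assumes the clock is an exact multiple of
// the rate, as it is for 12 MHz and 16 kHz.
uint32_t TimerPeriod(uint32_t clock_hz, uint32_t rate_hz) {
  assert(rate_hz > 0 && clock_hz % rate_hz == 0);
  return clock_hz / rate_hz;
}
```

`TimerPeriod(12000000, 16000)` gives 750, the value passed to `am_hal_ctimer_period_set` on the slide.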
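audio_provider.cc hardware path: MIC0 and MIC1 (GPIO11/ADC2, GPIO29/ADC1) -> 14-bit ADC, triggered by Timer A3 (12 MHz) -> FIFO -> DMA -> 32K SRAM
The ADC is set up in repeat scan mode, so the timer triggers it periodically.
Each FIFO entry holds the slot number plus the sampling data.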
Microphone inputs: GPIO29/ADC1, GPIO11/ADC2
The channel select bit field specifies which one of the analog multiplexer channels will be used for the conversions requested for an individual slot.
When each active slot obtains a sample from the ADC, it is added to the value in its accumulator.
All slots write their accumulated results to the FIFO.
sparkfun_edge/audio_provider.cc: data flow behind GetAudioSamples()
Audio data is moved from the ADC into g_ui32ADCSampleBuffer0[kAdcSampleBufferSize] and g_ui32ADCSampleBuffer1[kAdcSampleBufferSize] by DMA transfer (ui32TargetAddress). Each ADC entry carries ui32Slot and ui32Sample (Slot 1 + Slot 2).
kAdcSampleBufferSize = 2 slots * 1024 samples per slot
When the ADC interrupt occurs, the samples (size: kAdcSampleBufferSize) are copied into g_audio_capture_buffer (16000 samples):
g_audio_capture_buffer[g_audio_capture_buffer_start] = temp.ui32Sample;
GetAudioSamples(int start_ms, int duration_ms) copies duration_ms of audio (30 ms of PCM audio data) from g_audio_capture_buffer into g_audio_output_buffer (512 samples).
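GetAudioSamples() copies the samples between start_ms and start_ms + duration_ms from g_audio_capture_buffer (16000 samples) into g_audio_output_buffer[kMaxAudioSampleSize].
kMaxAudioSampleSize = 512 (a power of two)
How the time stamp is calculated: each time the ISR fires, the time stamp is incremented by 1. 16 ISRs mean that about 16 * 1000 samples have been read, so roughly 1 second has elapsed (16000 samples at 16 kHz).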
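The copy from the capture buffer into the output buffer can be sketched like this. It is a simplified illustration, not the code in sparkfun_edge/audio_provider.cc: `CopyAudioWindow` and `kCaptureBufferSize` are hypothetical names, and the capture buffer is assumed to be a circular buffer holding 16000 samples.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kAudioSampleFrequency = 16000;  // 16 kHz, from the slide
constexpr int kCaptureBufferSize = 16000;     // assumed: one second of samples
constexpr int kMaxAudioSampleSize = 512;      // output buffer size (slide)

// Copies `duration_ms` of audio starting at `start_ms` from a circular
// capture buffer into `output`. Returns the number of samples copied.
int CopyAudioWindow(const int16_t* capture, int start_ms, int duration_ms,
                    int16_t* output) {
  const int samples_per_ms = kAudioSampleFrequency / 1000;  // 16
  const int start = start_ms * samples_per_ms;
  const int count = duration_ms * samples_per_ms;
  assert(start >= 0 && count >= 0 && count <= kMaxAudioSampleSize);
  for (int i = 0; i < count; ++i) {
    // The modulo wraps reads around the end of the one-second buffer.
    output[i] = capture[(start + i) % kCaptureBufferSize];
  }
  return count;
}
```

A 30-ms request copies 480 samples, so the 512-sample output buffer is never overrun.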
One Problem: Audio is live streaming
Only part of the word “yes” may be captured in our window.
Interpreter Invoke() -> CNN model -> softmax output tensor (silence, unknown, yes, no) -> RecognizeCommands::ProcessLatestResults -> RespondToCommand
RecognizeCommands (recognize_commands.cc) parameters:
The length of the averaging window (average_window_duration_ms)
The minimum average score that counts as a detection (detection_threshold)
The amount of time we’ll wait after hearing a command before recognizing a second one (suppression_ms)
The minimum number of inferences required in the window for a result to count (3)
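The four parameters above fit together roughly as in the sketch below. This is a simplified illustration under stated assumptions, not the RecognizeCommands class itself: `CommandSmoother`, `Result` and `Process` are hypothetical names, scores are compared as raw int8 averages, and suppression is applied to any detection regardless of its label.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

constexpr int kCategoryCount = 4;  // silence, unknown, yes, no

// One inference result: when it was produced and the int8 score per category.
struct Result {
  int32_t time_ms;
  int8_t scores[kCategoryCount];
};

class CommandSmoother {
 public:
  CommandSmoother(int32_t average_window_duration_ms,
                  int32_t detection_threshold, int32_t suppression_ms,
                  int minimum_count)
      : window_ms_(average_window_duration_ms),
        threshold_(detection_threshold),
        suppression_ms_(suppression_ms),
        minimum_count_(minimum_count) {}

  // Adds one inference result. Returns the detected category index, or -1
  // when no new command is reported.
  int Process(const Result& latest) {
    assert(history_.empty() || latest.time_ms >= history_.back().time_ms);
    history_.push_back(latest);
    // Drop results that have fallen out of the averaging window.
    while (history_.front().time_ms < latest.time_ms - window_ms_) {
      history_.pop_front();
    }
    // Too few inferences in the window: not enough evidence yet.
    if (static_cast<int>(history_.size()) < minimum_count_) return -1;

    // Average each category over the window and keep the best one.
    int best = 0;
    int32_t best_average = 0;
    for (int c = 0; c < kCategoryCount; ++c) {
      int32_t sum = 0;
      for (const Result& r : history_) sum += r.scores[c];
      const int32_t average = sum / static_cast<int32_t>(history_.size());
      if (c == 0 || average > best_average) {
        best = c;
        best_average = average;
      }
    }
    if (best_average < threshold_) return -1;
    // Stay quiet for suppression_ms after the previous detection.
    if (has_detection_ &&
        latest.time_ms - last_detection_ms_ < suppression_ms_) {
      return -1;
    }
    has_detection_ = true;
    last_detection_ms_ = latest.time_ms;
    return best;
  }

 private:
  int32_t window_ms_;
  int32_t threshold_;
  int32_t suppression_ms_;
  int minimum_count_;
  bool has_detection_ = false;
  int32_t last_detection_ms_ = 0;
  std::deque<Result> history_;
};
```

Averaging over the window is what handles the live-streaming problem: a word that is only partly inside one window still accumulates score across several consecutive inferences.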
Hands-on
Generate the firmware image micro_speech_wire.bin
Flash the image to the board
https://drive.google.com/drive/folders/1FhkMDQ5xZoQS8GLkPZJPoVvT3dD3pk3g
Study
tensorflow/lite/micro/examples/micro_speech
main_functions.cc
feature_provider.cc
recognize_commands.cc
sparkfun_edge/command_responder.cc
Demo
Open a terminal (baud rate: 115200 bps); the terminal will output the following messages.
After the SparkFun Edge is powered over USB, a blue light blinks continuously, which means the board is waiting for voice input.

TinyML - 4 speech recognition
