SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Downloaden Sie, um offline zu lesen
Sample based generative models
for speech synthesis
Дмитро Бєлєвцов @ IBDI
Frame-based business
1.Split the waveform into overlapping frames
Frame-based business
1.Split the waveform into overlapping frames
2.Extract spectral features from each frame
Frame-based business
1.Split the waveform into overlapping frames
2.Extract spectral features from each frame
3.Model the distribution of these parameters
Frame-based business
1.Split the waveform into overlapping frames
2.Extract spectral features from each frame
3.Model the distribution of these parameters
4.Generate parameters
Frame-based business
1.Split the waveform into overlapping frames
2.Extract spectral features from each frame
3.Model the distribution of these parameters
4.Generate parameters
5.Convert parameters back to the waveform
Frame-based business
● 100x lower time resolution
● Phase-invariant
● Naturally motivated
● Highly compressed
● Separated from pitch
Pros:
Frame-based business
● 100x lower time resolution
● Phase-invariant
● Naturally motivated
● Highly compressed
● Separated from pitch
Pros:
● Highly compressed
● Synthesis introduces
unnaturalness
Cons:
WaveNet
WaveNet
● Deep
WaveNet
● Deep
● Residual
WaveNet
● Deep
● Residual
● Convolutional
WaveNet
● Deep
● Residual
● Convolutional
● Sample-based
WaveNet
● Deep
● Residual
● Convolutional
● Sample-based
● Probabilistic
WaveNet
● Deep
● Residual
● Convolutional
● Sample-based
● Probabilistic
● Conditional
WaveNet
● Deep
● Residual
● Convolutional
● Sample-based
● Probabilistic
● Conditional
● Generative
WaveNet
● Deep
● Residual
● Convolutional
● Sample-based
● Probabilistic
● Conditional
● Generative
● Auto-regressive
WaveNet
● Deep
● Residual
● Convolutional
● Sample-based
● Probabilistic
● Conditional
● Generative
● Auto-regressive
How does it work?
dilated causal convolutions
How does it work?
Trained like a CNN
Generates like an RNN
(with limited memory)
So how is it?
● Direct waveform generation
● State-of-the-art timbre quality
● CNN-like training
Pros:
So how is it?
● Direct waveform generation
● State-of-the-art timbre quality
● CNN-like training
Pros:
● Slow generation (40x
slower than realtime on
commodity CPU) *
● Sensitive to local condition
● Large memory footprint
● Hard to interpret
● Missing details
Cons:
Top layer activation
SampleRNN
SampleRNN
SampleRNN
SampleRNN
● Direct waveform generation
● Great long-range
dependencies modelling
● Reference impl. available
● Clear distinction between slow
and fast time scales
Pros:
● Training RNN on very long
sequences can be tricky
● ???
Cons:
Papers to check out
● WaveNet: A Generative Model for Raw Audio (Oord et al. 2016)
● Fast Wavenet Generation Algorithm (Paine et al. 2016)
● Deep Voice: Real-time Neural Text-to-Speech (Arik et al. 2017)
● A Neural Parametric Singing Synthesizer (Blaauw et al. 2017)
● SamplerRNN: An Unconditional End-To-End Neural Audio
Generation Model (Mehri et al. 2017)
● Char2wav: End-To-End Speech Synthesis (Sotelo et al. 2017)
Thanks!

Weitere ähnliche Inhalte

Ähnlich wie DataScienceLab2017_Блиц-доклад

Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...
Alpen-Adria-Universität
 
Hsff 6months
Hsff 6monthsHsff 6months
Hsff 6months
matrixphagwara
 
vlsi training in chennai
vlsi training in chennaivlsi training in chennai
vlsi training in chennai
matrixphagwara
 
vlsi training in pune
vlsi training in punevlsi training in pune
vlsi training in pune
matrixphagwara
 

Ähnlich wie DataScienceLab2017_Блиц-доклад (20)

REAL-TIME PIPELINE BATCH INTERFACE DETECTION & TRANSMIX REDUCTION
REAL-TIME PIPELINE BATCH INTERFACE DETECTION & TRANSMIX REDUCTIONREAL-TIME PIPELINE BATCH INTERFACE DETECTION & TRANSMIX REDUCTION
REAL-TIME PIPELINE BATCH INTERFACE DETECTION & TRANSMIX REDUCTION
 
Speaker Dependent WaveNet Vocoder
Speaker Dependent WaveNet VocoderSpeaker Dependent WaveNet Vocoder
Speaker Dependent WaveNet Vocoder
 
Plots spectroscope sasta presentation 2013
Plots spectroscope sasta presentation 2013Plots spectroscope sasta presentation 2013
Plots spectroscope sasta presentation 2013
 
Convolutional Neural Network and RNN for OCR problem.
Convolutional Neural Network and RNN for OCR problem.Convolutional Neural Network and RNN for OCR problem.
Convolutional Neural Network and RNN for OCR problem.
 
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Ne...
 
Taras Sereda "Waveglow. Generative modeling for audio synthesis"
Taras Sereda "Waveglow. Generative modeling for audio synthesis"Taras Sereda "Waveglow. Generative modeling for audio synthesis"
Taras Sereda "Waveglow. Generative modeling for audio synthesis"
 
Analysis Of Ofdm Parameters Using Cyclostationary Spectrum Sensing
Analysis Of Ofdm Parameters Using Cyclostationary Spectrum SensingAnalysis Of Ofdm Parameters Using Cyclostationary Spectrum Sensing
Analysis Of Ofdm Parameters Using Cyclostationary Spectrum Sensing
 
Design in Motion: Video Production Workflow
Design in Motion: Video Production WorkflowDesign in Motion: Video Production Workflow
Design in Motion: Video Production Workflow
 
Renice Nand Flash Analyzer NFA100 introduction
Renice Nand Flash Analyzer NFA100 introductionRenice Nand Flash Analyzer NFA100 introduction
Renice Nand Flash Analyzer NFA100 introduction
 
Crash course on data streaming (with examples using Apache Flink)
Crash course on data streaming (with examples using Apache Flink)Crash course on data streaming (with examples using Apache Flink)
Crash course on data streaming (with examples using Apache Flink)
 
New
NewNew
New
 
Hsff 6months
Hsff 6monthsHsff 6months
Hsff 6months
 
Hsff training in bangalore
Hsff training in bangaloreHsff training in bangalore
Hsff training in bangalore
 
6 weeks summer training in hfss,jalandhar
6 weeks summer training in hfss,jalandhar6 weeks summer training in hfss,jalandhar
6 weeks summer training in hfss,jalandhar
 
6 weeks summer training in hfss,ludhiana
6 weeks summer training in hfss,ludhiana6 weeks summer training in hfss,ludhiana
6 weeks summer training in hfss,ludhiana
 
vlsi training in chennai
vlsi training in chennaivlsi training in chennai
vlsi training in chennai
 
vlsi training in pune
vlsi training in punevlsi training in pune
vlsi training in pune
 
data mining training in chennai
data mining training in chennaidata mining training in chennai
data mining training in chennai
 
6months industrial training in hfss, ludhiana
6months industrial training in hfss, ludhiana6months industrial training in hfss, ludhiana
6months industrial training in hfss, ludhiana
 
6months industrial training in hfss, jalandhar
6months industrial training in hfss, jalandhar6months industrial training in hfss, jalandhar
6months industrial training in hfss, jalandhar
 

Mehr von GeeksLab Odessa

DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...
DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...
DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...
GeeksLab Odessa
 
DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...
DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...
DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...
GeeksLab Odessa
 
DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...
DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...
DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...
GeeksLab Odessa
 
DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...
DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...
DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...
GeeksLab Odessa
 
JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...
JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...
JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...
GeeksLab Odessa
 

Mehr von GeeksLab Odessa (20)

DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...
DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...
DataScience Lab2017_Коррекция геометрических искажений оптических спутниковых...
 
DataScience Lab 2017_Kappa Architecture: How to implement a real-time streami...
DataScience Lab 2017_Kappa Architecture: How to implement a real-time streami...DataScience Lab 2017_Kappa Architecture: How to implement a real-time streami...
DataScience Lab 2017_Kappa Architecture: How to implement a real-time streami...
 
DataScience Lab 2017_Блиц-доклад_Турский Виктор
DataScience Lab 2017_Блиц-доклад_Турский ВикторDataScience Lab 2017_Блиц-доклад_Турский Виктор
DataScience Lab 2017_Блиц-доклад_Турский Виктор
 
DataScience Lab 2017_Обзор методов детекции лиц на изображение
DataScience Lab 2017_Обзор методов детекции лиц на изображениеDataScience Lab 2017_Обзор методов детекции лиц на изображение
DataScience Lab 2017_Обзор методов детекции лиц на изображение
 
DataScienceLab2017_Сходство пациентов: вычистка дубликатов и предсказание про...
DataScienceLab2017_Сходство пациентов: вычистка дубликатов и предсказание про...DataScienceLab2017_Сходство пациентов: вычистка дубликатов и предсказание про...
DataScienceLab2017_Сходство пациентов: вычистка дубликатов и предсказание про...
 
DataScienceLab2017_Блиц-доклад
DataScienceLab2017_Блиц-докладDataScienceLab2017_Блиц-доклад
DataScienceLab2017_Блиц-доклад
 
DataScienceLab2017_Блиц-доклад
DataScienceLab2017_Блиц-докладDataScienceLab2017_Блиц-доклад
DataScienceLab2017_Блиц-доклад
 
DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...
DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...
DataScienceLab2017_Cервинг моделей, построенных на больших данных с помощью A...
 
DataScienceLab2017_BioVec: Word2Vec в задачах анализа геномных данных и биоин...
DataScienceLab2017_BioVec: Word2Vec в задачах анализа геномных данных и биоин...DataScienceLab2017_BioVec: Word2Vec в задачах анализа геномных данных и биоин...
DataScienceLab2017_BioVec: Word2Vec в задачах анализа геномных данных и биоин...
 
DataScienceLab2017_Data Sciences и Big Data в Телекоме_Александр Саенко
DataScienceLab2017_Data Sciences и Big Data в Телекоме_Александр Саенко DataScienceLab2017_Data Sciences и Big Data в Телекоме_Александр Саенко
DataScienceLab2017_Data Sciences и Big Data в Телекоме_Александр Саенко
 
DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...
DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...
DataScienceLab2017_Высокопроизводительные вычислительные возможности для сист...
 
DataScience Lab 2017_Мониторинг модных трендов с помощью глубокого обучения и...
DataScience Lab 2017_Мониторинг модных трендов с помощью глубокого обучения и...DataScience Lab 2017_Мониторинг модных трендов с помощью глубокого обучения и...
DataScience Lab 2017_Мониторинг модных трендов с помощью глубокого обучения и...
 
DataScience Lab 2017_Кто здесь? Автоматическая разметка спикеров на телефонны...
DataScience Lab 2017_Кто здесь? Автоматическая разметка спикеров на телефонны...DataScience Lab 2017_Кто здесь? Автоматическая разметка спикеров на телефонны...
DataScience Lab 2017_Кто здесь? Автоматическая разметка спикеров на телефонны...
 
DataScience Lab 2017_From bag of texts to bag of clusters_Терпиль Евгений / П...
DataScience Lab 2017_From bag of texts to bag of clusters_Терпиль Евгений / П...DataScience Lab 2017_From bag of texts to bag of clusters_Терпиль Евгений / П...
DataScience Lab 2017_From bag of texts to bag of clusters_Терпиль Евгений / П...
 
DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...
DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...
DataScience Lab 2017_Графические вероятностные модели для принятия решений в ...
 
DataScienceLab2017_Оптимизация гиперпараметров машинного обучения при помощи ...
DataScienceLab2017_Оптимизация гиперпараметров машинного обучения при помощи ...DataScienceLab2017_Оптимизация гиперпараметров машинного обучения при помощи ...
DataScienceLab2017_Оптимизация гиперпараметров машинного обучения при помощи ...
 
DataScienceLab2017_Как знать всё о покупателях (или почти всё)?_Дарина Перемот
DataScienceLab2017_Как знать всё о покупателях (или почти всё)?_Дарина Перемот DataScienceLab2017_Как знать всё о покупателях (или почти всё)?_Дарина Перемот
DataScienceLab2017_Как знать всё о покупателях (или почти всё)?_Дарина Перемот
 
JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...
JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...
JS Lab 2017_Mapbox GL: как работают современные интерактивные карты_Владимир ...
 
JS Lab2017_Под микроскопом: блеск и нищета микросервисов на node.js
JS Lab2017_Под микроскопом: блеск и нищета микросервисов на node.js JS Lab2017_Под микроскопом: блеск и нищета микросервисов на node.js
JS Lab2017_Под микроскопом: блеск и нищета микросервисов на node.js
 
JS Lab2017_Redux: время двигаться дальше?_Екатерина Лизогубова
JS Lab2017_Redux: время двигаться дальше?_Екатерина ЛизогубоваJS Lab2017_Redux: время двигаться дальше?_Екатерина Лизогубова
JS Lab2017_Redux: время двигаться дальше?_Екатерина Лизогубова
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

DataScienceLab2017_Блиц-доклад