Copyright © DeNA Co.,Ltd. All Rights Reserved.
Strictly confidential
• Fundamental frequency (F0)
• Band aperiodicity (bap)
• → Vocoder
• STRAIGHT [Kawahara+; ’99]
• WORLD [Morise+; ’16]
Vocoder
[Figure: vocoder output per frame: F0, bap]
[Abe+; ’90][Stylianou+; ’98]
[Figure: GMM / DNN conversion model; feature labels F0, bap]
Parallel-data
[Figure: parallel data, frame by frame]
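Frame-level parallel data requires pairing each source frame with a target frame. A minimal sketch of one common way to do this, dynamic time warping (DTW), follows. The slide does not name the alignment method, so DTW, the scalar frames, and the function name dtw_path are all assumptions made for illustration.

```python
def dtw_path(source, target):
    """Align two sequences of scalar frames; return a list of (i, j) index pairs."""
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j]: cheapest accumulated distance aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(source[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last frame pair to recover the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path
```

With source [1, 2, 3] and target [1, 2, 2, 3], source frame 1 is paired with both target frames 1 and 2 (0-based), which is how sequences of different lengths end up with frame-level pairs.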
Voice Conversion Challenge 2016
Results of listening tests in VCC 2016
cf. http://vc-challenge.org/vcc2016/summary.html
CycleGAN [Zhu+; ’17]
cf. https://junyanz.github.io/CycleGAN/
[Figure [Kaneko+; ’17]: Forward-inverse mapping and Inverse-forward mapping between GX→Y and GY→X; real/fake loss and mapping loss]
GX→Y adversarial loss
$\mathcal{L}_{adv}(G_{X \to Y}, D_Y) = \mathbb{E}_{y \sim P_{Data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim P_{Data}(x)}[\log(1 - D_Y(G_{X \to Y}(x)))]$
GY→X adversarial loss
Cycle-consistency loss: L1 loss
λcyc = 10.0
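How the adversarial terms and the L1 cycle term combine can be sketched as below, assuming the standard CycleGAN objective (both adversarial losses plus λcyc times the cycle-consistency loss). The scalar "networks" g_xy, g_yx, d_x, d_y are hypothetical toy stand-ins, not the models from the slides; only λcyc = 10.0 comes from the slide.

```python
import math

LAMBDA_CYC = 10.0  # weight of the cycle-consistency term (value from the slide)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mean(values):
    return sum(values) / len(values)

# Hypothetical toy networks on scalars, standing in for real models.
def g_xy(x):
    return 2.0 * x + 1.0          # generator X -> Y

def g_yx(y):
    return (y - 1.0) / 2.0        # generator Y -> X (exact inverse of g_xy here)

def d_y(y):
    return sigmoid(y - 1.0)       # discriminator for domain Y

def d_x(x):
    return sigmoid(-x)            # discriminator for domain X

def adversarial_loss(disc, gen, real_target, real_source):
    """E[log D(real)] + E[log(1 - D(G(source)))]."""
    real_term = mean([math.log(disc(t)) for t in real_target])
    fake_term = mean([math.log(1.0 - disc(gen(s))) for s in real_source])
    return real_term + fake_term

def cycle_loss(xs, ys):
    """L1 distance after forward-inverse and inverse-forward mapping."""
    forward = mean([abs(g_yx(g_xy(x)) - x) for x in xs])
    backward = mean([abs(g_xy(g_yx(y)) - y) for y in ys])
    return forward + backward

def full_objective(xs, ys):
    return (adversarial_loss(d_y, g_xy, ys, xs)
            + adversarial_loss(d_x, g_yx, xs, ys)
            + LAMBDA_CYC * cycle_loss(xs, ys))
```

Because the toy generators are exact inverses of each other, the cycle term is zero here; with real networks it penalizes any mapping that cannot be undone.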
CycleGAN parallel-data-free [Kaneko+; ’17]
[Figure: per-frame feature sequences at t1, t2, ..., tT, with rows labeled F0 and bap; labels: CycleGAN, copy]
Network architecture
Variational autoencoder (VAE) [Hinton+; '06]
[Figure: Input feature x → Encoder qθ(z|X) → z → Decoder pθ(X|z) → Generated feature; N(0, 1)]
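The sampling step between encoder and decoder can be sketched as follows, assuming a diagonal Gaussian encoder output (mean and log-variance per dimension) and a standard normal N(0, 1) prior. The slide only shows the encoder, z, and the decoder, so the Gaussian parameterization and both function names are assumptions.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return sum(0.5 * (m * m + math.exp(lv) - 1.0 - lv)
               for m, lv in zip(mu, log_var))
```

The KL term is zero only when the encoder output already equals the prior (mu = 0, log_var = 0); in training it is added to the reconstruction error of the decoder.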
VQ-VAE [van den Oord+; ’17]
[Figure: x → Encoder p(ze(x)|x) → ze(x) → codebook e1, e2, e3, ..., eK → zq(x) → Decoder p(x|zq(x)); loss labels: LQ loss, VQ loss, Encoder loss]
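The step from ze(x) to zq(x) is a nearest-neighbor lookup in the codebook e1, ..., eK. A minimal sketch, assuming squared Euclidean distance and plain Python lists as vectors; the function names and the default beta = 0.25 are illustrative assumptions, not values taken from the slide.

```python
def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def quantize(z_e, codebook):
    """Return (k, e_k): index and vector of the codebook entry nearest to z_e."""
    k = min(range(len(codebook)),
            key=lambda j: squared_distance(z_e, codebook[j]))
    return k, codebook[k]

def vq_terms(z_e, e_k, beta=0.25):
    """Numeric values of the VQ loss and the encoder (commitment) loss.

    Both are squared distances between z_e and e_k. In training they differ
    in which side is held constant (stop-gradient), which plain Python
    numbers cannot express.
    """
    dist = squared_distance(z_e, e_k)
    return dist, beta * dist
```

For a codebook [[0, 0], [1, 1], [4, 4]] and ze(x) = [0.9, 1.2], the lookup selects index 1, so zq(x) = [1, 1].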
VQ-VAE
• WaveNet [van den Oord+; ’16]
$p(\mathbf{x} \mid \lambda) = \prod_{t} p(x_t \mid x_{t-R}, x_{t-R+1}, \dots, x_{t-1}, \lambda)$
λ: conditioning variable; R: number of past samples in the context
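The autoregressive factorization used by WaveNet, a product of per-sample conditionals given the preceding samples and a conditioning input, can be sketched as below. The function toy_conditional is a hypothetical stand-in for the network, and the integer-valued samples, receptive_field, and num_classes are assumptions for illustration.

```python
import math

def toy_conditional(context, cond, num_classes):
    """Hypothetical model: a softmax that favors (last sample + cond) mod num_classes."""
    last = context[-1] if context else 0
    favored = (last + cond) % num_classes
    logits = [1.0 if c == favored else 0.0 for c in range(num_classes)]
    norm = sum(math.exp(v) for v in logits)
    return [math.exp(v) / norm for v in logits]

def sequence_log_prob(samples, cond, receptive_field, num_classes):
    """log p(x | cond) = sum over t of log p(x_t | preceding samples, cond)."""
    total = 0.0
    for t, x_t in enumerate(samples):
        # Only the most recent `receptive_field` samples are visible to the model.
        context = samples[max(0, t - receptive_field):t]
        probs = toy_conditional(context, cond, num_classes)
        total += math.log(probs[x_t])
    return total
```

Summing the log-conditionals is the same as taking the log of the product, and because each conditional is normalized, the probabilities of all possible sequences of a given length add up to 1.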
VQ-VAE [van den Oord+; ’17]
[Figure: Encoder → ze(x) → codebook e1, e2, e3, ..., eK → zq(x) → WaveNet (inputs: zq(x), id)]
cf. https://www.slideshare.net/YukiSaito8/saito18sp03
https://avdnoord.github.io/homepage/vqvae/
[Kawahara+; ’99] H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3-4, pp. 187-207, 1999.
[Morise+; ’16] M. Morise, F. Yokomori, and K. Ozawa, "WORLD: a vocoder-based high-quality speech synthesis system for real-time applications," IEICE Transactions on Information and Systems, vol. E99-D, no. 7, pp. 1877-1884, 2016.
[Abe+; ’90] M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," Journal of the Acoustical Society of Japan (E), vol. 11, no. 2, pp. 71-76, 1990.
[Stylianou+; ’98] Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
[Kaneko+; ’17] T. Kaneko and H. Kameoka, "Parallel-data-free voice conversion using cycle-consistent adversarial networks," arXiv, 2017.
[Zhu+; ’17] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," 2017.
[Hinton+; ’06] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
[van den Oord+; ’17] A. van den Oord, O. Vinyals, and K. Kavukcuoglu, "Neural discrete representation learning," in NIPS, pp. 6309-6318, 2017.
[van den Oord+; ’16] A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, "WaveNet: A generative model for raw audio," 2016.

Overview of Voice Conversion and an Introduction to Recent Methods

  • 1. Copyright © DeNA Co.,Ltd. All Rights Reserved. Strictly confidential
  • 7. Speakers A and B; features: F0, bap
  • 8. F0, bap → Vocoder: STRAIGHT [Kawahara+; ’99], WORLD [Morise+; ’16]
  • 9. Vocoder
  • 10. Vocoder: F0 and bap per frame
  • 11. [Abe+; ’90][Stylianou+; ’98] A, B; F0, bap; GMM, DNN
  • 12. [Abe+; ’90][Stylianou+; ’98] B; A F0 bap; GMM, DNN; F0, bap → A
  • 13. Parallel-data: frames of A and B
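
Parallel-data methods need the frames of speaker A and speaker B to be paired before a mapping can be trained. The slide does not name an alignment method; dynamic time warping (DTW) is a common choice, and the sketch below assumes it. The function name `dtw_align` and the toy one-dimensional features are illustrative only.

```python
import math


def dtw_align(src, tgt):
    """Align two feature sequences with dynamic time warping.

    src, tgt: lists of equal-dimension feature vectors (one per frame).
    Returns a list of (src_index, tgt_index) pairs along the best path.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")

    def dist(a, b):
        # Euclidean distance between two frames.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # cost[i][j] = minimal accumulated cost aligning src[:i] with tgt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = dist(src[i - 1], tgt[j - 1]) + best_prev

    # Backtrack from the end of both sequences to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


# Toy usage: speaker B holds the first frame twice as long as speaker A.
frames_a = [[0.0], [1.0], [2.0]]
frames_b = [[0.0], [0.0], [1.0], [2.0]]
pairs = dtw_align(frames_a, frames_b)
```

Each returned pair gives one training example (a frame of A and the matching frame of B) for a frame-wise GMM or DNN mapping.
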
  • 16. Voice Conversion Challenge 2016
  • 17. Results of listening tests in VCC 2016, cf. http://vc-challenge.org/vcc2016/summary.html
  • 19. CycleGAN [Zhu+; ’17], cf. https://junyanz.github.io/CycleGAN/
  • 20. CycleGAN [Zhu+; ’17]: forward-inverse mapping and inverse-forward mapping with GX→Y and GY→X; real/fake loss [Kaneko+; ’17]; mapping loss
  • 21. CycleGAN [Zhu+; ’17]: forward-inverse mapping, inverse-forward mapping; GX→Y adversarial loss
  • 22. CycleGAN [Zhu+; ’17]: forward-inverse mapping, inverse-forward mapping; $\mathcal{L}_{adv}(G_{X \to Y}, D_Y) = \mathbb{E}_{y \sim P_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim P_{data}(x)}[\log(1 - D_Y(G_{X \to Y}(x)))]$; GY→X adversarial loss
  • 23. CycleGAN [Zhu+; ’17]: forward-inverse mapping, inverse-forward mapping; L1 loss
  • 24. CycleGAN [Zhu+; ’17]: forward-inverse mapping, inverse-forward mapping; λcyc = 10.0
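
The CycleGAN slides combine two adversarial losses (GX→Y and GY→X) with an L1 loss on the forward-inverse and inverse-forward mappings, weighted by λcyc = 10.0. A minimal sketch of how those terms add up, assuming scalar toy features and plain Python callables in place of neural networks; the function names are illustrative and not taken from the slides or from any library.

```python
import math

LAMBDA_CYC = 10.0  # cycle-consistency weight shown on the slide


def adversarial_loss(d, real, fake):
    """Sample estimate of E[log D(real)] + E[log(1 - D(fake))]."""
    real_term = sum(math.log(d(v)) for v in real) / len(real)
    fake_term = sum(math.log(1.0 - d(v)) for v in fake) / len(fake)
    return real_term + fake_term


def cycle_loss(g_xy, g_yx, xs, ys):
    """L1 loss of the forward-inverse and inverse-forward mappings."""
    forward_inverse = sum(abs(g_yx(g_xy(x)) - x) for x in xs) / len(xs)
    inverse_forward = sum(abs(g_xy(g_yx(y)) - y) for y in ys) / len(ys)
    return forward_inverse + inverse_forward


def full_objective(g_xy, g_yx, d_x, d_y, xs, ys, lambda_cyc=LAMBDA_CYC):
    """Both adversarial losses plus the weighted cycle-consistency loss."""
    adv_xy = adversarial_loss(d_y, ys, [g_xy(x) for x in xs])
    adv_yx = adversarial_loss(d_x, xs, [g_yx(y) for y in ys])
    return adv_xy + adv_yx + lambda_cyc * cycle_loss(g_xy, g_yx, xs, ys)


# Toy usage: Y is X shifted by one, so these generators are exact inverses.
xs = [0.0, 2.0]
ys = [1.0, 3.0]
undecided = lambda v: 0.5  # a discriminator that cannot tell real from fake
value = full_objective(lambda x: x + 1.0, lambda y: y - 1.0,
                       undecided, undecided, xs, ys)
```

In training, the discriminators would maximise the adversarial terms while the generators minimise the whole objective; the sketch only evaluates the value.
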
  • 25. CycleGAN parallel-data-free VC [Kaneko+; ’17]. Figure labels: "CycleGAN" and "copy"; speaker A's per-frame bap and F0 over frames t1, t2, …, tT
  • 27. Network architecture
  • 29. Variational autoencoder (VAE) [Hinton+; ’06]: input feature x → Encoder qθ(z|X) → z → Decoder pθ(X|z) → generated feature; z ~ N(z; 0, 1)
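
The VAE slide shows an encoder qθ(z|X), a latent z with prior N(z; 0, 1), and a decoder pθ(X|z). A minimal sketch of the two latent-side computations, assuming (this is not stated on the slide) that the encoder outputs a mean and a log-variance per latent dimension; the function names are illustrative.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Toy usage with a two-dimensional latent (values are assumptions).
rng = random.Random(0)
mu, log_var = [0.5, -0.5], [0.0, 0.0]
z = reparameterize(mu, log_var, rng)
penalty = kl_to_standard_normal(mu, log_var)
```

The KL term pulls the encoder output toward the N(0, 1) prior; the decoder's reconstruction term is omitted here.
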
  • 30. VQ-VAE [van den Oord+; ’17]: x → Encoder p(ze(x)|x) → ze(x) → codebook e1, e2, e3, …, eK → zq(x) → Decoder p(x|zq(x)) → x; LQ loss, VQ loss, encoder loss
  • 31. VQ-VAE [van den Oord+; ’16]: $P(\mathbf{x} \mid \mathbf{h}) = \prod_{t} p(x_t \mid x_{t-R}, x_{t-R+1}, \dots, x_{t-1}, \mathbf{h})$
  • 32. VQ-VAE [van den Oord+; ’17]: Encoder → ze(x) → codebook e1, e2, e3, …, eK → zq(x) → WaveNet, with id; https://avdnoord.github.io/homepage/vqvae/
  • 33. VQ-VAE [van den Oord+; ’17]: Encoder → ze(x) → codebook e1, e2, e3, …, eK → zq(x) → WaveNet, with id; cf. https://www.slideshare.net/YukiSaito8/saito18sp03 and https://avdnoord.github.io/homepage/vqvae/
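
In the VQ-VAE diagrams, the encoder output ze(x) is replaced by one of the codebook vectors e1, …, eK to give zq(x). A minimal sketch of that lookup, assuming the nearest-neighbour rule under squared Euclidean distance; the function name and toy codebook are illustrative.

```python
def quantize(z_e, codebook):
    """Return (k, e_k) for the codebook vector e_k closest to z_e."""

    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    k = min(range(len(codebook)), key=lambda j: sq_dist(z_e, codebook[j]))
    return k, codebook[k]


# Toy usage: K = 3 codebook vectors in two dimensions.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
index, z_q = quantize([0.9, 1.2], codebook)
```

Only the discrete index needs to be passed on, which is what lets the decoder be conditioned on something else, such as the id shown in the diagram.
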
  • 35. References
    [Kawahara+ ’99] H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
    [Morise+ ’16] M. Morise, F. Yokomori, and K. Ozawa, "WORLD: a vocoder-based high-quality speech synthesis system for real-time applications," IEICE Transactions on Information and Systems, vol. E99-D, no. 7, pp. 1877-1884, 2016.
    [Abe+ ’90] M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," J. Acoust. Soc. Jpn. (E), vol. 11, no. 2, pp. 71-76, 1990.
    [Stylianou+ ’98] Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
    [Kaneko+ ’17] T. Kaneko and H. Kameoka, "Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks," arXiv, 2017.
    [Zhu+ ’17] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," 2017.
    [Hinton+ ’06] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
    [van den Oord+ ’17] A. van den Oord, O. Vinyals, and K. Kavukcuoglu, "Neural discrete representation learning," in Advances in Neural Information Processing Systems, 2017.
    [van den Oord+ ’16] A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, "WaveNet: A generative model for raw audio," 2016.