Acoustic Modeling using Deep Belief Networks
                 Yueshen Xu
                 xuyueshen@163.com
                 Zhejiang University




1                                     CCNT, ZJU
Abstract
     Problem
       Achieving better phone recognition
     Method
      Deep neural networks that contain many layers of features and a
       large number of parameters
      Used in place of Gaussian Mixture Models
     Steps
      Step 1: Pre-train the network as a multi-layer generative model of
       spectral feature vectors, without making use of any discriminative
       information
      Step 2: Use backpropagation to fine-tune the features so that they
       are better at predicting a probability distribution over HMM states


2                                                                  CCNT, ZJU
Introduction
     Typical Automatic Speech Recognition System
      Models the sequential structure of speech signals: Hidden Markov
       Model (HMM)
      Spectral representation of the sound wave: HMM state + mixture of
       Gaussians + Mel-frequency Cepstral Coefficients (MFCCs)
     New research direction
        Deeper acoustic models containing many layers of features
        Feedforward neural networks
     Advantages
        The estimation of the posterior probabilities of HMM states does
         not require detailed assumptions about the data distribution
        Suitable for both discrete and continuous features


3                                                                            CCNT, ZJU
Introduction
     Comparison of MFCCs and GMMs
      MFCCs
        Partially overcome the very strong conditional independence
         assumption of HMMs
      GMMs
        Easy to fit to data using the EM algorithm
        Inefficient at modeling high-dimensional data
     Previous work on neural networks
        Using backpropagation algorithms to train neural networks
         discriminatively
        Generative modeling vs. discriminative training
        Generative modeling can make efficient use of unlabeled speech




4                                                                      CCNT, ZJU
Introduction
     Main novelty of this paper
      Achieves consistently better phone recognition performance by pre-
       training a multi-layer neural network
      One layer at a time, as a generative model
     General description
      The generative pre-training creates many layers of feature detectors
      The backpropagation algorithm then adjusts the features in every
       layer to make them more useful for discrimination




5                                                                   CCNT, ZJU
Learning a multilayer generative model
     Two vital assumptions of this paper
      Discrimination is more directly related to the underlying causes of
       the data than to the individual elements of the data itself
      A good feature-vector representation of the underlying causes can
       be recovered from the input data by modeling its higher-order
       statistical structure
     Directed view
      Fit a multilayer generative model having infinitely many layers of
       latent variables
     Undirected view
      Fit a relatively simple type of learning module that has only one
       layer of latent variables


6                                                                       CCNT, ZJU
Learning a multilayer generative model
     Undirected view
      Restricted Boltzmann Machine (RBM)
        A bipartite graph in which visible units are connected to hidden
         units by undirected weighted connections
        No visible-visible or hidden-hidden connections
      Visible units vs. hidden units
        Visible units: represent observations
        Hidden units: represent features
     RBMs in this paper
      Binary RBM
        Both hidden and visible units are binary and stochastic
      Gaussian-Bernoulli RBM
        Hidden units are binary, but visible units are linear with Gaussian
         noise

7                                                                           CCNT, ZJU
Learning a multilayer generative model
     Binary RBM
      The weights on the connections and biases of individual units
       define a probability distribution over the joint states of visible and
       hidden units via an energy function:

        $E(\mathbf{v},\mathbf{h};\theta) = -\sum_{i}\sum_{j} w_{ij} v_i h_j - \sum_{i} b_i v_i - \sum_{j} a_j h_j$

      The conditional distribution $p(\mathbf{h} \mid \mathbf{v}, \theta)$:

        $p(h_j = 1 \mid \mathbf{v}, \theta) = \sigma\Big(a_j + \sum_i w_{ij} v_i\Big)$, where $\sigma(x) = 1/(1+e^{-x})$

      The conditional distribution $p(\mathbf{v} \mid \mathbf{h}, \theta)$:

        $p(v_i = 1 \mid \mathbf{h}, \theta) = \sigma\Big(b_i + \sum_j w_{ij} h_j\Big)$

8                                                                        CCNT, ZJU
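
For concreteness, here is a minimal NumPy sketch of these two conditional distributions for a binary RBM (a sketch only; the variable names W, b_v, and b_h are my own, not from the paper):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample_hidden(v, W, b_h):
        # p(h_j = 1 | v, theta) = sigmoid(b_h[j] + sum_i W[i, j] * v[i])
        p_h = sigmoid(b_h + v @ W)          # W has shape (n_visible, n_hidden)
        return p_h, (np.random.rand(*p_h.shape) < p_h).astype(float)

    def sample_visible(h, W, b_v):
        # p(v_i = 1 | h, theta) = sigmoid(b_v[i] + sum_j W[i, j] * h[j])
        p_v = sigmoid(b_v + h @ W.T)
        return p_v, (np.random.rand(*p_v.shape) < p_v).astype(float)
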
Learning a multilayer generative model
     Learning DBN
      Updating each weight $w_{ij}$ using the difference between two
       measured, pairwise correlations:

        $\Delta w_{ij} = \varepsilon\big(\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}\big)$

     Directed view
       A sigmoid belief net consisting of multiple layers of binary
        stochastic units

     Hidden layers
      Binary features
     Visible layers
      Binary data vectors

9                                                                      CCNT, ZJU
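
RBMs of this kind are typically trained in practice with the contrastive-divergence approximation, in which one step of Gibbs sampling stands in for the model correlation. A single-step (CD-1) sketch, continuing the NumPy functions from the previous slide (illustrative only; epsilon is the learning rate):

    def cd1_update(v0, W, b_v, b_h, epsilon=0.01):
        # Positive phase: pairwise correlation <v_i h_j> measured on the data
        p_h0, h0 = sample_hidden(v0, W, b_h)
        # Negative phase: one Gibbs step gives a "reconstruction" that
        # approximates a sample from the model
        p_v1, v1 = sample_visible(h0, W, b_v)
        p_h1, _ = sample_hidden(p_v1, W, b_h)
        # delta w_ij = epsilon * (<v_i h_j>_data - <v_i h_j>_model)
        W += epsilon * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
        b_v += epsilon * (v0 - p_v1)
        b_h += epsilon * (p_h0 - p_h1)
        return W, b_v, b_h
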
Learning a multilayer generative model
      Generating data from the model
       Binary states are chosen for the top layer of hidden units
       Adjusting the weights on the top-down connections
         Performing gradient ascent in the expected log probability
      Challenge
       Getting unbiased samples from the exponentially large posterior is
        intractable
       Lack of conditional independence
      Learning with tied weights (1/2)
       Learning Context: a sigmoid belief net with an infinite number of layers
        and tied symmetric weights between layers
       The posterior can be computed by simply multiplying the visible
        vector by the transposed weight matrix



10                                                                     CCNT, ZJU
Learning a multilayer generative model

      Figure: an infinite sigmoid belief net with tied weights
       Inference is easy, since once posteriors have been sampled for the
        first hidden layer, the same process can be used for the next
        hidden layer




      Learning is a little more difficult
       Because every copy of the tied weight matrix gets different
        derivatives

11                                                                       CCNT, ZJU
Learning a multilayer generative model
      Unbiased estimate of the sum of derivatives
       h(2) can be viewed as a noisy but unbiased estimate of probabilities
        for visible units predicted by h(1)
       h(3) can be viewed as a noisy but unbiased estimate of probabilities
        for visible units predicted by h(2)




12                                                                    CCNT, ZJU
Learning a multilayer generative model
      Learning different weights in each layer
       Making the generative model more powerful by allowing different
          weights in different layers
        Step 1: Learn with all of the weight matrices tied together
        Step 2: Untie the bottom weight matrix from the other matrices
        Step 3: Freeze it as the matrix W(1)
        Step 4: Keep all remaining matrices tied together, and continue
         learning the higher matrices
        This involves first inferring h(1) from v using W(1), and then
         inferring h(2), h(3), and h(4) in a similar bottom-up manner using
         W or W^T




13                                                                          CCNT, ZJU
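
A sketch of this greedy, bottom-up procedure in the same NumPy style as the earlier slides (layer sizes, epoch count, and data layout are illustrative assumptions):

    def pretrain_dbn(data, layer_sizes, n_epochs=10):
        # data: array of shape (n_cases, layer_sizes[0])
        weights, inputs = [], data
        for n_vis, n_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
            W = 0.01 * np.random.randn(n_vis, n_hid)
            b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)
            for _ in range(n_epochs):
                for v in inputs:                 # train this layer's RBM with CD-1
                    W, b_v, b_h = cd1_update(v, W, b_v, b_h)
            weights.append((W, b_h))             # freeze this layer's matrix
            # Infer features for every case with the frozen weights; these
            # become the training data for the next RBM
            inputs = np.array([sample_hidden(v, W, b_h)[0] for v in inputs])
        return weights
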
Learning a multilayer generative model
      Deep belief net(DBN)
       Having learned K layers of features, we get a directed generative
        model called a 'Deep Belief Net'
       A DBN has K different weight matrices between its lower layers and
        an infinite number of higher layers
       This paper models the whole system as a feedforward, deterministic
        neural network
       This network is then discriminatively fine-tuned by using
        backpropagation to maximize the log probability of the correct
        HMM states




14                                                                   CCNT, ZJU
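
A sketch of the resulting deterministic forward pass, with a softmax output over HMM states added on top of the pre-trained stack (W_out, b_out, and the cross-entropy objective are standard additions assumed here, not details taken from the paper):

    def dbn_forward(v, weights, W_out, b_out):
        # Deterministic pass: each layer outputs probabilities, not samples
        h = v
        for W, b_h in weights:                   # weights from pretrain_dbn
            h = sigmoid(b_h + h @ W)
        logits = b_out + h @ W_out
        e = np.exp(logits - logits.max())        # numerically stable softmax
        return e / e.sum()                       # p(HMM state | input window)

    # Fine-tuning then minimizes -log p(correct state | input window) by
    # backpropagating through all layers.
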
Using Deep Belief Nets for Phone Recognition
      Visible units
       A context window of n successive frames of speech coefficients
      Generating phone sequences
       The resulting feedforward neural network is discriminatively trained
        to output a probability distribution over all possible labels of the
        central frame
       The pdfs over all possible labels for each frame are then fed into a
        standard Viterbi decoder




15                                                                        CCNT, ZJU
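
As a small illustration, one way to assemble such context windows, continuing the NumPy convention from the sketches above (the edge padding is my own assumption, not specified in the slides):

    def context_windows(frames, n=11):
        # frames: (n_frames, n_coeffs) array of speech coefficients.
        # Returns one concatenated n-frame window per central frame,
        # repeating the first/last frame at the utterance edges.
        assert n % 2 == 1, "use an odd window so there is a central frame"
        half = n // 2
        padded = np.concatenate([np.repeat(frames[:1], half, axis=0),
                                 frames,
                                 np.repeat(frames[-1:], half, axis=0)])
        return np.stack([padded[t:t + n].ravel() for t in range(len(frames))])
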
Conclusions
      Contributions
       This is the first application to acoustic modeling of neural
        networks in which multiple layers of features are generatively
        pre-trained
       This approach can be extended to explicitly model the covariance
        structure of the input features
       It can be used to jointly train acoustic and language models
       It can be applied to large-vocabulary tasks in place of GMMs




16                                                                     CCNT, ZJU
Thank you


17               CCNT, ZJU
