Exploring Strategies for Training Deep Neural Networks paper review
1. Exploring Strategies for Training Deep Neural Networks
By Hugo Larochelle, Yoshua Bengio, Jerome Louradour, Pascal Lamblin
Reviewed by V B Wickramasinghe (148245F)
3. Introduction
● Training deep neural networks is hard.
● This is mainly because randomly initialized deep architectures tend to get stuck in poor solutions.
● But the ability of deep architectures to represent
complex functions is unmatched.
● This paper highlights some of the recent breakthroughs in training deep architectures that have helped to uncover their potential.
4. Deep neural networks
● Shallow architectures have been shown to be inefficient in circuit theory, Boolean logic, and neural networks.
● This is because some functions that can be represented with a finite number of units using k layers require an exponential number of units when restricted to k-1 layers.
● Also, highly varying functions can be represented compactly by stacking a number of non-linearities together.
● Another issue with shallow architectures is that they may require an exponential number of training examples to learn complex functions.
● But, as mentioned earlier, training deep architectures is hard. What is the solution?
6. Stacked Restricted Boltzmann Machine Network
● RBMs represent a generative model of the input.
● Train individual layers of RBMs using contrastive
divergence.
● Then stack them together so that one layer's output representation works as the input to the next, forming a Deep Belief Network (DBN).
● Hinton (2006) argues that this helps build a more complex representation overall.
● The pre-trained stack can then be fine-tuned for a particular task using backpropagation (see the sketch below).
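The following is a minimal NumPy sketch of this layer-wise idea, not the paper's exact setup: binary units, CD-1 updates, and the layer sizes, epochs, and learning rate are all illustrative assumptions made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.1, epochs=10):
    """Train one RBM with CD-1 on binary data (data: n_samples x n_visible)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible bias
    b_h = np.zeros(n_hidden)    # hidden bias
    for _ in range(epochs):
        for v0 in data:
            # positive phase: sample hidden units given the data
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            # negative phase: one step of Gibbs sampling (CD-1)
            p_v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(p_v1 @ W + b_h)
            # contrastive divergence parameter update
            W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
            b_v += lr * (v0 - p_v1)
            b_h += lr * (p_h0 - p_h1)
    return W, b_h

def pretrain_dbn(data, layer_sizes):
    """Greedily train a stack of RBMs; each layer's hidden activations feed the next."""
    reps, params = data, []
    for n_hidden in layer_sizes:
        W, b_h = train_rbm(reps, n_hidden)
        params.append((W, b_h))
        reps = sigmoid(reps @ W + b_h)   # propagate the representation upward
    return params

# toy usage: 200 random binary vectors, two hidden layers (sizes are arbitrary)
X = (rng.random((200, 30)) < 0.5).astype(float)
dbn_params = pretrain_dbn(X, layer_sizes=[20, 10])
```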
7. Stacked Autoassociators Network
● Like RBMs, autoassociators are a type of network that, when stacked, helps improve the input representation.
● An autoassociator is an encoder-decoder model trained to minimize the error of reconstructing its input at its output.
● A stacked autoassociator network follows the same greedy layer-wise training procedure as a DBN (sketched below).
● The reconstruction error gradient of an autoassociator and the contrastive divergence update of an RBM can both be viewed as approximations of the log-likelihood gradient (truncations of a convergent series expansion), obtained in different ways.
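Below is a similarly minimal NumPy sketch of one autoassociator layer and the greedy stacking step; tied weights, sigmoid units, and a squared reconstruction error are assumptions chosen here for brevity rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoassociator(data, n_hidden, lr=0.05, epochs=20):
    """Train one autoassociator layer to reconstruct its input (tied weights, squared error)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_h = np.zeros(n_hidden)   # encoder bias
    b_v = np.zeros(n_visible)  # decoder bias
    for _ in range(epochs):
        for x in data:
            h = sigmoid(x @ W + b_h)          # encode
            x_hat = sigmoid(h @ W.T + b_v)    # decode with tied weights
            # backprop of 0.5 * ||x_hat - x||^2 through decoder and encoder
            d_out = (x_hat - x) * x_hat * (1 - x_hat)
            d_hid = (d_out @ W) * h * (1 - h)
            W -= lr * (np.outer(x, d_hid) + np.outer(d_out, h))
            b_v -= lr * d_out
            b_h -= lr * d_hid
    return W, b_h

def pretrain_stack(data, layer_sizes):
    """Greedy layer-wise training: each layer's code becomes the next layer's input."""
    reps, params = data, []
    for n_hidden in layer_sizes:
        W, b_h = train_autoassociator(reps, n_hidden)
        params.append((W, b_h))
        reps = sigmoid(reps @ W + b_h)
    return params
```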
13. Conclusion
● DNNs are an indispensable tool for learning tasks.
● This paper presents three strategies for effectively training DNNs:
1. pre-training one layer at a time in a greedy way.
2. using unsupervised learning at each layer in a way that preserves
information from the input and disentangles factors of variation.
3. fine-tuning the whole network with respect to the ultimate criterion of interest (see the sketch after this slide).
● The experiments are sound and clearly show why deep neural networks trained with the presented strategies can significantly improve performance on learning tasks over single-layer networks.
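As a rough illustration of the third strategy, here is a hedged NumPy sketch of supervised fine-tuning: a softmax output layer is added on top of pre-trained hidden layers (as produced by either sketch above) and all parameters are updated by backpropagation of the cross-entropy loss; the function names, layer handling, and learning rate are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def finetune(params, X, y, n_classes, lr=0.05, epochs=20):
    """Fine-tune pre-trained sigmoid layers plus a new softmax output layer with backprop."""
    params = [(W.copy(), b.copy()) for W, b in params]   # hidden layers from pre-training
    W_out = 0.01 * rng.standard_normal((params[-1][0].shape[1], n_classes))
    b_out = np.zeros(n_classes)
    for _ in range(epochs):
        for x, label in zip(X, y):
            # forward pass, keeping every layer's activation for backprop
            acts = [x]
            for W, b in params:
                acts.append(sigmoid(acts[-1] @ W + b))
            probs = softmax(acts[-1] @ W_out + b_out)
            # cross-entropy gradient at the softmax pre-activation
            d_out = probs.copy()
            d_out[label] -= 1.0
            delta = (d_out @ W_out.T) * acts[-1] * (1 - acts[-1])
            W_out -= lr * np.outer(acts[-1], d_out)
            b_out -= lr * d_out
            # propagate the error down through the pre-trained sigmoid layers
            for i in range(len(params) - 1, -1, -1):
                W, b = params[i]
                grad_W, grad_b = np.outer(acts[i], delta), delta
                if i > 0:
                    delta = (delta @ W.T) * acts[i] * (1 - acts[i])
                params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params, W_out, b_out

# toy usage: random stand-ins for pre-trained layers (in practice these come from pre-training)
pretrained = [(0.01 * rng.standard_normal((30, 20)), np.zeros(20)),
              (0.01 * rng.standard_normal((20, 10)), np.zeros(10))]
X = (rng.random((200, 30)) < 0.5).astype(float)
y = rng.integers(0, 2, size=200)
finetune(pretrained, X, y, n_classes=2)
```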