Exploring Strategies for Training Deep Neural Networks paper review
1. Exploring Strategies for Training Deep Neural Networks
By Hugo Larochelle, Yoshua Bengio, Jerome Louradour, Pascal Lamblin
Reviewed by V B Wickramasinghe (148245F)
3. Introduction
● Training deep neural networks is hard.
● This is mainly because randomly initialized deep architectures tend to get stuck in poor solutions.
● But the ability of deep architectures to represent
complex functions is unmatched.
● This paper highlights some of the recent breakthroughs in training deep architectures that have helped to uncover their potential.
4. Deep neural networks
● Shallow architectures have been shown to be inefficient in circuit theory, Boolean logic, and neural networks.
● This is because some functions that can be represented with a finite number of units using k layers require an exponential number of units when restricted to k-1 layers.
● Also, highly varying functions can be represented compactly by stacking a number of non-linearities together.
● Another issue with shallow architectures is that they may require an exponential number of training examples to learn complex functions.
● But, as mentioned earlier, training deep architectures is hard. What is the solution?
6. Stacked Restricted Boltzmann Machine Network
● RBMs represent a generative model of the input.
● Train individual layers of RBMs using contrastive
divergence.
● Then stack them together so that one layer's output representation works as the input to the next, forming a Deep Belief Network (DBN).
● Hinton (2006) argues that this helps build a more complex representation overall.
● The pre-trained stack can then be fine-tuned for a particular task using backpropagation (see the sketch below).
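The following is a minimal NumPy sketch of this layer-wise idea, not the paper's exact setup: binary units, CD-1 updates, and the layer sizes, epochs, and learning rate are all illustrative assumptions made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.1, epochs=10):
    """Train one RBM with CD-1 on binary data (data: n_samples x n_visible)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible bias
    b_h = np.zeros(n_hidden)    # hidden bias
    for _ in range(epochs):
        for v0 in data:
            # positive phase: sample hidden units given the data
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            # negative phase: one step of Gibbs sampling (CD-1)
            p_v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(p_v1 @ W + b_h)
            # contrastive divergence parameter update
            W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
            b_v += lr * (v0 - p_v1)
            b_h += lr * (p_h0 - p_h1)
    return W, b_h

def pretrain_dbn(data, layer_sizes):
    """Greedily train a stack of RBMs; each layer's hidden activations feed the next."""
    reps, params = data, []
    for n_hidden in layer_sizes:
        W, b_h = train_rbm(reps, n_hidden)
        params.append((W, b_h))
        reps = sigmoid(reps @ W + b_h)   # propagate the representation upward
    return params

# toy usage: 200 random binary vectors, two hidden layers (sizes are arbitrary)
X = (rng.random((200, 30)) < 0.5).astype(float)
dbn_params = pretrain_dbn(X, layer_sizes=[20, 10])
```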
7. Stacked Autoassociators Network
● Like RBMs, autoassociators are a type of network that, when stacked, helps improve the input representation.
● An autoassociator is an encoder-decoder model trained to minimize the error of reconstructing its input at its output.
● A stacked autoassociator network follows the same greedy layer-wise training procedure as a DBN (sketched below).
● The reconstruction error gradient of an autoassociator and the contrastive divergence update of an RBM can both be viewed as approximations of the log-likelihood gradient (truncations of a convergent series expansion), obtained in different ways.
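Below is a similarly minimal NumPy sketch of one autoassociator layer and the greedy stacking step; tied weights, sigmoid units, and a squared reconstruction error are assumptions chosen here for brevity rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoassociator(data, n_hidden, lr=0.05, epochs=20):
    """Train one autoassociator layer to reconstruct its input (tied weights, squared error)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_h = np.zeros(n_hidden)   # encoder bias
    b_v = np.zeros(n_visible)  # decoder bias
    for _ in range(epochs):
        for x in data:
            h = sigmoid(x @ W + b_h)          # encode
            x_hat = sigmoid(h @ W.T + b_v)    # decode with tied weights
            # backprop of 0.5 * ||x_hat - x||^2 through decoder and encoder
            d_out = (x_hat - x) * x_hat * (1 - x_hat)
            d_hid = (d_out @ W) * h * (1 - h)
            W -= lr * (np.outer(x, d_hid) + np.outer(d_out, h))
            b_v -= lr * d_out
            b_h -= lr * d_hid
    return W, b_h

def pretrain_stack(data, layer_sizes):
    """Greedy layer-wise training: each layer's code becomes the next layer's input."""
    reps, params = data, []
    for n_hidden in layer_sizes:
        W, b_h = train_autoassociator(reps, n_hidden)
        params.append((W, b_h))
        reps = sigmoid(reps @ W + b_h)
    return params
```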
13. Conclusion
● DNNs are an indispensable tool for learning tasks.
● This paper presents three strategies for effectively training DNNs:
1. pre-training one layer at a time in a greedy way.
2. using unsupervised learning at each layer in a way that preserves
information from the input and disentangles factors of variation.
3. fine-tuning the whole network with respect to the ultimate criterion of interest (see the sketch after this slide).
● The experiments are sound and clearly show why deep neural networks trained with the presented strategies can significantly improve performance on learning tasks over single-layer networks.
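As a rough illustration of the third strategy, here is a hedged NumPy sketch of supervised fine-tuning: a softmax output layer is added on top of pre-trained hidden layers (as produced by either sketch above) and all parameters are updated by backpropagation of the cross-entropy loss; the function names, layer handling, and learning rate are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def finetune(params, X, y, n_classes, lr=0.05, epochs=20):
    """Fine-tune pre-trained sigmoid layers plus a new softmax output layer with backprop."""
    params = [(W.copy(), b.copy()) for W, b in params]   # hidden layers from pre-training
    W_out = 0.01 * rng.standard_normal((params[-1][0].shape[1], n_classes))
    b_out = np.zeros(n_classes)
    for _ in range(epochs):
        for x, label in zip(X, y):
            # forward pass, keeping every layer's activation for backprop
            acts = [x]
            for W, b in params:
                acts.append(sigmoid(acts[-1] @ W + b))
            probs = softmax(acts[-1] @ W_out + b_out)
            # cross-entropy gradient at the softmax pre-activation
            d_out = probs.copy()
            d_out[label] -= 1.0
            delta = (d_out @ W_out.T) * acts[-1] * (1 - acts[-1])
            W_out -= lr * np.outer(acts[-1], d_out)
            b_out -= lr * d_out
            # propagate the error down through the pre-trained sigmoid layers
            for i in range(len(params) - 1, -1, -1):
                W, b = params[i]
                grad_W, grad_b = np.outer(acts[i], delta), delta
                if i > 0:
                    delta = (delta @ W.T) * acts[i] * (1 - acts[i])
                params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params, W_out, b_out

# toy usage: random stand-ins for pre-trained layers (in practice these come from pre-training)
pretrained = [(0.01 * rng.standard_normal((30, 20)), np.zeros(20)),
              (0.01 * rng.standard_normal((20, 10)), np.zeros(10))]
X = (rng.random((200, 30)) < 0.5).astype(float)
y = rng.integers(0, 2, size=200)
finetune(pretrained, X, y, n_classes=2)
```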