Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
#17
2019/02/06
@iiou16_tech
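abstract
• Training Deep Neural Networks is difficult because the distribution of each layer's inputs changes during training as the parameters of the previous layers change
• Batch Normalization normalizes the layer inputs for each training mini-batch
• It allows higher learning rates and less careful initialization, and in some cases removes the need for Dropout
• With Batch Normalization, an image classification model reaches the same accuracy with 14 times fewer training steps
• On ImageNet, an ensemble of batch-normalized networks reaches 4.9% top-5 validation error (4.8% test error)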
outline
1. Introduction
2. Towards Reducing Internal Covariate Shift
3. Normalization via Mini-Batch Statistics
  3.1 Training and Inference with Batch-Normalized Networks
  3.2 Batch-Normalized Convolutional Networks
  3.3 Batch Normalization enables higher learning rates
  3.4 Batch Normalization regularizes the model
4. Experiments
  4.1 Activations over time
  4.2 ImageNet classification
5. Conclusion
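Introduction
• Deep Learning models are usually trained with SGD
– For inputs x, the parameters θ are chosen to minimize the loss l
– θ is updated a little at each step along the gradient of l
• Each step uses a mini-batch of m examples
• The mini-batch gradient is an estimate of the gradient over the whole training set
• Processing m examples at once is more efficient than processing 1 example m times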
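For reference, the SGD objective and the mini-batch gradient estimate that this slide refers to can be written as below. This is a sketch that assumes N training examples x_1..x_N and mini-batches of size m; N and the index i do not appear on the slide.

```latex
\theta = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} l(x_i, \theta),
\qquad
\frac{1}{m} \sum_{i=1}^{m} \frac{\partial\, l(x_i, \theta)}{\partial \theta}
\;\approx\;
\frac{1}{N} \sum_{i=1}^{N} \frac{\partial\, l(x_i, \theta)}{\partial \theta}
```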
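Introduction
• covariate shift
– The input distribution of a learning system changes
– Example: a DNN trained on inputs x is later given inputs x' that follow a different distribution, so the DNN has to adapt from x to x'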
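Introduction
• covariate shift also occurs inside the network (internal covariate shift)
– Split the DNN into two sub-networks F1 and F2
– The input x of F2 is the output of F1
→ When the parameters of F1 change during training, the distribution of the x that F2 receives changes too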
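The two-sub-network view can be written out as follows. This is a sketch in the notation of the Batch Normalization paper: u (the network input), Θ1 and Θ2 (the parameters of F1 and F2), α (the learning rate) and m (the mini-batch size) are taken from the paper, not from the slide.

```latex
\ell = F_2\bigl(F_1(u, \Theta_1), \Theta_2\bigr),
\qquad
x = F_1(u, \Theta_1) \;\Rightarrow\; \ell = F_2(x, \Theta_2)

\Theta_2 \leftarrow \Theta_2 - \frac{\alpha}{m} \sum_{i=1}^{m} \frac{\partial F_2(x_i, \Theta_2)}{\partial \Theta_2}
```

The update of Θ2 is exactly what a stand-alone network F2 with input x would do, which is why a stable distribution of x helps F2 in the same way a stable input distribution helps a whole network.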
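2 Towards Reducing Internal Covariate Shift
• Whitening
– Training (of a DNN) is known to converge faster when the inputs are whitened
– Each layer inside the DNN sees the outputs of the layers below as its inputs
– Whitening: linearly transform the inputs to mean 0 and variance 1 (and decorrelate them)
– Whitening the inputs of every layer should therefore help in the same way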
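2 Towards Reducing Internal Covariate Shift
• Problems with whitening the activations directly
• The statistics have to be recomputed as the parameters change
• This could be done at every iteration or at some interval
• If the normalization is computed outside the gradient step (SGD), the parameter update can be undone by the normalization
• Full whitening is also expensive, because it needs the covariance matrix of the activations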
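3 Normalization via Mini-Batch Statistics
• Full whitening of each layer's inputs is costly, so two simplifications are made
• Simplification 1: normalize each scalar feature independently
• Each feature is given mean 0 and variance 1
• Normalization alone may change what the layer can represent
• Learned parameters γ and β scale and shift the normalized x, so the original activation can be recovered if needed
• Simplification 2: use mini-batch statistics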
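The transform described on this slide (normalize each feature over the mini-batch to mean 0 and variance 1, then scale by γ and shift by β) can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `batch_norm`, the list-of-lists layout, and the default `eps=1e-5` (a small constant for numerical stability) are choices made here.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch given as a list of m feature vectors.

    gamma and beta hold one scale and one shift value per feature.
    Returns the transformed batch plus the per-feature mean and variance.
    """
    m = len(batch)          # mini-batch size
    d = len(batch[0])       # number of features
    # Per-feature statistics over the mini-batch (biased variance).
    mean = [sum(x[k] for x in batch) / m for k in range(d)]
    var = [sum((x[k] - mean[k]) ** 2 for x in batch) / m for k in range(d)]
    out = []
    for x in batch:
        # Normalize, then scale and shift each feature independently.
        out.append([
            gamma[k] * (x[k] - mean[k]) / math.sqrt(var[k] + eps) + beta[k]
            for k in range(d)
        ])
    return out, mean, var
```

With `gamma` set to all ones and `beta` to all zeros, every output column has mean 0 and variance close to 1; other values of `gamma` and `beta` let the layer undo the normalization if that is what training prefers.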
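3 Normalization via Mini-Batch Statistics
• Simplification 2: estimate the statistics from each mini-batch
• The mean 0 / variance 1 normalization then fits into SGD training of the DNN
• The mean/variance of each activation are computed over the current mini-batch
• They are recomputed at every iteration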
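3.1 Training and Inference with Batch-Normalized Networks
• Training
• The mini-batch mean/variance are used for normalization
• Inference
• The population mean/variance are used instead of the mini-batch ones, so the output depends only on the input
• With a fixed mean/variance, the normalization of each activation is a fixed linear transform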
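The difference between the training and inference paths can be sketched as below for a single activation. This is an illustration under assumptions: the class name `BatchNorm1D`, the exponential moving average and its `momentum` value are hypothetical choices (the paper only says that moving averages of the mini-batch statistics can be tracked), and the `m / (m - 1)` factor is the unbiased variance estimate used for the population variance.

```python
import math


class BatchNorm1D:
    """Batch normalization of one scalar activation (illustrative sketch)."""

    def __init__(self, gamma=1.0, beta=0.0, momentum=0.9, eps=1e-5):
        self.gamma = gamma
        self.beta = beta
        self.momentum = momentum      # hypothetical smoothing factor
        self.eps = eps
        self.running_mean = 0.0       # estimate of the population mean
        self.running_var = 1.0        # estimate of the population variance

    def forward(self, xs, training):
        if training:
            m = len(xs)
            mean = sum(xs) / m
            var = sum((x - mean) ** 2 for x in xs) / m
            # Track population statistics with a moving average.
            unbiased = var * m / (m - 1) if m > 1 else var
            k = self.momentum
            self.running_mean = k * self.running_mean + (1 - k) * mean
            self.running_var = k * self.running_var + (1 - k) * unbiased
        else:
            # Inference: fixed statistics, independent of the batch content.
            mean, var = self.running_mean, self.running_var
        scale = self.gamma / math.sqrt(var + self.eps)
        return [scale * (x - mean) + self.beta for x in xs]
```

In inference mode the output for a given input no longer depends on which other examples are in the batch, which is the property the slide is after.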
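3.2 Batch-Normalized Convolutional Networks
• BN for Convolutional layers
• All locations of the same feature map should be normalized in the same way (the convolutional property)
• BN statistics are therefore computed jointly over the mini-batch and all spatial locations
• Mini-batch size: m
• Conv BN with feature maps of size p*q: effective mini-batch size m*p*q
• Conv BN parameters: 2 * (number of feature maps), one γ and one β per feature map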
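The per-feature-map statistics can be sketched as below: for each feature map, the mean and variance are taken over all m*p*q values in the mini-batch, and one (γ, β) pair is applied to the whole map. The function name `conv_batch_norm`, the nested-list layout `batch[n][c][i][j]` and `eps` are assumptions made for this illustration.

```python
import math


def conv_batch_norm(batch, gamma, beta, eps=1e-5):
    """Batch-normalize conv activations stored as batch[n][c][i][j]."""
    m = len(batch)                 # mini-batch size
    channels = len(batch[0])       # number of feature maps
    p = len(batch[0][0])           # feature-map height
    q = len(batch[0][0][0])        # feature-map width
    count = m * p * q              # effective mini-batch size per feature map
    out = [[[[0.0] * q for _ in range(p)] for _ in range(channels)]
           for _ in range(m)]
    for c in range(channels):
        # Pool every value of feature map c across examples and locations.
        values = [batch[n][c][i][j]
                  for n in range(m) for i in range(p) for j in range(q)]
        mean = sum(values) / count
        var = sum((v - mean) ** 2 for v in values) / count
        scale = gamma[c] / math.sqrt(var + eps)
        for n in range(m):
            for i in range(p):
                for j in range(q):
                    out[n][c][i][j] = scale * (batch[n][c][i][j] - mean) + beta[c]
    return out
```

`gamma` and `beta` each have one entry per feature map, so the layer adds two parameters per feature map regardless of the spatial size p*q.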
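3.3 Batch Normalization enables higher learning rates
• In deep networks, a learning rate that is too high can make gradients explode or vanish
• With BN, back-propagation through a layer is not affected by the scale of its parameters
– If the weights are multiplied by a scalar a, the output of BN does not change
– The gradient with respect to the weights is multiplied by 1/a, so larger weights receive smaller gradients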
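The scale property behind the a and 1/a on this slide can be stated as below. The notation follows the Batch Normalization paper and is an addition here: W is the layer's weight matrix, u its input, and a a scalar.

```latex
\mathrm{BN}(Wu) = \mathrm{BN}\bigl((aW)u\bigr),
\qquad
\frac{\partial\, \mathrm{BN}\bigl((aW)u\bigr)}{\partial u}
  = \frac{\partial\, \mathrm{BN}(Wu)}{\partial u},
\qquad
\frac{\partial\, \mathrm{BN}\bigl((aW)u\bigr)}{\partial (aW)}
  = \frac{1}{a} \cdot \frac{\partial\, \mathrm{BN}(Wu)}{\partial W}
```

The first identity holds because multiplying Wu by a multiplies both its mini-batch mean and its standard deviation by a, so the factor cancels in the normalization.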



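3.4 Batch Normalization regularizes the model
• Each training example is normalized together with the other examples in its mini-batch, so the network output for one example is no longer deterministic
• BN therefore acts as a regularizer
• Dropout can be removed or reduced in strength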
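4 Experiments
4.1 Activations over time
• Check the effect of internal covariate shift on training, and whether BN reduces it
• Dataset: MNIST
• Model: NN with 3 hidden layers (fully connected, 100 sigmoid units each)
• Mini-batch size 60, 50000 training steps
(Figure: results without BN vs. with BN)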
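4.2 ImageNet classification
• A variant of Inception trained on ImageNet
• Nonlinearity: ReLU
• CNN layer change: each 5*5 convolution is replaced by two consecutive 3*3 convolutions (5*5 → 3*3 ×2)
• batch size = 32
• Optimiser: Momentum SGD
https://arxiv.org/pdf/1409.4842.pdf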
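4.2.1 Accelerating BN Networks
• BASE: Inception with BN added
• Changes made to get the full benefit of BN
• Increase the learning rate
• Remove Dropout
• Reduce the L2 Weight regularization to 1/5
• Decay the learning rate 6 times faster
• Shuffle the training examples more thoroughly
• This improved validation accuracy by about 1%
• Reduce the photometric distortion
• Remove Local Response Normalization
https://arxiv.org/pdf/1409.4842.pdf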
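4.2.2 Single-Network Classification
Networks compared, all trained on LSVRC2012:
• Inception: the network of 4.2 without BN, lr=0.0015
• BN-Baseline: Inception with BN
• BN-x5: Inception with BN and the changes of 4.2.1, lr=0.0075
• BN-x30: same as BN-x5 with the changes of 4.2.1, lr=0.045
• BN-x5-Sigmoid: same as BN-x5, with ReLU→sigmoid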
4.2.3 Ensemble Classification
• Best Result on ImageNet at the time
• An ensemble of 6 networks based on BN-x30 exceeded the previous SoTA
• Dropout (5% or 10%) was used in these networks
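Conclusion(1/2)
• BN is a method that greatly accelerates the training of deep networks
• It is based on the idea that covariate shift inside the NN (internal covariate shift) makes training harder
• Normalizing the activations, and making that normalization part of the DNN architecture, reduces it
• BN works with SGD and adds only 2 parameters per activation
• BN allows higher learning rates
• BN makes training less sensitive to initialization
• BN reduces the need for Dropout
• An ensemble of BN networks improved on the best published ImageNet result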
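Conclusion(2/2)
• BN resembles the Standardization layer
• Differences from BN:
– Batch Normalization (BN): learned scale and shift (2 parameters); normalizes the input of the activation
– Standardization layer (SL): no parameter; normalizes the output of the activation
• future work
• Applying BN to Recurrent Neural Networks
• Whether BN helps with vanishing/exploding gradients there
• domain adaptation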
Appendix 1
• Google has filed a patent application on BN
• https://patents.google.com/patent/US20160217368A1/en
A neural network system implemented by one or more computers, the neural network system comprising:
a batch normalization layer between a first neural network layer and a second neural network layer, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the batch normalization layer is configured to, during training of the neural network system on a batch of training examples:
receive a respective first layer output for each training example in the batch;
compute a plurality of normalization statistics for the batch from the first layer outputs;
normalize each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch;
generate a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and
provide the batch normalization layer output as an input to the second neural network layer.
• ※
• https://www.slideshare.net/YosukeShinya/ss-125937523
• by 50 @2018/12/15
Appendix 2
• Alternatives to BN
• Group Normalization
• Fixup
