Case Study of CNN
from LeNet to ResNet
NamHyuk Ahn @ Ajou Univ.
2016. 03. 09
Convolutional Neural Network
Convolution Layer
- Convolve (3-dim dot product) the image with a filter (see the sketch below)
- Stack several filters in one layer (the blue and green outputs, called channels)
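As a rough illustration of the 3-dim dot product, here is a minimal single-filter convolution in NumPy (no padding; the names and the naive loop are my own, real layers use optimized kernels):

import numpy as np

def conv2d_single(image, filt, stride=1):
    # image: [H, W, C], filt: [k, k, C] -> one output channel, no padding
    H, W, C = image.shape
    k = filt.shape[0]
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k, :]
            out[i, j] = np.sum(patch * filt)   # 3-dim dot product of patch and filter
    return out

Stacking several such filters and concatenating their outputs gives the multi-channel output of one conv layer.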
Convolution Layer
- Local connectivity
• Instead of connecting every pixel to every neuron, connect only a local region of the input (called the receptive field)
• This greatly reduces the number of parameters
- Parameter sharing
• To reduce parameters further, all neurons in one output channel share the same filter (# of filters == # of output channels)
Convolution Layer
- Example: the 1st conv layer of AlexNet
• Input: [224, 224, 3], 96 filters of size [11x11x3] (stride 4), output: [55, 55, 96] (output size worked out below)
- Each filter extracts a different feature (e.g. horizontal edge, vertical edge, …)
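As a sanity check on the output size (the standard formula, not from the slides): with no padding, output width = (W - F) / S + 1.

W, F, S = 227, 11, 4          # AlexNet conv1 uses stride 4; 227 (rather than 224) makes the arithmetic exact
out = (W - F) // S + 1        # = 55, matching the [55, 55] output above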
Pooling Layer
- Downsample feature maps to reduce parameters and computation
- Usually max pooling (take the maximum value in each region, as sketched below)
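A minimal NumPy sketch of 2x2 max pooling on a single channel (illustrative only):

import numpy as np

def max_pool(x, size=2, stride=2):
    # x: [H, W]; keep the maximum of each size x size region
    H, W = x.shape
    out = np.zeros((H // stride, W // stride))
    for i in range(0, H - size + 1, stride):
        for j in range(0, W - size + 1, stride):
            out[i // stride, j // stride] = x[i:i+size, j:j+size].max()
    return out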
ReLU, FC Layer
- ReLU
• A kind of activation function (others: sigmoid, tanh, …)
- Fully-connected layer
• Same as in an ordinary neural network
Convolutional Neural Network
Training CNN
1. Compute the loss with forward-prop
2. Optimize the parameters w.r.t. the loss with back-prop (one step sketched below)
• Use a gradient descent method (SGD)
• The gradient of each weight is computed with the chain rule of partial derivatives
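A minimal sketch of one training step in PyTorch-style code (the toy model, dummy batch and hyperparameters are placeholders, not the slides' setup):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28*28, 10))   # stand-in for a CNN
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

images = torch.randn(32, 1, 28, 28)     # dummy mini-batch
labels = torch.randint(0, 10, (32,))

logits = model(images)                  # 1. forward-prop
loss = criterion(logits, labels)        # loss function
optimizer.zero_grad()
loss.backward()                         # 2. back-prop (chain rule of partial derivatives)
optimizer.step()                        # SGD update of the parameters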
ILSVRC trend
AlexNet (2012)
(ILSVRC 2012 winner)
AlexNet
- ReLU
- Data augmentation
- Dropout
- Ensemble of CNNs (1 CNN: 18.2%, 7 CNNs: 15.4%)
AlexNet
- Other techniques (not covered today)
• SGD + momentum (+ mini-batch)
• Multiple GPU
• Weight Decay
• Local Response Normalization
Problems of sigmoid
- Gradient vanishing
• When the gradient passes through a sigmoid it can vanish, because the local gradient of the sigmoid is almost zero in the saturated regions (see below)
- Output is not zero-centered
• which hurts optimization
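A quick numerical illustration of the vanishing local gradient (example values are my own):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([-10.0, 0.0, 10.0])
local_grad = sigmoid(x) * (1 - sigmoid(x))   # ~[0.000045, 0.25, 0.000045]
# at most 0.25 and nearly zero when saturated, so back-prop through many
# sigmoid layers keeps shrinking the gradient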
ReLU
- SGD converges faster than with sigmoid-like activations
- Computationally cheap
Data augmentation
- Randomly crop [256, 256] images to [224, 224]
- At test time, average the predictions over 5 crops (plus their horizontal flips in the paper)
Dropout
- Similar to bagging (an approximation of bagging)
- Acts like a regularizer (reduces overfitting)
- Instead of using all neurons, "drop out" some neurons at random (usually with probability 0.5)
Dropout
• At test time, no neurons are dropped; the weights are scaled instead (usually by 0.5)
• so each neuron's output equals its expected value under dropout (sketch below)
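A minimal sketch of this (non-inverted) dropout scheme; note that modern frameworks usually use the equivalent "inverted" variant that rescales by 1/p at training time instead:

import numpy as np

p = 0.5                              # keep probability
x = np.random.randn(4, 10)           # activations of one layer

mask = np.random.rand(*x.shape) < p  # training: drop each neuron with probability 1 - p
train_out = x * mask

test_out = x * p                     # test: keep all neurons, scale by the expected mask value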
Architecture
- conv - pool - … - fc - softmax (similar to LeNet)
- Uses large filters (e.g. 11x11)
Architecture
- Weights must be initialized randomly
• Otherwise all neurons receive the same gradient
• Usually a Gaussian distribution with std = 0.01
- Weights are updated with mini-batch SGD plus momentum
VGGNet (2014)
(ILSVRC 2014 2nd)
VGGNet
- Uses small kernels (always 3x3)
• Stacking them gives multiple non-linearities (e.g. ReLU)
• and fewer weights to train (see the comparison below)
- Heavier data augmentation than AlexNet
- Ensemble of 7 models (ILSVRC submission: 7.3%)
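Why stacked 3x3 kernels need fewer weights (a worked comparison; the channel count is an assumed example):

C = 64                         # assume C input and C output channels
params_7x7 = 7*7 * C*C         # one 7x7 conv layer: 49 * C^2 weights
params_3x3 = 3 * (3*3 * C*C)   # three stacked 3x3 layers: 27 * C^2 weights,
                               # same 7x7 receptive field, three ReLUs instead of one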
Architecture
- Most of the memory is used in the early layers; most of the parameters are in the fc layers.
GoogLeNet - Inception v1 (2014)
(ILSVRC 2014 winner)
GoogLeNet
Inception module
- Use 1x1, 3x3 and 5x5 convs simultaneously to capture structure at a variety of scales (sketch below)
- Dense structure is captured by the 1x1 branch, more spread-out structure by the 3x3 and 5x5 branches
- Computationally expensive
• Use 1x1 conv layers to reduce dimensionality (details later, in ResNet)
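A simplified sketch of the parallel branches in an Inception module (PyTorch-style; the channel counts are example values, and the real module also has a pooling branch):

import torch
import torch.nn as nn

class InceptionBranches(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 64, kernel_size=1)                 # 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 96, 1), nn.ReLU(),   # 1x1 reduces dims
                                nn.Conv2d(96, 128, 3, padding=1))     # before the 3x3
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, 16, 1), nn.ReLU(),
                                nn.Conv2d(16, 32, 5, padding=2))
    def forward(self, x):
        # run the branches simultaneously and concatenate along the channel axis
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)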
Auxiliary Classifiers
- A deep network raises concern about how effectively the gradient propagates in backprop
- The auxiliary losses are added to the total loss (weighted by 0.3) and removed at test time
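In code, the weighting amounts to something like the following (dummy loss values, illustrative only):

import torch
main_loss = torch.tensor(1.2)                                # dummy values
aux_loss_1, aux_loss_2 = torch.tensor(1.5), torch.tensor(1.4)
total_loss = main_loss + 0.3 * (aux_loss_1 + aux_loss_2)     # auxiliary losses weighted by 0.3
# the auxiliary classifiers are simply discarded at test time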
Average Pooling
- Proposed in Network in Network (also used in
GoogLeNet)
- Problems of the fc layer
• Needs a lot of parameters, easy to overfit
- Replace the fc layer with average pooling
Average Pooling
- Make the number of channels in the last conv layer equal to the number of classes
- Average over each channel and pass the result to softmax (sketch below)
- Reduces overfitting
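A minimal sketch of global average pooling feeding softmax (PyTorch-style; shapes are example values):

import torch

feat = torch.randn(8, 1000, 7, 7)    # last conv output: one channel per class, batch of 8
logits = feat.mean(dim=(2, 3))       # average each channel -> [8, 1000]
probs = logits.softmax(dim=1)        # straight to softmax, no fc layer needed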
MSRA ResNet (2015)
(ILSVRC 2015 winner)
Before ResNet…
- Need to know about:
• PReLU
• Xavier Initialization
• Batch Normalization
PReLU
- An adaptive version of ReLU (see the one-liner below)
- The slope for x < 0 is learned
- Slightly more parameters (# of layers x # of channels)
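A one-line sketch of PReLU, where a is the learned negative slope:

import numpy as np

def prelu(x, a):
    # a: learned slope for x < 0 (one value per channel in practice)
    return np.where(x > 0, x, a * x)

print(prelu(np.array([-2.0, -0.5, 1.0]), a=0.25))   # [-0.5, -0.125, 1.0]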
Xavier Initialization
- If weights are initialized from a Gaussian with a small std, the outputs of the neurons shrink toward zero as the network gets deep
- If the std is increased (to 1.0), the outputs saturate to -1 or 1
- Xavier init chooses the initial scale based on the number of input neurons
- Looks fine, but this derivation assumes a linear activation, so it breaks down in ReLU-like networks
(Figures: with std = 1.0 the outputs saturate; with a small std they vanish.)
Xavier Initialization / 2
(Figure slides comparing Xavier initialization and Xavier / 2 on a ReLU network.)
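A sketch of the two initializations for a fully-connected layer (NumPy; the function names are mine):

import numpy as np

def xavier_init(n_in, n_out):
    # Xavier: variance 1/n_in, derived assuming a (roughly) linear activation
    return np.random.randn(n_in, n_out) * np.sqrt(1.0 / n_in)

def xavier_over_2_init(n_in, n_out):
    # "Xavier / 2" (He et al.): variance 2/n_in, compensating for ReLU zeroing half the inputs
    return np.random.randn(n_in, n_out) * np.sqrt(2.0 / n_in)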
Batch Normalization
- Make the output (roughly) Gaussian, but full normalization is costly
• so compute the mean and variance per dimension (assuming the dimensions are uncorrelated)
• and compute them over the mini-batch (not the entire set)
- Normalization constrains the non-linearity, and the uncorrelated-dimension assumption constrains the network
• so the normalized output is linearly transformed (the scale and shift factors are learned parameters)
Batch Normalization
- At test time, use the mean and variance of the whole training set (kept as moving averages)
- BN acts like a regularizer (Dropout becomes unnecessary); see the sketch below
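A minimal sketch of the BN forward pass at training time (NumPy; eps and the names are my own):

import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    # x: [N, D] mini-batch; gamma, beta: learned scale and shift per dimension
    mu = x.mean(axis=0)                      # mean per dimension, over the mini-batch
    var = x.var(axis=0)                      # variance per dimension
    x_hat = (x - mu) / np.sqrt(var + eps)    # normalize
    return gamma * x_hat + beta              # learned linear transform

# at test time, mu and var are replaced by moving averages kept during training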
ResNet
Problem of degradation
- More depth gives more accuracy, but deep networks can suffer vanishing/exploding gradients
• BN, Xavier init and Dropout can handle this (up to ~30 layers)
- Going even deeper, the degradation problem occurs
• Not just overfitting: the training error also increases
Deep Residual Learning
- Add F(x) and the shortcut connection element-wise, then pass through a ReLU non-linearity (sketch below)
- If the dims of x and F(x) differ (the channel count changes), x is linearly projected to match (done by a 1x1 conv)
- Similar in spirit to the shortcut paths of an LSTM
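A simplified residual block (PyTorch-style; BN and the projection shortcut are omitted for brevity):

import torch.nn as nn

class BasicBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.relu = nn.ReLU()
    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))   # F(x)
        return self.relu(out + x)                    # add the shortcut, then ReLU
# when the channel count changes, x would first be projected by a 1x1 conv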
Deeper Bottleneck
- To reduce training time, modify the block into a bottleneck design (purely for economical reasons)
• (3x3)x64x64 + (3x3)x64x64 = 73,728 weights (left)
• (1x1)x256x64 + (3x3)x64x64 + (1x1)x64x256 = 69,632 weights (right)
• The right block has more width (channels) but a similar number of parameters
• A similar trick is also used in GoogLeNet
ResNet
- Data augmentation as in AlexNet
- Batch Normalization (no dropout)
- Xavier / 2 initialization
- Average pooling
- The structure follows the VGGNet style
Conclusion
(Chart: ILSVRC top-5 error by model)
- AlexNet (2012): 15.31%
- VGGNet (2014): 7.32%
- Inception-V1 (2014): 6.66%
- Human: 5.1%
- PReLU-net (2015): 4.94%
- BN-Inception (2015): 4.82%
- ResNet-152 (2015): 3.57%
- Inception-ResNet (2016): 3.1%
Conclusion
- Dropout, BN
- ReLU-like activations (e.g. PReLU, ELU, …)
- Xavier initialization
- Average pooling
- Use pre-trained model :)
Reference
- Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep
convolutional neural networks." Advances in neural information processing systems. 2012.
- Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image
recognition." arXiv preprint arXiv:1409.1556 (2014).
- Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).
- He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet
classification." Proceedings of the IEEE International Conference on Computer Vision. 2015.
- He, Kaiming, et al. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385
(2015).
- Szegedy, Christian, Sergey Ioffe, and Vincent Vanhoucke. "Inception-v4, Inception-ResNet and the
Impact of Residual Connections on Learning." arXiv preprint arXiv:1602.07261 (2016).
- Gu, Jiuxiang, et al. "Recent Advances in Convolutional Neural Networks." arXiv preprint arXiv:1512.07108 (2015). (good as a tutorial)
- Thanks also to CS231n; some figures are taken from the CS231n lecture slides.
See http://cs231n.stanford.edu/index.html