Short Story Submission
Rimzim Thube
SJSU ID- 014555021
A Survey of Convolutional Neural
Networks:
Analysis, Applications, and Prospects
Zewen Li, Wenjie Yang, Shouheng Peng, Fan Liu, Member, IEEE
Introduction to Convolutional Neural Networks (CNNs)
• Applications using CNNs:
• Face recognition
• Autonomous vehicles
• Self-service supermarkets
• Intelligent medical treatment
Emergence of CNN
• McCulloch and Pitts – First mathematical model of neurons (the MP model)
• Rosenblatt – Added learning capability to the MP model
• Hinton – Proposed a multi-layer feedforward network trained by error Back Propagation (the BP network)
• Waibel – Time Delay Neural Network (TDNN) for speech recognition
• LeCun – First convolutional network (LeNet) to recognize handwritten text
Overview of CNN
• A kind of feedforward neural network
• Extracts features from data using convolution structures
• Architecture inspired by visual perception
• A biological neuron corresponds to an artificial neuron
• CNN kernels represent different receptors that respond to various features
• Activation functions transmit a signal to the next neuron only if it exceeds a certain threshold
• Loss functions and optimizers teach the whole CNN system to learn
Advantages of CNN
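• Local connections – Each neuron is connected to only a small number of neurons in the previous layer, not all of them. This reduces parameters and speeds up convergence.
• Weight sharing – A group of connections shares the same weights, which reduces parameters further
• Down-sampling dimensionality reduction – Pooling reduces the amount of data while retaining useful information
• These characteristics make CNNs one of the most representative algorithms in deep learning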
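To make the parameter savings concrete, here is a minimal sketch that counts weights for one layer under three schemes. The layer sizes (32 × 32 input, 28 × 28 output, 5 × 5 patch) are assumptions chosen for illustration, and biases are ignored.

```python
def dense_params(in_h, in_w, out_h, out_w):
    """Weights when every output neuron sees every input pixel."""
    return (in_h * in_w) * (out_h * out_w)


def local_params(out_h, out_w, k):
    """Weights when each output neuron has its own k x k patch (no sharing)."""
    return out_h * out_w * k * k


def shared_params(k):
    """Weights when all output neurons share a single k x k kernel."""
    return k * k


# Hypothetical sizes: single-channel 32 x 32 input, 28 x 28 output, 5 x 5 patch.
fully_connected = dense_params(32, 32, 28, 28)   # 802816
locally_connected = local_params(28, 28, 5)      # 19600
weight_shared = shared_params(5)                 # 25
```

Local connections cut the count by roughly a factor of 40 in this setting, and weight sharing reduces it to the size of one kernel.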
Components of CNN
• Convolution – The pivotal step for feature extraction. Its output is a feature map
• Padding – Enlarges the input with zero values so that information at the border is not lost
• Stride – Controls the density of convolution; a larger stride gives a lower density
• Pooling – Down-sampling that removes redundancy from the feature map
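The four components can be sketched in plain Python. This is an illustrative single-channel version, not an efficient implementation. The helper names and the sample image and kernel are assumptions, and the "convolution" is the cross-correlation form that most CNN libraries use.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1


def conv2d(image, kernel, padding=0, stride=1):
    """Single-channel 2-D convolution (cross-correlation) on nested lists."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Padding: surround the input with zeros on all four sides.
    padded = [[0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = image[i][j]
    out_h = conv_output_size(h, kh, padding, stride)
    out_w = conv_output_size(w, kw, padding, stride)
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Stride: the window moves `stride` pixels between outputs.
            out[i][j] = sum(
                padded[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            )
    return out


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [max(fmap[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, len(fmap[0]) - size + 1, size)]
        for i in range(0, len(fmap) - size + 1, size)
    ]


image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [1, 0, 1, 3]]
kernel = [[1, 0],
          [0, 1]]

feature_map = conv2d(image, kernel)             # 3 x 3 feature map
padded_map = conv2d(image, kernel, padding=1)   # 5 x 5 feature map
pooled = max_pool2d(feature_map)                # 1 x 1 after 2 x 2 pooling
```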
LeNet-5
• Composed of 7 trainable layers: 2 convolutional layers, 2 pooling layers, and 3 fully-connected layers
• Local receptive fields, shared weights, and spatial or temporal subsampling ensure a degree of shift, scale, and distortion invariance
• Used for handwriting recognition
AlexNet
• Has 8 layers: 5 convolutional layers and 3 fully-connected layers
• Uses ReLU as the activation function to mitigate gradient vanishing
• Dropout is used in the last few layers to avoid overfitting
• Local Response Normalization (LRN) is used to enhance the generalization of the model
AlexNet
• Trained on 2 powerful GPUs; the feature maps generated by the two GPUs can be combined as the final output
• Enlarges the dataset through data augmentation during training, and averages the predictions over the augmented versions of a test image as the final result
• Principal Component Analysis (PCA) is used to change the RGB values of the training set
VGGNet
• LRN layer was removed
• VGGNets use 3 × 3 convolution kernels rather than 5 × 5 or 7 × 7 ones, since a stack of several small kernels has the same receptive field as one larger kernel, with more nonlinear variations.
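The receptive-field claim can be checked with a short calculation. This sketch assumes stride-1 convolutions and a hypothetical channel count of 64, and ignores biases. It also shows that the stacked 3 × 3 layers use fewer weights.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stride-1 convolutions stacked in sequence."""
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count with `channels` input and output channels per layer."""
    return num_layers * kernel_size * kernel_size * channels * channels


C = 64  # hypothetical channel count

rf_two = stacked_receptive_field(3, 2)     # same field as one 5 x 5 kernel
rf_three = stacked_receptive_field(3, 3)   # same field as one 7 x 7 kernel

two_3x3, one_5x5 = conv_weights(3, C, 2), conv_weights(5, C)
three_3x3, one_7x7 = conv_weights(3, C, 3), conv_weights(7, C)
```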
GoogLeNet - Inception v1
• A CNN formed by stacking Inception modules
• Inception v1 deploys 1 × 1, 3 × 3, and 5 × 5 convolution kernels in parallel to construct a “wide” network
• Convolution kernels of different sizes extract feature maps at different scales of the image
• 1 × 1 convolution kernels are used to reduce the number of channels, i.e., to reduce computational cost
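A rough cost estimate shows why the 1 × 1 reduction helps. The feature-map size and channel counts below are assumptions for illustration, not figures quoted from the slides. The sketch counts multiply-accumulate operations for a 5 × 5 convolution with and without a 1 × 1 layer in front of it.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a stride-1, same-padded k x k convolution."""
    return h * w * c_in * c_out * k * k


# Hypothetical layer: 28 x 28 feature map, 192 input channels, 32 output
# channels, with an optional 1 x 1 reduction to 16 channels.
H, W, C_IN, C_OUT, C_MID = 28, 28, 192, 32, 16

direct = conv_macs(H, W, C_IN, C_OUT, 5)
reduced = conv_macs(H, W, C_IN, C_MID, 1) + conv_macs(H, W, C_MID, C_OUT, 5)
```

With these numbers the reduced path needs about a tenth of the operations of the direct 5 × 5 convolution.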
GoogLeNet - Inception v2
• The output of every layer is normalized (batch normalization) to increase the robustness of the model and allow training with a relatively high learning rate
• A single 5 × 5 convolutional layer can be replaced by two 3 × 3 ones
• One n × n convolutional layer can be replaced by one 1 × n and one n × 1 convolutional layer
• Filter banks are expanded (made wider) to improve high-dimensional representations
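The n × n factorization can be illustrated with a toy example. This sketch assumes a 3 × 3 kernel that happens to be the outer product of a column and a row, with no nonlinearity between the two steps. A general n × n kernel is not exactly separable, so in a real network the factorized pair is learned rather than derived. The helper name and sample values are hypothetical.

```python
def correlate2d(image, kernel):
    """Valid (no padding), stride-1 2-D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(image[i + a][j + b] * kernel[a][b]
             for a in range(kh) for b in range(kw))
         for j in range(len(image[0]) - kw + 1)]
        for i in range(len(image) - kh + 1)
    ]


row = [[1, 2, 1]]          # 1 x 3 kernel: 3 weights
col = [[1], [0], [-1]]     # 3 x 1 kernel: 3 weights
# The 3 x 3 kernel that the pair represents (outer product): 9 weights.
full = [[c[0] * r for r in row[0]] for c in col]

image = [[1, 2, 0, 1],
         [3, 1, 2, 2],
         [0, 1, 4, 1],
         [2, 2, 1, 0]]

one_step = correlate2d(image, full)
two_steps = correlate2d(correlate2d(image, row), col)
```

Both paths give the same 2 × 2 output, while the factorized pair stores 2n weights instead of n².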
ResNet
• A two-layer residual block is constructed with a shortcut connection
• The 50-layer, 101-layer, and 152-layer ResNets use three-layer residual blocks
• The three-layer residual block is also called the bottleneck module: the 1 × 1 convolutions at its two ends reduce and then restore the number of channels, so the 3 × 3 convolution in the middle operates on fewer channels
• Mitigates the gradient vanishing problem, since the gradient can flow directly through the shortcut connections
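A scalar toy model shows the effect of the shortcut on gradients. Each "layer" here is assumed to be a single multiplication by a small weight, which is a heavy simplification of a real residual block. The weights and depth are hypothetical.

```python
def plain_stack(x, weights):
    """Plain network: each layer computes x <- w * x."""
    for w in weights:
        x = w * x
    return x


def residual_stack(x, weights):
    """Residual network: each block computes x <- x + w * x (shortcut + branch)."""
    for w in weights:
        x = x + w * x
    return x


def numerical_grad(f, x, eps=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


weights = [0.1] * 20  # hypothetical small weights for 20 layers

plain_grad = numerical_grad(lambda x: plain_stack(x, weights), 1.0)
residual_grad = numerical_grad(lambda x: residual_stack(x, weights), 1.0)
```

In the plain stack the gradient is the product of the weights (about 1e-20 here). In the residual stack every factor is 1 + w, so the gradient does not shrink toward zero.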
DCGAN
• A GAN has a generative model G and a discriminative model D
• Given random noise z, the model G generates a sample G(z) that follows the data distribution learned by G
• The model D determines whether an input sample is real data x or generated data G(z)
• Both G and D can be nonlinear functions. The aim of G is to generate data that looks as real as possible; the aim of D is to distinguish fake data generated by G from the real data
MobileNets
• Lightweight models proposed by Google for embedded devices such as mobile phones
• Use depth-wise separable convolutions and several advanced techniques to build thin deep neural networks
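The saving from a depth-wise separable convolution can be estimated by counting weights. The kernel size and channel counts below are assumptions for illustration, and biases are ignored.

```python
def standard_conv_weights(k, c_in, c_out):
    """A standard convolution filters and mixes channels in one step."""
    return k * k * c_in * c_out


def depthwise_separable_weights(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


K, C_IN, C_OUT = 3, 128, 256  # hypothetical layer

standard = standard_conv_weights(K, C_IN, C_OUT)
separable = depthwise_separable_weights(K, C_IN, C_OUT)
ratio = separable / standard  # equals 1 / C_OUT + 1 / K**2
```

For a 3 × 3 kernel the ratio is a little over 1/9, so the separable layer uses roughly an eighth to a ninth of the weights.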
ShuffleNets
• A series of CNN-based models that address the insufficient computing power of mobile devices
• Combine pointwise group convolution and channel shuffle, which significantly reduce the computational cost with little loss of accuracy
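Channel shuffle is commonly described as a reshape, a transpose, and a flatten over the channel axis. The sketch below applies that idea to a flat list of channel labels. The function name and labels are hypothetical, and a real implementation would operate on tensors.

```python
def channel_shuffle(channels, groups):
    """Reshape to (groups, n), transpose to (n, groups), then flatten."""
    assert len(channels) % groups == 0
    n = len(channels) // groups
    grouped = [channels[g * n:(g + 1) * n] for g in range(groups)]
    return [grouped[g][i] for i in range(n) for g in range(groups)]


# Two groups of three channels each, as produced by a group convolution.
channels = ["a0", "a1", "a2", "b0", "b1", "b2"]
shuffled = channel_shuffle(channels, groups=2)
```

After the shuffle, each consecutive block of channels contains members of every group, so the next group convolution sees information from all groups.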
GhostNet
• Existing CNNs extract large amounts of redundant features for image cognition; GhostNet is designed to reduce this computational cost
• Similar feature maps in traditional convolution layers are called ghosts
• A traditional convolution layer is divided into two parts
• In the first part, fewer convolution kernels are used directly for feature extraction
• In the second part, these features are processed by linear transformations to produce multiple feature maps
• The authors showed that the Ghost module also applies to other CNN models
Activation function
• In a multilayer neural network, the function between two layers is called the activation function
• Determines which information should be transmitted to the next neuron
• Without an activation function, the output of the network is only a linear function of its input
• Nonlinear functions are introduced as activation functions to enhance the expressive ability of the neural network
Types of activation function
• Sigmoid maps a real number to (0, 1), so it can be used for binary classification problems
• Tanh maps a real number to (-1, 1); its output is centered on 0, which acts as a kind of normalization and makes the next layer easier to learn
• Rectified Linear Unit (ReLU): when x is less than 0, its value is 0; when x is greater than or equal to 0, its value is x itself. Speeds up learning
• ELU takes negative values for negative inputs, so the average of its output is close to 0, which makes convergence faster than with ReLU
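The four functions can be written directly from their definitions. This is a scalar sketch. The `alpha` parameter of ELU is a hyperparameter not mentioned on the slide, and 1.0 is assumed here as a common default.

```python
import math


def sigmoid(x):
    """Maps any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Maps any real number to (-1, 1), centered on 0."""
    return math.tanh(x)


def relu(x):
    """0 for x < 0, x itself for x >= 0."""
    return x if x >= 0 else 0.0


def elu(x, alpha=1.0):
    """x for x >= 0, alpha * (exp(x) - 1) for x < 0 (alpha assumed to be 1.0)."""
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)


samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
table = {f.__name__: [round(f(x), 4) for x in samples]
         for f in (sigmoid, tanh, relu, elu)}
```

On these samples the ELU outputs average closer to zero than the ReLU outputs, because ELU keeps a bounded negative part instead of clipping to 0.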
Loss/Cost function
• Calculates the distance between the predicted value and the actual
value
• Used as the learning criterion of the optimization problem
• Common loss functions: Mean Absolute Error (MAE), Mean Square Error (MSE), and Cross Entropy
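The three losses named above can be sketched as follows, for small Python lists. The sample values are made up, and the `eps` clamp in the cross entropy is an assumption added to avoid taking the logarithm of zero.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error (L1 loss)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Square Error (L2 loss)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(p_true, p_pred, eps=1e-12):
    """Cross entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(p_true, p_pred))


y_true, y_pred = [1.0, 2.0, 3.0], [1.5, 2.0, 2.0]
l1 = mae(y_true, y_pred)   # (0.5 + 0 + 1) / 3
l2 = mse(y_true, y_pred)   # (0.25 + 0 + 1) / 3
ce = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])   # -log(0.7)
```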
Rules of Thumb for Loss Function Selection
• For CNN models that solve regression problems, choose L1 loss or L2 loss as the loss function
• For classification problems, choose from the remaining loss functions
• Cross entropy loss is the most popular choice, usually with a softmax layer at the end
• The choice of loss function in CNNs also depends on the application scenario. For example, in face recognition, contrastive loss and triplet loss have turned out to be the commonly used ones
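The softmax plus cross entropy pairing can be sketched for a single example. The logits are made-up values, and subtracting the maximum logit is a standard numerical-stability step rather than something stated on the slide.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, target_index):
    """Cross entropy against a one-hot target: -log of the target probability."""
    return -math.log(softmax(logits)[target_index])


logits = [2.0, 1.0, 0.1]   # hypothetical outputs of the last layer
probs = softmax(logits)
loss_if_class_0 = softmax_cross_entropy(logits, 0)
loss_if_class_2 = softmax_cross_entropy(logits, 2)
```

The loss is small when the true class has the largest logit and grows as the network assigns it less probability.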
Optimizer
• Training a convolutional neural network often requires optimizing non-convex functions
• Exact mathematical methods require huge computing power, so optimizers are used during training to minimize the loss function and obtain optimal network parameters within an acceptable time
• Common optimization algorithms include Momentum, RMSprop, and Adam
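As an illustration of one of these optimizers, here is a minimal sketch of gradient descent with momentum on a one-dimensional quadratic. The loss function, learning rate, momentum coefficient, and step count are all assumptions chosen so the example converges. A real CNN loss is non-convex and high-dimensional.

```python
def minimize_with_momentum(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Momentum update: v <- beta * v + grad(x);  x <- x - lr * v."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)
        x = x - lr * v
    return x


# Hypothetical loss f(x) = (x - 3)^2 with gradient 2 * (x - 3); minimum at x = 3.
x_min = minimize_with_momentum(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

The velocity term `v` accumulates past gradients, which is the idea the Momentum method adds on top of plain gradient descent.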
Applications of one-dimensional CNN
• Time Series Prediction
• Electrocardiogram (ECG) time series, weather forecasting, and traffic flow prediction
• Signal Identification
• ECG signal identification, structural damage identification, and system fault
identification
Applications of two-dimensional CNN
• Image Classification
• Medical image classification (e.g., breast cancer tissues) and traffic-scene classification
• Object Detection
• Image Segmentation
• Face Recognition
Applications of multi-dimensional CNN
• Human Action Recognition
• Object Recognition/Detection
Conclusion
• Owing to advantages such as local connections, weight sharing, and down-sampling dimensionality reduction, convolutional neural networks have been widely deployed in both research and industry projects
• First, we discussed the basic building blocks of CNNs and how to construct a CNN-based model from scratch
• Second, we presented some excellent CNN networks
• Third, we introduced activation functions, loss functions, and optimizers for CNNs
• Fourth, we discussed some typical applications of CNNs
• CNNs can be refined further in terms of model size, security, and easier hyperparameter selection. Moreover, there are many problems that convolution handles poorly, such as low generalization ability, lack of equivariance, and poor results in crowded scenes, which point to several promising directions

