Convolutional Neural Networks
And Facial Recognition
By Taylee Gray
May 16th, 2019
Towson University
MATH 490
Table of Contents
Introduction
Example Applications
Inputs, Labels, Outputs
A Description of the Model Function
Description of Stochastic Gradient Descent
Description of Backpropagation
How Pooling Affects Complexity
Demonstration
References
Introduction
The Olivetti dataset from AT&T Laboratories Cambridge is the dataset that
will help us demonstrate the importance of Convolutional Neural Networks (ConvNets or CNNs) in
facial recognition. This dataset contains ten different portraits of each person in a distinct
set of forty individuals. With this dataset we will show how effective CNNs are in
image recognition and classification.
For starters, an Artificial Neural Network (ANN) is a computational model that is
inspired by the way biological neural networks in the human brain process information
(Ujjwalkarn). When referring to a neural network, the basic unit of computation is the neuron.
These neurons are usually called nodes. A single neuron receives inputs from other nodes
and produces an output. Every input has an associated weight, used to compute the
weighted-sum output; each weight is assigned on the basis of that input's importance
relative to the other inputs.
[Figure: a single neuron. Inputs x1 and x2 (with weights w1 and w2) and a constant input 1 (with weight b) feed the output Y = f(w1x1 + w2x2 + b).]
The neural network above computes the numerical input from 𝑥1 and 𝑥2 along with the
corresponding weights as well as the input of 1 with weight 𝑏 (known as our bias) to get an
output. Of the many activation functions, the sigmoid is the most common in a NN. The sigmoid
function takes a real-valued input and fits that input into a range between zero and one. As for a
CNN, typically the ReLU function is more prominent.
Convolutional Neural Networks are a category of neural networks. These networks have
greater depth, hence the concept of deep learning. Deep neural networks are powerful algorithms
but often much harder to train than shallow ones. In 1994, one of the first ConvNets was pioneered by
Yann LeCun (Ujjwalkarn). This propelled the field of deep learning, and the network was later named LeNet5.
LeNet5's architecture consisted of three groups of layers:
● A 32×32 input layer
● Convolution and subsampling layers consisting of:
  ● 6 different 28×28 feature maps
  ● 6 different 14×14 feature maps
  ● 16 different 10×10 feature maps
  ● 16 different 5×5 feature maps
● A fully connected 2D layer: 120 outputs to 84 outputs to the final output of 10
Even without a visual, it is clear that the original input is the largest layer, which is then
decomposed by pooling and nonlinearity to extract specific features, eventually giving us the
optimal output.
Example Applications
Time and technology go hand in hand, and as they progress we are presented with both
positives and negatives. CNNs can be used for the greater good, for instance by automatically
detecting cancer in endoscopic images. Studies by Yuma Endo and other engineers at AI Medical
Service, Inc. in Japan presented a CNN that was trained using 13,584 endoscopic images of
gastric cancer (Hirasawa). To evaluate the accuracy of the CNN, they also applied it to an
independent test set of 2,296 stomach images collected from 69 consecutive patients and
containing 77 gastric cancer lesions. As a result, the CNN correctly diagnosed 92%
of its patients, so we can conclude that this could be well applicable to the medical field,
reducing the burden on endoscopists. In general, the medical field is always advancing, and what
is popular at the moment is how CNNs are used to dispense medication based on a facial scan.
Facial recognition is used worldwide in our everyday lives. It has been successful in
catching criminals by the use of surveillance cameras. Any time a person goes missing there is a
limited period of time to find them before the odds reduce significantly. As previously stated, the
use of surveillance cameras in these situations is highly effective when combined with facial
detection software built on CNNs. Apple made a huge debut with the iPhone X, which uses facial
recognition to unlock the device. Also, pinpointing terrorists by the use of facial recognition
helps minimize the possibility of attacks; the list goes on and on.
However, on the opposite end of the spectrum, is facial recognition ethical? Despite the
many positives of facial recognition, CNNs invoke plenty of criticism from legal and
ethical standpoints. Facebook currently faces a lawsuit over its own facial recognition technology,
called DeepFace, which identified people in photos without their consent. Amazon's
smart home company Ring also came under fire for the same violation of civil rights.
As of May 15th, 2019, San Francisco banned facial recognition technology, making it the
first city in the United States to have such a restriction. The disadvantage of these networks is
simply bias: people being falsely accused because they are falsely recognized. This bias
disproportionately affects people of color, who are misidentified more often. According to CBS
News, twenty-eight members of Congress were falsely matched with mugshots of criminals. The
ban will not apply to federal use due to security reasons. All in all, it is hard to draw a line
between the good and the bad when it comes to the ethics of facial recognition with CNNs.
Inputs, Labels, Outputs
Our input to the model is a total of 400 images: our dataset contains 40 people,
each with 10 pictures of them. The photos are of size 64×64 and grayscale, so we don't have
to use a three-dimensional input for our model. The images do not show large groups of people
or whole shots of a person's body; they contain just the subject's face with no background
showing. The labels for the model are the names of each person in our subject group, or more
precisely a number associated with that person.
The output layer of the model is represented by 40 nodes. Each node represents a person
and the likelihood of the input image being that person. We do this by using the softmax
activation function. The softmax function takes an input vector and normalizes it into a
probability distribution consisting of 40 probabilities. The softmax function and its derivative are given by:

$$\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^{40} e^{z_k}}, \qquad \frac{\partial \sigma_i}{\partial z_j} = \sigma_i \left( \delta_{ij} - \sigma_j \right)$$

(where $\sigma$ is the softmax output and $\delta_{ij}$ is the Kronecker delta function, which appears in the derivative used for backpropagation).
Using the softmax we can create a probability distribution across all output nodes, giving the
likelihood of the image being associated with each node. The node with the largest probability
ends up being the node that the Neural Network "selects". Therefore, it selects one out of 40
names to be the label.
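As a minimal sketch of how this selection works, here is the softmax and the final "selection" in NumPy; the 40 random scores below stand in for a real output layer:

```python
import numpy as np

def softmax(z):
    """Normalize a vector of raw scores into a probability distribution.
    Subtracting the max first is a standard trick for numerical stability."""
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z)

scores = np.random.randn(40)       # 40 raw scores, one per person in the dataset
probs = softmax(scores)
predicted_label = np.argmax(probs) # the name the network "selects"
print(probs.sum())                 # 1.0 -- a valid probability distribution
```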
A Description of the Model Function
There are four primary portions of a Convolutional Neural Network that differentiate it
from a traditional Fully Connected Neural Network: CNNs include Convolutional Layers and
Pooling Layers, typically (though not exclusively) use the ReLU activation function, use
something called Dropout to protect against overfitting, and use a Flattening Function to connect
the preprocessing to a traditional NN. The convolutional and pooling layers are introduced at the
beginning of the network so that the image being used as input does not have to be fed as-is into
the Fully Connected Layer. CNNs are useful because it is unwise to use a raw image as
the first layer of a traditional NN due to The Curse of Dimensionality, which states that as the
dimensions of your input increase, the number of data points should grow (exponentially) so
that you can "fill" the feature space adequately (Shetty, 2019).
In the case of facial recognition, an image tends to be very large, and therefore the feature space
tends to be extremely large. A 64×64 pixel image creates a 4,096-dimensional feature space, and
three times that if you have a colored image (one dimension per pixel for each color channel).
If it is not feasible to get more data points, the only option is to implement a kind of
preprocessing to decrease complexity. This is the basis of CNNs.
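To put rough numbers on this: even a modest fully connected first layer of 64 neurons on a flattened 64×64 grayscale image already needs $4096 \cdot 64 = 262{,}144$ weights, whereas a convolutional layer of 64 filters of size 3×3 needs only $64 \cdot 9 = 576$ shared weights (plus biases) while still covering the whole image.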
The first step of a CNN is its namesake, the Convolutional Layer. The purpose of
convolution is to extract features from the image while maintaining a kind of spatial relationship
between the pixels in the image. To convolve an image, a "filter" or "kernel" is used; call it $K$.
The filter is a matrix of some predetermined size $f \times f$ that is smaller than the input
image, which has size $n \times n$. The filter also has a "stride" $s$, the number of pixels
moved per step either by row or column. The filter's entries are randomly initialized numbers,
the parameters of the matrix that will later be learned via backpropagation. Lastly, there is the
number of filters used in a layer, $n_f$.
The first step of convolution is to place the top left of your filter at the top left of your
image. Call the portion of the image $I$ that overlaps $K$ by the name $P$. Then compute the sum
of the element-wise product of the two overlapping matrices. In other words:

$$(I * K)_{1,1} = \sum_{a=1}^{f} \sum_{b=1}^{f} P_{a,b} \, K_{a,b}.$$
This is the $(1,1)$ element of the convolved feature. Next, take a single stride to the right and repeat
this process. Striding right increments the column index of the convolved feature. When your filter hits
the end of your input image, reset the filter to the beginning of the row and take a stride down,
incrementing the row index of the convolved feature. Continue until the bottom of the image is
reached. If you have a three-dimensional feature set (a colored image), repeat this
process for each slice of the third dimension.
This process results in a convolved feature of size:

$$\left( \frac{n-f}{s} + 1 \right) \times \left( \frac{n-f}{s} + 1 \right) \times n_f.$$
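To make the sliding-window arithmetic concrete, here is a minimal NumPy sketch of a single-filter, single-channel convolution; the 64×64 input and 3×3 filter are illustrative choices, not values fixed by the text up to this point:

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide an f x f kernel over an n x n image, taking the sum of the
    element-wise product at each position (no padding)."""
    n, f = image.shape[0], kernel.shape[0]
    out_size = (n - f) // stride + 1            # the (n - f)/s + 1 formula above
    out = np.zeros((out_size, out_size))
    for i in range(out_size):                   # striding down: row index
        for j in range(out_size):               # striding right: column index
            patch = image[i*stride:i*stride+f, j*stride:j*stride+f]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(64, 64)    # one grayscale Olivetti-sized image
kernel = np.random.randn(3, 3)    # randomly initialized filter parameters
feature = convolve2d(image, kernel)
print(feature.shape)              # (62, 62) = ((64 - 3)/1 + 1, (64 - 3)/1 + 1)
```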
After a convolutional layer, it is convention to use the Rectified Linear Unit (or ReLU) as
our activation function. ReLU is defined as:

$$\mathrm{ReLU}(x) = \max(0, x).$$
Previously, we had been using the Sigmoid function as our activation function. The reason
ReLU is far more commonly used is that Sigmoid suffers from something called the vanishing
gradient.
Because $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$,
it can be shown that $0 < \sigma'(x) \le \frac{1}{4}$.
Repeatedly multiplying small numbers over multiple layers will result in a smaller and smaller
gradient that eventually yields no change to your model; after only ten sigmoid layers, for
example, the chained gradient is already bounded by $(1/4)^{10} \approx 10^{-6}$. On the other
hand, using ReLU, the gradient is zero if $x < 0$ and one if $x > 0$. (Because ReLU is not
differentiable at $x = 0$, the derivative there is taken to be zero in practice.) Because of this,
successive gradients of the ReLU
neither explode nor vanish, so it makes a good activation function for a model with many layers
like a CNN.
The Pooling Layer acts similarly to the Convolutional Layer, in that it reduces the
dimensionality of the input while trying to maintain the important information in the feature
map. By reducing the dimensionality of the input, the network is more easily computable, better
protected against overfitting, and less likely to be affected by tiny transformations and
distortions of the input.
Like the convolutional filter, pooling has a filter as well, with a size $f \times f$ and a stride
$s$, but unlike a convolutional filter it does not have parameters to be changed by
backpropagation. Instead, it uses a fixed function to get its output number. The most popular and
effective option is max-pooling, where each element of the output matrix is the largest number
inside the pooling filter's window. Sum and average functions also exist and can be used.
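A hedged NumPy sketch of max-pooling, structured like the convolution sketch above; note there are no learned parameters, only a fixed function applied to each window:

```python
import numpy as np

def max_pool2d(feature, size=2, stride=2):
    """Replace each size x size window with its largest value."""
    n = feature.shape[0]
    out_size = (n - size) // stride + 1
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            window = feature[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = np.max(window)  # swap in np.sum or np.mean for sum/average pooling
    return out

feature = np.random.rand(62, 62)  # e.g. the output of a convolutional layer
pooled = max_pool2d(feature)
print(pooled.shape)               # (31, 31): each spatial dimension is halved
```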
While pooling can help fight overfitting, it is not always successful on its own. That is where we
can implement dropout. Dropout is a "regularization method that approximates training a large
number of neural networks with different architectures in parallel" (Brownlee, 2019). To do this,
while training the network a random selection of neurons is ignored (dropped out), meaning all
of their incoming and outgoing connections are ignored. This introduces noise and error into the
training process, which helps keep any one neuron from acquiring weights that are too large; very
large weights are a sign that your data has been overfit. By thinning out the network using
dropout, more neurons may be required to maintain the same number of activated neurons in the
network. This process keeps our data from being overfitted.
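A minimal sketch of dropout as a random NumPy mask; the rescaling by 1/(1 − rate), known as "inverted dropout", is one common convention for keeping expected activations constant, and the layer size here is illustrative:

```python
import numpy as np

def dropout(activations, rate=0.2, training=True):
    """Randomly zero out a fraction `rate` of neurons during training.
    Scaling the survivors keeps the expected activation the same,
    so nothing needs to change at test time."""
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) >= rate)
    return activations * mask / (1.0 - rate)

layer_output = np.random.rand(64)
thinned = dropout(layer_output, rate=0.2)  # roughly 20% of entries are now zero
```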
The last step of a CNN is the flattening. Flattening simply takes the output of
our convolution and pooling layers and transforms it to be taken in as a vector by our fully
connected layer. This can be done by reading the matrix left-to-right, top-to-bottom and
filling out a one-dimensional vector element-by-element. The result is a vector of
length $h \cdot w \cdot n_f$ (height times width times the number of feature maps) that can be
used as the input of our Fully Connected Layer. Once we have this input vector we can operate
our NN as a typical multi-layer perceptron. We assume you have a background in these kinds of
NNs, so we won't go into detail on their structure here.
Description of Stochastic Gradient Descent
In the CNN process, the gradient descent optimization algorithm aims to minimize some
cost/loss function based on that function's gradient. Successive iterations are employed to
progressively approach either a local or global minimum of the cost function.
Recall the gradient of a function $f(\mathbf{w})$ is given by:

$$\nabla f(\mathbf{w}) = \left( \frac{\partial f}{\partial w_1}, \frac{\partial f}{\partial w_2}, \ldots, \frac{\partial f}{\partial w_n} \right)$$

The gradient can be intuitively thought of as the path of descent a ball would take while
rolling down a hill. However, using neural networks we don't have access to this function
otherwise we would be done! There would be no reason to train a model. Instead we have
access to a loss function which can help us approximate this.
Our loss function is described as the average of squared differences between what your model gave
you and what your model should give you:

$$L(\mathbf{w}) = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2$$

(This is the Mean Squared Error, or MSE.)
Using our loss function we can approximate GD using a subset of our training data. This
is the difference between Stochastic Gradient Descent and Gradient Descent.
The iterative process for SGD is as follows:
● Start at some randomly selected vector $\mathbf{w}_0$.
● For each step $t$, have some process to generate $S_t$, a subset of our training set.
● Then compute $\nabla L_{S_t}(\mathbf{w}_t)$ so that we can use the formula to update our vector:
$$\mathbf{w}_{t+1} = \mathbf{w}_t - \eta \, \nabla L_{S_t}(\mathbf{w}_t)$$
● Here $\eta$ is our predetermined learning rate.
Repeating this process, we approach a local minimum of our loss function. Our hope is
that this local minimum is actually the global minimum, so that our model is fully
optimized, but this is not guaranteed. Also, because SGD applies only a subset of the training
data at each step, each iteration is cheaper than in GD, so it gets near the minimum much faster
but probably won't converge to it exactly. Instead, it oscillates around the minimum, but the
approximation is good enough (Ng).
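To make the update rule concrete, here is a small NumPy sketch of SGD on a linear model under the MSE loss; the data, batch size, and learning rate are illustrative placeholders:

```python
import numpy as np

def sgd_step(w, X_batch, y_batch, lr=0.01):
    """One SGD update: the gradient of the MSE is computed on a random
    mini-batch S_t, not on the full dataset."""
    preds = X_batch @ w
    grad = (2.0 / len(y_batch)) * X_batch.T @ (preds - y_batch)  # d(MSE)/dw
    return w - lr * grad                                          # w_{t+1} = w_t - eta * grad

rng = np.random.default_rng(0)
X, y = rng.normal(size=(400, 5)), rng.normal(size=400)
w = rng.normal(size=5)                                 # randomly selected starting vector w_0
for t in range(1000):
    idx = rng.choice(len(y), size=10, replace=False)   # the mini-batch S_t
    w = sgd_step(w, X[idx], y[idx])
```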
Description of Backpropagation
Backpropagation is how a neural network learns: the network looks at how much error or cost
was calculated and tries to minimize that number. For CNNs we look at the error between
each of our layers and decide whether or not something matches in that layer and, if so, how well
it matches. From there the neural network decides what or who it is looking at. For example, the
cost of a single neuron in a network can be given as:

$$\text{Cost} = C(R(Z(X, W)))$$

where $C$ is the cost function:

$$C = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2,$$

$R$ is the ReLU, $Z$ is our weighted input ($Z = XW$), and $W$ is our weight. Taking the
partial derivative of our error function with respect to the weight we are looking at will show us
how much that weight contributes to the error, and to do so we need to expand our formula using the
chain rule:

$$\frac{\partial C}{\partial W} = \frac{\partial C}{\partial R} \cdot \frac{\partial R}{\partial Z} \cdot \frac{\partial Z}{\partial W}$$
These partial derivatives are used to check each parameter and how that parameter
contributes to the total change in error. Furthermore, as we go through multiple layers our cost
function has more and more inputs, and the expansion can get rather lengthy and
cumbersome. However, as we work through more weights we reuse the previously calculated
values when applying the chain rule, so we have the program remember those values rather than
recalculate them.
With this information gathered we can now find the actual error of each layer and see how
it impacts our final layer, the result. That is, we take the derivative of our cost function with
respect to the output layer ($Z_o$) and hidden layers ($Z_h$):

$$E_o = (\hat{y} - y) \cdot R'(Z_o), \qquad E_h = E_o \cdot W_o \cdot R'(Z_h)$$
We can see the hidden layer error is equivalent to the output layer error times our output
weights times the ReLU derivative of our hidden layer. This is the general process of
backpropagation: we keep shuffling our weighted error back to the previous layer and continue
to refine our output until we arrive at an error small enough for the machine to make an
informed decision on what the image is and put a name to it. How the machine decides to adjust
the weights of each input is derived from the SGD above; depending on whether a weight
contributes more or less error, it is adjusted accordingly to give us the lowest error possible.
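A compact NumPy sketch of these error formulas for a single hidden layer; the layer sizes are arbitrary, and reusing delta_o when forming delta_h is exactly the "remember previous values" point made above:

```python
import numpy as np

def relu(z):       return np.maximum(0.0, z)
def relu_prime(z): return (z > 0).astype(float)

# Forward pass through one hidden layer and one output layer
x  = np.random.rand(16)                       # flattened input features
Wh = np.random.randn(8, 16) * 0.1             # hidden weights
Wo = np.random.randn(1, 8) * 0.1              # output weights
Zh = Wh @ x;  Ah = relu(Zh)                   # hidden weighted input / activation
Zo = Wo @ Ah; y_hat = relu(Zo)                # network output
y  = np.array([1.0])                          # target

# Backward pass: the layer errors from the formulas above
delta_o = (y_hat - y) * relu_prime(Zo)        # E_o, the output layer error
delta_h = (Wo.T @ delta_o) * relu_prime(Zh)   # E_h = E_o * W_o * R'(Z_h)
dWo = np.outer(delta_o, Ah)                   # dC/dWo -- reuses delta_o, no recomputation
dWh = np.outer(delta_h, x)                    # dC/dWh
```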
How Pooling Affects Complexity
Pooling is an important layer in Convolutional Neural Networks because it decreases the
complexity of your model. By down-sampling our input layer by layer we can protect against
overfitting to minor variations in our data that would otherwise look like a completely new
image to a naive network. We also decrease the dimensionality of the network, which improves
manageability and runtime.
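As a rough worked example of the savings: a single 2×2 max-pool with stride 2 halves each spatial dimension, so a 64×64 feature map ($4096$ values) shrinks to 32×32 ($1024$ values), a 75% reduction, and a second such layer leaves only $16 \times 16 = 256$ values, one sixteenth of the original.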
There are several different kinds of functions that can be used in your Pooling Layer, and they
all perform differently: there are Sum, Average, and Max Pooling functions, and in a network it
isn't uncommon to use a combination of these. Sum and Average have similar results, just scaled
differently, so the comparison to be had is between Average and Max. Max pooling takes the
largest value over the filter while Average takes the average of all values under it. The
intuition here is that Max takes the "most important" feature, while Average takes into
consideration all of the values within the filter. According to Karpathy in Convolutional
Neural Networks for Visual Recognition, the most effective pooling function is Max Pooling due
to its ability to extract the valuable information from the image while not being affected as
much by small variations.
However, if a Pooling Layer has the same structure as a Convolutional Layer minus the
parameters and with a predetermined function added, would it be feasible to replace it with another
Convolutional Layer in an attempt to get the best of both worlds? The Convolutional Layer can
still down-sample the input features, but it can use SGD to learn filters that would hopefully
extract more information from the image. In Springenberg et al.'s Striving for Simplicity: The All
Convolutional Net, they explored this possibility in hopes of finding a simpler CNN
model. They used the CIFAR-10 dataset (Krizhevsky & Hinton, 2009) to study different
models.
The first model used was the control model, your typical CNN with Pooling Layers.
The second model was an All-CNN with an incremented stride on the Convolutional Layers
that precede the removed Pooling Layers. It is important to increment the stride because the next
layer should be accepting an input covering the same spatial region as it did in the
control model. The final model replaced the Pooling Layer with a Convolutional Layer. After
comparing the models, the results showed an increase in accuracy with Pooling Layers replaced,
as well as a lower loss.
Our expectation was that a network with no Pooling Layers would perform poorly due to
overfitting of the data, which is the main advantage of using Pooling Layers in the first place. In
order to test Springenberg's conclusions we ran tests of our own on a relatively simply structured
network with only modest anti-overfitting countermeasures to see if these All
Convolutional Networks are prone to overfitting.
Demonstration
To demonstrate the effect of no pooling on a Convolutional Neural Network we started
by first building a typical CNN. We built the CNN using Keras, a high-level Python
library built on Google's TensorFlow for machine learning. Our CNN started with ten pictures
each of forty different people from AT&T's Olivetti dataset. It is important to note that the
images we used had been preprocessed to contain only the face of the person being photographed.
We had tried using the Labeled Faces in the Wild dataset and found that the images were not
close enough to the subjects' faces and included too much of the background to create a
successful model. There are ways to process those faces into usable inputs, but we opted to
change datasets for simplicity.
The structure of our CNN was as follows (a Keras sketch of it appears after the list):
● Convolutional Layer of size 64, with filter size 3 x 3, stride of 1, with the ReLU
activation function
● Max Pooling Layer with filter size 2 x 2, stride of 2
● Another Convolutional Layer
● Another Max Pooling Layer
● A 10% Dropout Layer
● The Flattening Function
● A length 64 vector for the first layer of our Fully Connected Network
● A 20% Dropout Layer
● Our output layer with 40 nodes using the softmax activation function.
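Here is a sketch of this structure in Keras. The layer names and arguments follow the keras.layers API, but details the text does not pin down, such as the size of the second convolutional layer and the activation on the dense layer, are assumptions:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential([
    # Convolutional layer: 64 filters of size 3x3, stride 1, ReLU activation
    Conv2D(64, (3, 3), strides=1, activation='relu', input_shape=(64, 64, 1)),
    # Max pooling: 2x2 filter, stride 2
    MaxPooling2D(pool_size=(2, 2), strides=2),
    # "Another Convolutional Layer" -- size assumed to match the first
    Conv2D(64, (3, 3), strides=1, activation='relu'),
    # "Another Max Pooling Layer"
    MaxPooling2D(pool_size=(2, 2), strides=2),
    Dropout(0.10),                    # 10% dropout
    Flatten(),                        # the flattening function
    Dense(64, activation='relu'),     # length-64 first fully connected layer
    Dropout(0.20),                    # 20% dropout
    Dense(40, activation='softmax'),  # 40 output nodes, one per person
])
```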
After we make the structure of the network, the next steps are to define the optimizer and
its parameters. As we mentioned before, our optimizer was Stochastic Gradient Descent with a
fixed learning rate, as well as a few other tweaking parameters (momentum=1, decay=0.05). After
defining the input, structure, and optimizer, the last thing to do is run the network. We used a
batch size of 10 and ran through around 100 epochs each time.
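A hedged sketch of the training call, continuing from the model above: the learning rate value is not reproduced in the text, so the lr here is a placeholder, and x_train/y_train stand in for the preprocessed Olivetti images and their one-hot labels:

```python
from keras.optimizers import SGD

# momentum and decay are as stated in the text; lr=0.01 is only a placeholder
opt = SGD(lr=0.01, momentum=1.0, decay=0.05)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',  # assumed: the usual pairing with softmax
              metrics=['accuracy'])

# x_train: (n, 64, 64, 1) grayscale faces; y_train: one-hot labels over 40 people
model.fit(x_train, y_train, batch_size=10, epochs=100,
          validation_data=(x_test, y_test))
```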
Here are the graphs of accuracy and loss for our baseline CNN using SGD and Max
Pooling Layers. After 100 epochs we earned a validation accuracy of almost 80% and a validation
loss of near 1. By looking at the graphs we can see that the scores on training and testing data
are fairly close, which is a good indication that we have not overfit our data.
The next step was to remove our pooling layers and see how the model fared. In order to
remove the pooling layers there are a few changes you have to make to maintain the model's
integrity. According to Springenberg, the way to convert your traditional CNN to an all-
convolutional one is to first change the Pooling Layer into a Convolutional Layer using the same
size and stride the Pooling Layer had. Next, and importantly, you have to increase the stride of
the Convolutional Layer preceding the original Pooling Layer by one so the network can
maintain the same feature size. Here were the results after 200 epochs.
Our training data without pooling did perform better than with pooling, with an accuracy
of almost 90% and a loss below 0.5. However, our testing data did not do well at all,
with testing loss rising to over 2.0 at some points. This is a clear case of overfitting our data,
which makes sense because part of the reason pooling is used is to prevent overfitting. While our
loss did fare worse, there were some advantages to using a CNN with no pooling. While the
model did learn more slowly, it ran drastically faster: the CNN with pooling took
around 4 seconds per epoch while the CNN without pooling took under half that time. For this
reason we believe that if you used more aggressive anti-overfitting techniques, like a higher
dropout percentage, then you might see some real benefit to using an All Convolutional
Network.
References
Karpathy, A. "Convolutional Neural Networks for Visual Recognition." CS231n
Convolutional Neural Networks for Visual Recognition, cs231n.github.io/convolutional-
networks/.
"Backpropagation." ML Cheatsheet Documentation,
ml-cheatsheet.readthedocs.io/en/latest/backpropagation.html.
Brownlee, Jason. "A Gentle Introduction to Dropout for Regularizing Deep Neural Networks."
21 Apr. 2019, machinelearningmastery.com/dropout-for-regularizing-deep-neural-
networks/. Accessed 13 May 2019.
CBS/AP. "San Francisco Bans Facial Recognition Technology." CBS News, CBS Interactive,
15 May 2019, www.cbsnews.com/news/san-francisco-becomes-first-us-city-to-ban-facial-
recognition-technology-today-2019-05-14/.
Hirasawa, Toshiaki, et al. “Application of Artificial Intelligence Using a Convolutional
Neural Network for Detecting Gastric Cancer in Endoscopic Images.” SpringerLink,
Springer Japan, 15 Jan. 2018, link.springer.com/article/10.1007/s10120-018-0793-2.
Krizhevsky, A., and G. Hinton. Learning Multiple Layers of Features from Tiny Images. 2009.
Ng, Andrew. Supervised Learning. Stanford University, cs229.stanford.edu/notes/cs229-
notes1.pdf.
Shetty, Badreesh. "Curse of Dimensionality." 15 Jan. 2019,
towardsdatascience.com/curse-of-dimensionality-2092410f3d27. Accessed 13 May 2019.
Springenberg, Jost Tobias, et al. Striving for Simplicity: The All Convolutional Net. Department
of Computer Science, University of Freiburg, 2015, arxiv.org/pdf/1412.6806.pdf.
Suárez-Paniagua, Víctor. “Evaluation of Pooling Operations in Convolutional Architectures
for Drug-Drug Interaction Extraction.” BMC Bioinformatics, BioMed Central, 13 June
2018, www.ncbi.nlm.nih.gov/pmc/articles/PMC5998766/.
Ujjwalkarn. “A Quick Introduction to Neural Networks.” The Data Science Blog, 10 Aug.
2016, ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/.
Weitere ähnliche Inhalte

Was ist angesagt?

Passive Image Forensic Method to Detect Resampling Forgery in Digital Images
Passive Image Forensic Method to Detect Resampling Forgery in Digital ImagesPassive Image Forensic Method to Detect Resampling Forgery in Digital Images
Passive Image Forensic Method to Detect Resampling Forgery in Digital Imagesiosrjce
 
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...IRJET Journal
 
Gender Classification using SVM With Flask
Gender Classification using SVM With FlaskGender Classification using SVM With Flask
Gender Classification using SVM With FlaskAI Publications
 
let's dive to deep learning
let's dive to deep learninglet's dive to deep learning
let's dive to deep learningMohamed Essam
 
Speech Processing with deep learning
Speech Processing  with deep learningSpeech Processing  with deep learning
Speech Processing with deep learningMohamed Essam
 
Efficient mobilenet architecture_as_image_recognit
Efficient mobilenet architecture_as_image_recognitEfficient mobilenet architecture_as_image_recognit
Efficient mobilenet architecture_as_image_recognitEL Mehdi RAOUHI
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural NetworkVignesh Suresh
 
IMAGE COMPOSITE DETECTION USING CUSTOMIZED
IMAGE COMPOSITE DETECTION USING CUSTOMIZEDIMAGE COMPOSITE DETECTION USING CUSTOMIZED
IMAGE COMPOSITE DETECTION USING CUSTOMIZEDijcga
 
Gesture Recognition Review: A Survey of Various Gesture Recognition Algorithms
Gesture Recognition Review: A Survey of Various Gesture Recognition AlgorithmsGesture Recognition Review: A Survey of Various Gesture Recognition Algorithms
Gesture Recognition Review: A Survey of Various Gesture Recognition AlgorithmsIJRES Journal
 
Facial Image Analysis for age and gender and
Facial Image Analysis for age and gender andFacial Image Analysis for age and gender and
Facial Image Analysis for age and gender andYuheng Wang
 
Movie Sentiment Analysis using Deep Learning RNN
Movie Sentiment Analysis using Deep Learning RNNMovie Sentiment Analysis using Deep Learning RNN
Movie Sentiment Analysis using Deep Learning RNNijtsrd
 
IRJET- Machine Learning based Object Identification System using Python
IRJET- Machine Learning based Object Identification System using PythonIRJET- Machine Learning based Object Identification System using Python
IRJET- Machine Learning based Object Identification System using PythonIRJET Journal
 
Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysisNEHA Kapoor
 
Video surveillance Moving object detection& tracking Chapter 1
Video surveillance Moving object detection& tracking Chapter 1 Video surveillance Moving object detection& tracking Chapter 1
Video surveillance Moving object detection& tracking Chapter 1 ahmed mokhtar
 
HUMAN MOTION DETECTION AND TRACKING FOR VIDEO SURVEILLANCE
HUMAN MOTION  DETECTION AND TRACKING FOR VIDEO SURVEILLANCEHUMAN MOTION  DETECTION AND TRACKING FOR VIDEO SURVEILLANCE
HUMAN MOTION DETECTION AND TRACKING FOR VIDEO SURVEILLANCENEHA THADEUS
 

Was ist angesagt? (20)

Passive Image Forensic Method to Detect Resampling Forgery in Digital Images
Passive Image Forensic Method to Detect Resampling Forgery in Digital ImagesPassive Image Forensic Method to Detect Resampling Forgery in Digital Images
Passive Image Forensic Method to Detect Resampling Forgery in Digital Images
 
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ...
 
Gender Classification using SVM With Flask
Gender Classification using SVM With FlaskGender Classification using SVM With Flask
Gender Classification using SVM With Flask
 
let's dive to deep learning
let's dive to deep learninglet's dive to deep learning
let's dive to deep learning
 
Speech Processing with deep learning
Speech Processing  with deep learningSpeech Processing  with deep learning
Speech Processing with deep learning
 
Image recognition
Image recognitionImage recognition
Image recognition
 
Object tracking
Object trackingObject tracking
Object tracking
 
Efficient mobilenet architecture_as_image_recognit
Efficient mobilenet architecture_as_image_recognitEfficient mobilenet architecture_as_image_recognit
Efficient mobilenet architecture_as_image_recognit
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural Network
 
IMAGE COMPOSITE DETECTION USING CUSTOMIZED
IMAGE COMPOSITE DETECTION USING CUSTOMIZEDIMAGE COMPOSITE DETECTION USING CUSTOMIZED
IMAGE COMPOSITE DETECTION USING CUSTOMIZED
 
Gesture Recognition Review: A Survey of Various Gesture Recognition Algorithms
Gesture Recognition Review: A Survey of Various Gesture Recognition AlgorithmsGesture Recognition Review: A Survey of Various Gesture Recognition Algorithms
Gesture Recognition Review: A Survey of Various Gesture Recognition Algorithms
 
Facial Image Analysis for age and gender and
Facial Image Analysis for age and gender andFacial Image Analysis for age and gender and
Facial Image Analysis for age and gender and
 
Object Recognition
Object RecognitionObject Recognition
Object Recognition
 
Movie Sentiment Analysis using Deep Learning RNN
Movie Sentiment Analysis using Deep Learning RNNMovie Sentiment Analysis using Deep Learning RNN
Movie Sentiment Analysis using Deep Learning RNN
 
IRJET- Machine Learning based Object Identification System using Python
IRJET- Machine Learning based Object Identification System using PythonIRJET- Machine Learning based Object Identification System using Python
IRJET- Machine Learning based Object Identification System using Python
 
deep learning
deep learningdeep learning
deep learning
 
Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysis
 
Video surveillance Moving object detection& tracking Chapter 1
Video surveillance Moving object detection& tracking Chapter 1 Video surveillance Moving object detection& tracking Chapter 1
Video surveillance Moving object detection& tracking Chapter 1
 
HUMAN MOTION DETECTION AND TRACKING FOR VIDEO SURVEILLANCE
HUMAN MOTION  DETECTION AND TRACKING FOR VIDEO SURVEILLANCEHUMAN MOTION  DETECTION AND TRACKING FOR VIDEO SURVEILLANCE
HUMAN MOTION DETECTION AND TRACKING FOR VIDEO SURVEILLANCE
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
 

Ähnlich wie Convolutional Neural Networks

CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION AIRCC Publishing Corporation
 
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION ijcsit
 
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHM
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHMPADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHM
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHMIRJET Journal
 
Deep Neural Network DNN.docx
Deep Neural Network DNN.docxDeep Neural Network DNN.docx
Deep Neural Network DNN.docxjaffarbikat
 
Lung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfLung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfjagan477830
 
Image Classification And Skin cancer detection
Image Classification And Skin cancer detectionImage Classification And Skin cancer detection
Image Classification And Skin cancer detectionEman Othman
 
Deep learning for pose-invariant face detection in unconstrained environment
Deep learning for pose-invariant face detection in unconstrained environmentDeep learning for pose-invariant face detection in unconstrained environment
Deep learning for pose-invariant face detection in unconstrained environmentIJECEIAES
 
IRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET Journal
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classificationijtsrd
 
Face Recognition Based Intelligent Door Control System
Face Recognition Based Intelligent Door Control SystemFace Recognition Based Intelligent Door Control System
Face Recognition Based Intelligent Door Control Systemijtsrd
 
IRJET- Automated Detection of Gender from Face Images
IRJET-  	  Automated Detection of Gender from Face ImagesIRJET-  	  Automated Detection of Gender from Face Images
IRJET- Automated Detection of Gender from Face ImagesIRJET Journal
 
Convolutional Neural Network and Its Applications
Convolutional Neural Network and Its ApplicationsConvolutional Neural Network and Its Applications
Convolutional Neural Network and Its ApplicationsKasun Chinthaka Piyarathna
 
Let_s_Dive_to_Deep_Learning.pptx
Let_s_Dive_to_Deep_Learning.pptxLet_s_Dive_to_Deep_Learning.pptx
Let_s_Dive_to_Deep_Learning.pptxMohamed Essam
 
Brain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptxBrain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptxvikyt2211
 
Designing a neural network architecture for image recognition
Designing a neural network architecture for image recognitionDesigning a neural network architecture for image recognition
Designing a neural network architecture for image recognitionShandukaniVhulondo
 

Ähnlich wie Convolutional Neural Networks (20)

asd.pptx
asd.pptxasd.pptx
asd.pptx
 
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
 
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
CONVOLUTIONAL NEURAL NETWORK BASED FEATURE EXTRACTION FOR IRIS RECOGNITION
 
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHM
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHMPADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHM
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHM
 
Deep Neural Network DNN.docx
Deep Neural Network DNN.docxDeep Neural Network DNN.docx
Deep Neural Network DNN.docx
 
Lung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdfLung Cancer Detection using transfer learning.pptx.pdf
Lung Cancer Detection using transfer learning.pptx.pdf
 
Image Classification And Skin cancer detection
Image Classification And Skin cancer detectionImage Classification And Skin cancer detection
Image Classification And Skin cancer detection
 
Deep learning for pose-invariant face detection in unconstrained environment
Deep learning for pose-invariant face detection in unconstrained environmentDeep learning for pose-invariant face detection in unconstrained environment
Deep learning for pose-invariant face detection in unconstrained environment
 
IRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural Network
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classification
 
Cnn
CnnCnn
Cnn
 
Deep learning and computer vision
Deep learning and computer visionDeep learning and computer vision
Deep learning and computer vision
 
Face Recognition Based Intelligent Door Control System
Face Recognition Based Intelligent Door Control SystemFace Recognition Based Intelligent Door Control System
Face Recognition Based Intelligent Door Control System
 
IRJET- Automated Detection of Gender from Face Images
IRJET-  	  Automated Detection of Gender from Face ImagesIRJET-  	  Automated Detection of Gender from Face Images
IRJET- Automated Detection of Gender from Face Images
 
Convolutional Neural Network and Its Applications
Convolutional Neural Network and Its ApplicationsConvolutional Neural Network and Its Applications
Convolutional Neural Network and Its Applications
 
DL.pdf
DL.pdfDL.pdf
DL.pdf
 
Let_s_Dive_to_Deep_Learning.pptx
Let_s_Dive_to_Deep_Learning.pptxLet_s_Dive_to_Deep_Learning.pptx
Let_s_Dive_to_Deep_Learning.pptx
 
Som paper1.doc
Som paper1.docSom paper1.doc
Som paper1.doc
 
Brain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptxBrain Tumor Detection Using Deep Learning ppt new made.pptx
Brain Tumor Detection Using Deep Learning ppt new made.pptx
 
Designing a neural network architecture for image recognition
Designing a neural network architecture for image recognitionDesigning a neural network architecture for image recognition
Designing a neural network architecture for image recognition
 

Kürzlich hochgeladen

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Convolutional Neural Networks

  • 1. 1 Convolutional Neural Networks And Facial Recognition By Taylee Gray May 16th, 2019 Towson University MATH 490
  • 2. 2 Table of Contents Table of Contents 2 Introduction 3 Example Applications 4 Inputs, Labels, Outputs 5 A Description of the Model Function 5 Description of Stochastic Gradient Descent 7 Description of Backpropagation 8 How Pooling Affects Complexity 9 Demonstration 10 References 12
  • 3. 3 Introduction The Olivetti dataset from AT&T Laboratories of Cambridge University is the dataset that will help us conclude the importance of Convolutional Neural Networks (ConvNets or CNNs) in facial recognition. This dataset portrays ten different portraits of the same person with a distinct set of forty individuals. With this dataset we will be showing how CNNs are most effective in image recognition and classification. For starters, an Artificial Neural Network (ANN) is a computational model that is inspired by the way biological neural networks in the human brain process information (Ujjwalkarn). When referring to a neural network, the basic unit of computation is the neuron. These neurons are usually called nodes. A single neuron receives input from other nodes to which the neural network then creates an output. All inputs have an associated weight to compute the weighted sum output. Determining the weights usually depends on the relative importance to other inputs on the assigned basis. Input 1→ →Output Input 2→ The neural network above computes the numerical input from 𝑥1 and 𝑥2 along with the corresponding weights as well as the input of 1 with weight 𝑏 (known as our bias) to get an output. Of the many activation functions, the sigmoid is the most common in a NN. The sigmoid function takes a real-valued input and fits that input into a range between zero and one. As for a CNN, typically the ReLU function is more prominent. Convolutional Neural Networks are a category of neural networks. These networks get more in depth hence the concept of deep learning. Deep neural networks are powerful algorithms often much harder to train than shallow. In 1994, one of the first ConvNets was pioneered by Yann LeCun (Ujjwalkarn). This propelled the field of deep learning and was later named LeNet5. LeNet5 was architecture by a sequence of three layers: ● 32𝑥32 input layer Convolutions and subsampling layers consisting of: ● 6 different 28𝑥28 feature maps ● 6 different 14𝑥14 feature maps ● 16 different 10𝑥10 feature maps ● 16 different 5𝑥5 feature maps Fully connected 2D layer ● 120 outputs to 84 outputs to the final output of 10 f(w1x1+w2x2+b) 1 x1 x2 Y b w 1 w 2
  • 4. 4 Without a visual it is still clear to see that the original input is the largest which is then decomposed by pooling and nonlinearity to extract specific features eventually giving us the optimal output. Example Applications Time and technology go hand in hand and as they progress we are thrown with both positives and negatives. CNNs can be used for the greater good for instance, by automatically detecting cancer in endoscopic images. Studies by Yuma Endo and other engineers at AI Medical Service, Inc. in Japan presented a CNN that was trained using 13,584 endoscopic images of gastric cancer (Hirasawa). In order to improve the accuracy of the CNN they also trained an independent test set of 2296 stomach images. From these images, 77 gastric cancer lesions were applied to the CNN from 69 consecutive patients. As a result, the CNN correctly diagnosed 92% of its patients so we can conclude that this could be well applicable to the medical field reducing the burden of endoscopists. In general, the medical field is always advancing and what is popular at the moment is how CNNs are used to dispense medication based on a facial scan, Facial recognition is used worldwide in our everyday lives. It has been successful in catching criminals by the use of surveillance cameras. Any time a person goes missing there is a limited period of time to find them before the odds reduce significantly. As previously stated, the use of surveillance cameras in these situations are positively effective when combined with facial detection software like CNNs. Apple made its huge debut with the iPhone X having facial recognition unlock the device itself. Also, pinpointing terrorists by the use of facial recognition helps minimize the possibility of attacks; the list goes on and on. However, on the opposite spectrum of things, is facial recognition ethical? Despite the many positives of facial recognition, CNNs invoke plenty criticism regarding the legality and ethic standpoint. Facebook currently faces a lawsuit over its own facial recognition technology called, DeepFace. This technology identified people in photos without their consent. Amazon's smart home company Ring also came under fire for the same violation of civil rights. As of May 15th, 2019, San Francisco banned facial recognition technology making it the first city in the United States to have such a restriction. The disadvantage of these networks is simply the bias, bias referring to people being falsely accused and recognized. This bias also references to people of color and inaccurately exposes them. According to CBS News, twenty- eight members of congress falsely matched up with mugshots of criminals. The ban will not apply to federal use due to security reasons. All in all, it is hard to depict a line between the good and bad in ethics when it comes to facial recognition in CNNs.
  • 5. 5 Inputs, Labels, Outputs Our input for our model will be a total of 400 images. There are 40 people in our dataset each with 10 pictures of them. The photos are of size 64x64 and grayscale so that we don’t have to use a three dimensional input for our model. The images do not show large groups of people or whole shots of a person’s body. They just contain the subject’s face with no background showing. The labels for the model are the names of each person in our subject group, or more precisely a number associated with that person. The output layer of the model is represented by 40 nodes. Each node represents a person and their likelihood of the input image being that person. We do this by using the softmax activation function. The softmax function takes an input vector and normalizes it into a probability distribution consisting of 40 probabilities. The softmax function is given by: (Where is the sigmoid and is the kronecker delta function) Using the softmax we can create a probability distribution across all output nodes of the likelihood of the image being associated with it. The node with the largest probability ends up being the node that the Neural Network “selects”. Therefore, it selects one out of 40 names to be the label. A Description of the Model Function There are four primary portions of a Convolutional Neural Network that differentiates it from a traditional Fully Connected Neural Network. CNNs include Convolutional Layers, and Pooling Layers. These layers are introduced in the beginning of the network so that the image that is being used as input does not have to be used as is in the Fully Connected Layer. Typically (though not exclusively) CNNs use the ReLU activation function. They also use something called Dropout to protect from overfitting, and a Flattening Function to connect the preprocessing to a traditional NN. CNNs are useful because it is unwise to use a raw image as the first layer of a traditional NN due to The Curse of Dimensionality. The Curse of Dimensionality states that as the dimensions of your input increases, so should the number of data points (exponentially) that way you can “fill” the feature space adequately (Shetty, 2019). In the case of facial recognition an image tends to be very large, and therefore the feature space tends to be extremely large. A pixel image creates a dimension space and three times that if you have a colored image (one dimension for each color channel in the image). If it is not feasible to get more data points the only option is to implement a kind of preprocessing to decrease complexity. This is the basis of CNNs.
The first step of a CNN is its namesake, the Convolutional Layer. The purpose of convolution is to extract features from the image while maintaining a kind of spatial relationship between the pixels in the image. To convolve an image, a "filter" or "kernel" is used. Call it $K$. The filter is a matrix of some predetermined size $f \times f$ that is smaller than the input image, which has size $n \times n$. The filter also has a "stride" $s$, the number of pixels moved per step, either by row or by column. The filter's entries are randomly initialized numbers that will later be learned via backpropagation. Lastly, there is the number of filters used in a layer, $n_f$. The first step of convolution is to place the top left of your filter at the top left of your image $I$. Call the portion of $I$ that $K$ overlaps $P$. Then compute the sum of the element-wise product of the two overlapping matrices. In other words:

$\sum_{i=1}^{f} \sum_{j=1}^{f} K_{ij} P_{ij}$

This is the $(1, 1)$ element of the convolved feature. Next, take a single stride to the right and repeat this process. Striding right increments the column of the convolved feature. When your filter hits the right edge of your input image, reset the filter to the beginning of the row and take a stride down, incrementing the row of the convolved feature. Continue until the bottom of the image is reached. If you have a three-dimensional feature set (a colored image), you should repeat this process for each slice of the third dimension. This process results in a convolved feature of size:

$\left(\left\lfloor\frac{n-f}{s}\right\rfloor + 1\right) \times \left(\left\lfloor\frac{n-f}{s}\right\rfloor + 1\right)$

After a convolutional layer, it is convention to use the Rectified Linear Unit (or ReLU) as our activation function. ReLU is defined as $R(x) = \max(0, x)$. Previously, we had been using the Sigmoid function as our activation function. The reason ReLU is far more commonly used is that Sigmoid suffers from something called the vanishing gradient. Because $\sigma(x) = \frac{1}{1+e^{-x}}$, it can be shown that $\sigma'(x) = \sigma(x)(1-\sigma(x)) \leq \frac{1}{4}$. Repeatedly multiplying small numbers over multiple layers will result in a smaller and smaller gradient that eventually yields no change to your model. On the other hand, using ReLU, the gradient is zero if $x < 0$ and one if $x > 0$. (Because ReLU is not differentiable at $x = 0$, the derivative there is taken to be zero in practice.) Because of this, products of ReLU gradients neither explode nor vanish, so it makes a good activation function for a model with many layers like a CNN.
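To make the mechanics concrete, here is a minimal NumPy sketch of a single-filter, no-padding convolution followed by ReLU. The function name and the square image and filter are our own assumptions for illustration, not part of any library.

    import numpy as np

    def convolve2d(image, kernel, stride=1):
        """Valid (no-padding) convolution of an n x n image with an f x f filter."""
        n, f = image.shape[0], kernel.shape[0]
        out = (n - f) // stride + 1          # floor((n - f) / s) + 1, as above
        feature = np.zeros((out, out))
        for r in range(out):                 # striding down increments the row
            for c in range(out):             # striding right increments the column
                patch = image[r*stride:r*stride + f, c*stride:c*stride + f]
                feature[r, c] = np.sum(kernel * patch)  # element-wise product, summed
        return feature

    image = np.random.rand(64, 64)           # one grayscale Olivetti-sized input
    kernel = np.random.randn(3, 3)           # randomly initialized 3x3 filter
    conv = np.maximum(convolve2d(image, kernel), 0)   # ReLU: R(x) = max(0, x)
    print(conv.shape)                        # (62, 62) = (64 - 3)//1 + 1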
The Pooling Layer acts similarly to the Convolutional Layer in that it reduces the dimensionality of the input while trying to preserve the important information in the feature map. By reducing the dimensionality of the input, the network is more easily computable, better protected against overfitting, and less likely to be affected by tiny transformations and distortions of the input. Like the convolutional filter, pooling has a filter as well, with a size $f \times f$ and a stride $s$, but unlike a convolutional filter it has no parameters to be changed by backpropagation. Instead, it uses a fixed function to get its output number. The most popular and effective option is max-pooling, where each element of the output matrix is the largest number inside the pooling filter. Sum and average functions can also be used.

While pooling can help fight overfitting, it is not always successful. That is where we can implement dropout. Dropout is a "regularization method that approximates training a large number of neural networks with different architectures in parallel" (Brownlee, 2019). To do this, while training the network a random selection of neurons is ignored (dropped out), meaning all of their incoming and outgoing connections are ignored. This introduces noise and error into the training process, which helps keep any one neuron from acquiring weights that are too large; very large weights are a sign that your data has been overfit. By thinning out the network, dropout may also require more neurons to maintain the same number of activated neurons in the network. This process keeps our model from overfitting.

The last step of a CNN is the flattening. Flattening simply takes the output of our convolution and pooling layers and transforms it into a vector to be taken in by our fully connected layer. This can be done by reading the matrix left-to-right, top-to-bottom and filling in a one-dimensional vector element by element. The result is a vector whose length is the total number of entries in the final feature maps, which can be used as the input of our Fully Connected Layer. Once we have this input vector we can operate our NN as a typical multi-layer perceptron. We are assuming you have a background in these kinds of NNs, so we will not detail their structure here.
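Continuing the sketch from the previous section, here is a minimal max-pooling pass followed by flattening; the names are ours, and the other pooling variants are one-line swaps.

    import numpy as np

    def max_pool(feature, f=2, stride=2):
        """Each output element is the largest value under the pooling filter."""
        n = feature.shape[0]
        out = (n - f) // stride + 1
        pooled = np.zeros((out, out))
        for r in range(out):
            for c in range(out):
                window = feature[r*stride:r*stride + f, c*stride:c*stride + f]
                pooled[r, c] = window.max()   # use window.mean() or window.sum()
        return pooled                         # for average or sum pooling

    feature = np.random.rand(62, 62)          # e.g. the convolved feature from before
    flat = max_pool(feature).reshape(-1)      # flatten left-to-right, top-to-bottom
    print(flat.shape)                         # (961,) = 31 * 31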
Description of Stochastic Gradient Descent

In the CNN training process, the gradient descent optimization algorithm aims to minimize some cost/loss function based on that function's gradient. Successive iterations are employed to progressively approach either a local or global minimum of the cost function. Recall that the gradient of a function $f(x_1, \dots, x_n)$ is given by:

$\nabla f = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right)$

The gradient can be intuitively thought of as the path of descent a ball would take while rolling down a hill. However, using neural networks we do not have access to this function; otherwise we would be done! There would be no reason to train a model. Instead we have access to a loss function which can help us approximate it. Our loss function is described as the sum of squared differences between what your model gave you and what your model should have given you:

$L(w) = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$

(This is the Mean Squared Error, or MSE.) Using our loss function we can approximate GD using a subset of our training data. This is the difference between Stochastic Gradient Descent and Gradient Descent. The iterative process for SGD is as follows:

● Start at some randomly selected parameter vector $w_0$.
● For each step $t$, have some process to generate $S_t$, a subset of our training set.
● Then compute $\nabla L_{S_t}(w_t)$ so that we can use the formula to update our vector: $w_{t+1} = w_t - \eta \nabla L_{S_t}(w_t)$
● Here $\eta$ is our predetermined learning rate.

Repeating this process, we approach a local minimum of our loss function. Our hope is that this local minimum is actually the global minimum, so that our model is fully optimized, but this is not guaranteed. Also, because SGD applies only a subset of the training data at each step, each step is cheaper than in GD, so it gets near the minimum much faster but probably will not converge exactly to the minimum. Instead, it will oscillate around the minimum, and the approximation is good enough (Ng).
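Here is a minimal sketch of this loop on a toy one-parameter problem; the function names and the synthetic data are ours and purely illustrative.

    import numpy as np

    def sgd(w0, grad_loss, data, eta=0.1, batch_size=10, steps=1000):
        """w_{t+1} = w_t - eta * gradient of the loss on a random mini-batch S_t."""
        w = w0
        for t in range(steps):
            idx = np.random.choice(len(data), batch_size, replace=False)
            w = w - eta * grad_loss(w, data[idx])   # the SGD update rule
        return w

    # Toy problem: fit y = 2x. For MSE, the gradient w.r.t. w is 2*X^T(Xw - y)/N.
    X = np.random.rand(100, 1)
    y = 2.0 * X[:, 0]
    data = np.hstack([X, y[:, None]])
    grad = lambda w, S: 2.0 * S[:, :1].T @ (S[:, :1] @ w - S[:, 1]) / len(S)
    print(sgd(np.zeros(1), grad, data))   # oscillates near [2.]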
Description of Backpropagation

Backpropagation is how a neural network learns: it looks at how much error or cost was calculated and tries to minimize that number. For CNNs we look at the error between each of our layers and decide whether something matches in that layer and, if so, how well it matches. From there the neural network decides what or who it is looking at. For example, the cost of a single neuron in a network can be given as:

$C(W) = C(R(Z(W)))$

where $C$ is the cost function (for instance the squared error $C = (\hat{y} - y)^2$), $R$ is the ReLU, $Z$ is our weighted input ($Z = XW$), and $W$ is our weight. Taking the partial derivative of our error function with respect to the weight we are looking at will show us how much that weight contributes to the error, and as such we need to expand our formula using the chain rule:

$\frac{\partial C}{\partial W} = \frac{\partial C}{\partial R} \cdot \frac{\partial R}{\partial Z} \cdot \frac{\partial Z}{\partial W}$

These partial derivatives are used to check each parameter and how that parameter contributes to the total change in error. Furthermore, as we go through multiple layers our cost function gains more and more inputs, so the expansion can get rather lengthy and cumbersome. However, as we go through more weights we reuse the previously calculated terms when applying the chain rule, so we have the program remember those values and avoid recalculating them. With the information gathered we can now find the actual error of each layer and see how it impacts our final layer, which is our result. That is, taking the derivative of our cost function with respect to the weighted input of the output layer ($Z_o$) and hidden layers ($Z_h$):

$E_o = (\hat{y} - y) \cdot R'(Z_o)$
$E_h = E_o \cdot W_o \cdot R'(Z_h)$

We can see that the hidden layer error is the output-layer error times our output weights times the derivative of the ReLU at our hidden layer. This is the general process of backpropagation: we keep shuffling our weighted error back to the previous layer and continue filtering our output until we arrive at an error small enough for the machine to make an informed decision on what the image is and to put a name to it. How the machine decides to adjust the weights is derived from the SGD described above; depending on the sign and magnitude of each partial derivative, the weight is adjusted accordingly to give us the lowest error possible.

How Pooling Affects Complexity

Pooling is an important layer in Convolutional Neural Networks because it decreases the complexity of your model. By downsampling our input layer by layer, we can protect against overfitting and against minor variations in our data that would otherwise look like a completely new image to a naive network. We also decrease the dimensionality of the network, which improves manageability and runtime.

There are several different kinds of functions that can be used in your Pooling Layer, and they all perform differently: Sum, Average, and Max pooling. In a network it is not uncommon to use a combination of these. Sum and Average have similar results, just scaled differently, so the comparison to be had is between Average and Max. Max pooling takes the largest value over the filter, while Average takes the mean. The intuition here is that Max takes the "most important" feature, while Average takes into consideration all of the values within the filter. According to Karpathy in Convolutional Neural Networks for Visual Recognition, the most effective pooling function is Max pooling, due to its ability to extract the valuable information from the image while being less affected by small variations.
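A tiny numerical illustration of that intuition, with values chosen by us:

    import numpy as np

    window = np.array([[0.1, 0.2],
                       [0.1, 9.0]])   # a 2x2 pooling window with one strong feature

    print(window.max())    # 9.0  -> max pooling keeps the strongest activation
    print(window.mean())   # 2.35 -> average pooling dilutes it across the window
    print(window.sum())    # 9.4  -> sum pooling: the average scaled by window size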
However, if a Pooling Layer has the same structure as a Convolutional Layer, minus the parameters and plus a predetermined function, would it be feasible to replace it with another Convolutional Layer in an attempt to get the best of both worlds? The Convolutional Layer can still downsample the input features, but it can use SGD to learn filters that would hopefully extract more information from the image. In Springenberg et al.'s Striving for Simplicity: The All Convolutional Net, they explored this possibility in hopes of finding a simpler CNN model. They used the CIFAR-10 dataset (Krizhevsky & Hinton, 2009) to study different models. The first model was the control: a typical CNN with Pooling Layers. The second model was an All-CNN with an incremented stride on the Convolutional Layers that precede the removed Pooling Layers. It is important to increment the stride because the next layer should accept an input covering the same spatial region as it did in the control model. The final model replaced each Pooling Layer with a Convolutional Layer. After comparing the models, the results showed an increase in accuracy with the Pooling Layers replaced, and a lower loss as well. Our expectation was that a network with no Pooling Layers would perform poorly due to overfitting of the data, which is the main advantage of using Pooling Layers in the first place. In order to test Springenberg's conclusions, we ran tests of our own on a relatively simply structured network, with anti-overfitting countermeasures that were not overly aggressive, to see whether these All Convolutional Networks are prone to overfitting.

Demonstration

To demonstrate the effect of removing pooling from a Convolutional Neural Network, we started by building a typical CNN. We built the CNN using Keras, a high-level Python library built on Google's TensorFlow for machine learning. Our CNN started with ten pictures each of forty different people from AT&T's Olivetti dataset. It is important to note that the images we used had been preprocessed to contain only the face of the person being photographed. We had tried using the Labeled Faces in the Wild dataset and found that the images were not cropped closely enough to the subject's face and included too much of the background to create a successful model. There are ways to process those faces into usable inputs, but we opted to change datasets for simplicity. The structure of our CNN was as follows (a minimal Keras sketch of this structure appears after the list):

● Convolutional Layer with 64 filters of size 3 x 3, stride of 1, with the ReLU activation function
● Max Pooling Layer with filter size 2 x 2, stride of 2
● Another Convolutional Layer
● Another Max Pooling Layer
● A 10% Dropout Layer
● The Flattening Function
● A length-64 vector for the first layer of our Fully Connected Network
● A 20% Dropout Layer
● Our output layer with 40 nodes using the softmax activation function
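Here is a minimal sketch of that structure in Keras, assuming the 2019-era API; the learning rate and the cross-entropy loss are our placeholder assumptions (only the momentum and decay settings, described in the next paragraph, are recorded in this report).

    from tensorflow import keras
    from tensorflow.keras import layers

    # Baseline CNN mirroring the list above (input: 64x64 grayscale, 40 classes).
    model = keras.Sequential([
        layers.Conv2D(64, (3, 3), strides=1, activation="relu",
                      input_shape=(64, 64, 1)),
        layers.MaxPooling2D(pool_size=(2, 2), strides=2),
        layers.Conv2D(64, (3, 3), strides=1, activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=2),
        layers.Dropout(0.10),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.20),
        layers.Dense(40, activation="softmax"),
    ])

    # lr=0.01 and the loss are placeholders; the report records only
    # momentum=1 and decay=0.05 for its SGD optimizer.
    model.compile(optimizer=keras.optimizers.SGD(lr=0.01, momentum=1.0, decay=0.05),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # model.fit(x_train, y_train, batch_size=10, epochs=100,
    #           validation_data=(x_test, y_test))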
After we make the structure of the network, the next step is to define the optimizer and its parameters. As mentioned before, our optimizer was Stochastic Gradient Descent with our chosen learning rate, along with a few other tweaked parameters (momentum=1, decay=0.05). After defining the input, structure, and optimizer, the last thing to do is run the network. We used a batch size of 10 and ran through around 100 epochs each time.

[Graphs of accuracy and loss for our baseline CNN using SGD and Max Pooling Layers.]

After 100 epochs we earned a validation accuracy of almost 80% and a validation loss of near 1. By looking at the graphs we can see that the scores on training and testing data are fairly close. This is a good indication that we have not overfit our data. The next step was to remove our pooling layers and see how the model fared. In order to remove the pooling layers, there are a few steps you have to take to maintain the model's integrity. According to Springenberg, the way to convert your traditional CNN to an all-convolutional one is either to change the Pooling Layer into a Convolutional Layer using the same size and stride the Pooling Layer used, or to remove the Pooling Layer and increase the stride of the preceding Convolutional Layer by one so the network can maintain the same feature size (both conversions are sketched below). Here were the results after 200 epochs.

[Graphs of accuracy and loss for the all-convolutional model after 200 epochs.]
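For reference, the two conversions just described, sketched as Keras layer blocks with the sizes from our baseline (names are ours):

    from tensorflow.keras import layers

    # Baseline downsampling block: stride-1 convolution followed by max pooling.
    baseline = [layers.Conv2D(64, (3, 3), strides=1, activation="relu"),
                layers.MaxPooling2D(pool_size=(2, 2), strides=2)]

    # Conversion 1: the pooling layer becomes a convolution with the same
    # size and stride (the "All-CNN" variant).
    replaced = [layers.Conv2D(64, (3, 3), strides=1, activation="relu"),
                layers.Conv2D(64, (2, 2), strides=2, activation="relu")]

    # Conversion 2: drop the pooling layer and increment the stride of the
    # preceding convolution instead (the "Strided-CNN" variant).
    strided = [layers.Conv2D(64, (3, 3), strides=2, activation="relu")]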
Our training data without pooling did perform better than with pooling, with an accuracy of almost 90% and a loss below 0.5. However, our testing data did not do well at all, with testing loss rising to over 2.0 at some points. This is a clear case of overfitting our data, which makes sense because part of the reason pooling is used is to prevent overfitting. While our loss did fare worse, there were some advantages to using a CNN with no pooling. While the model did learn more slowly, it ran drastically faster: the CNN with pooling took around 4 seconds per epoch, while the CNN without pooling took under half that time. For this reason, we believe that if you used more aggressive anti-overfitting techniques, such as a higher dropout percentage, then you could see some real benefit to using an All Convolutional Network.

References

Karpathy, A. "Convolutional Neural Networks for Visual Recognition." CS231n Convolutional Neural Networks for Visual Recognition, cs231n.github.io/convolutional-networks/.

"Backpropagation." Backpropagation - ML Documentation, ml-cheatsheet.readthedocs.io/en/latest/backpropagation.html.

Brownlee, Jason. A Gentle Introduction to Dropout for Regularizing Deep Neural Networks. 21 Apr. 2019, machinelearningmastery.com/dropout-for-regularizing-deep-neural-networks/. Accessed 13 May 2019.

CBS/AP. "San Francisco Bans Facial Recognition Technology." CBS News, CBS Interactive, 15 May 2019, www.cbsnews.com/news/san-francisco-becomes-first-us-city-to-ban-facial-recognition-technology-today-2019-05-14/.
Hirasawa, Toshiaki, et al. "Application of Artificial Intelligence Using a Convolutional Neural Network for Detecting Gastric Cancer in Endoscopic Images." SpringerLink, Springer Japan, 15 Jan. 2018, link.springer.com/article/10.1007/s10120-018-0793-2.

Krizhevsky, A., and G. Hinton. Learning Multiple Layers of Features from Tiny Images. 2009.

Ng, Andrew. Supervised Learning. Stanford University, cs229.stanford.edu/notes/cs229-notes1.pdf.

Shetty, Badreesh. Curse of Dimensionality. 15 Jan. 2019, towardsdatascience.com/curse-of-dimensionality-2092410f3d27. Accessed 13 May 2019.

Springenberg, Jost Tobias, et al. Striving for Simplicity: The All Convolutional Net. Department of Computer Science, University of Freiburg, 2015, arxiv.org/pdf/1412.6806.pdf.

Suárez-Paniagua, Víctor. "Evaluation of Pooling Operations in Convolutional Architectures for Drug-Drug Interaction Extraction." BMC Bioinformatics, BioMed Central, 13 June 2018, www.ncbi.nlm.nih.gov/pmc/articles/PMC5998766/.

Ujjwalkarn. "A Quick Introduction to Neural Networks." The Data Science Blog, 10 Aug. 2016, ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/.