In this session, we’ll discuss approaches for applying convolutional neural networks to novel computer vision problems, even without having millions of images of your own. Pretrained models and generic image datasets from Google, Kaggle, universities, and other places can be leveraged and adapted to solve industry- and business-specific problems. We’ll discuss transfer learning and fine-tuning to help anyone get started on using deep learning to get cutting-edge results on their computer vision problems.
3. What is transfer learning?
● Transfer learning is taking a neural network that has already been
trained and adapting it to a new dataset.
● Allows us to use deep learning on problems that we have very little
training data for.
● Improves the generalization of our network on a novel task.
● Let big AI labs do all the hard work for us.
4. Pretrained Neural Networks
Big research labs typically design and test their new deep learning models
against large public benchmark datasets.
Most commonly on ImageNet – 1.2 million images across 1000 categories
Or on MS COCO – 330k images for things like object detection, image
segmentation, and keypoint detection
7. Transfer learning - Classification
Ok, so we have a state of the art neural network that someone else
trained to recognize images and classify them in one of 1000 categories.
But our problem has maybe 5 categories. What do we do?
9. Transfer learning - Classification
We replace the final 1000-way fully connected layer with a new 5-way fully
connected layer (fc 5) and keep the rest of the pretrained network.
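Swapping the 1000-way output layer for a 5-way one can be sketched without any framework. This is a minimal illustration, not a real pretrained model: the tiny `backbone` (12 inputs to 8 features), the dictionary layout, and the helper names `make_fc`, `apply_fc`, `forward`, and `replace_head` are all assumptions made up for this example.

```python
import random

def make_fc(in_features, out_features, rng):
    """Build a fully connected layer with small random weights and zero bias."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(in_features)]
               for _ in range(out_features)]
    return {"weights": weights, "bias": [0.0] * out_features}

def apply_fc(layer, inputs):
    """Compute weights @ inputs + bias for a single input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(layer["weights"], layer["bias"])]

def relu(values):
    return [max(0.0, v) for v in values]

def forward(network, image):
    """Run the backbone, then the classification head."""
    features = relu(apply_fc(network["backbone"], image))
    return apply_fc(network["head"], features)

def replace_head(network, num_classes, rng):
    """Keep the backbone untouched and attach a freshly initialised head."""
    feature_size = len(network["backbone"]["weights"])  # backbone output size
    return {"backbone": network["backbone"],
            "head": make_fc(feature_size, num_classes, rng)}

rng = random.Random(0)
# Toy stand-in for a pretrained network: 12 "pixels" -> 8 features -> 1000 classes.
pretrained = {"backbone": make_fc(12, 8, rng), "head": make_fc(8, 1000, rng)}
our_model = replace_head(pretrained, num_classes=5, rng=rng)

image = [rng.random() for _ in range(12)]
print(len(forward(pretrained, image)))  # one score per original category
print(len(forward(our_model, image)))   # one score per category in our problem
```

The new head starts from random weights, so it has to be trained on our data before its 5 scores mean anything; the backbone is shared unchanged with the original network.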
10. Transfer learning - Detection
Detection works much the same way: the only portions of the pretrained
detection network that are not flexible are the last ones.
We can call this the ‘detector’ portion of the network. We swap it out
for our own ‘detector’ portion, which will learn the specific objects
that we want to detect.
11. What is transfer learning?
Once you have tacked on your own classification, regression, or other
output layer, you can start training.
If your dataset is small, you may want to train only your new layer.
If your dataset is large, you may want to retrain the whole network.
If your dataset is very different from the one that the network was
originally trained on, you may want to retrain the whole network.
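These rules of thumb can be written down as a small helper that marks each layer as trainable or frozen. This is only a sketch: the function name `choose_trainable` and the layer names are hypothetical, and the slide does not say what to do when a dataset is both small and very different, so the tie-break below (retrain everything) is an assumption.

```python
def choose_trainable(layer_names, dataset_is_small, dataset_is_very_different):
    """Map each layer name to True (train it) or False (keep it frozen).

    The last entry of `layer_names` is assumed to be the newly added layer.
    """
    # Assumption: a very different dataset takes priority over a small one.
    retrain_everything = dataset_is_very_different or not dataset_is_small
    new_layer = layer_names[-1]
    return {name: retrain_everything or name == new_layer
            for name in layer_names}

# Hypothetical layer names for illustration only.
layer_names = ["conv_block_1", "conv_block_2", "conv_block_3", "new_fc"]

small_similar = choose_trainable(layer_names, True, False)
large_similar = choose_trainable(layer_names, False, False)
small_different = choose_trainable(layer_names, True, True)

print(small_similar)    # only the new layer is trainable
print(large_similar)    # every layer is trainable
print(small_different)  # every layer is trainable (assumed tie-break)
```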
12. What if I don’t want to train?
You can even make use of a pretrained neural network without retraining it.
Once trained, the deep learning model becomes an effective way to
represent any image as a feature-rich vector.
Now we can pass any image through our pretrained neural network and
get a dense vector representation of that image.
This allows us to use these features as the input for any traditional machine
learning algorithm (logistic regression, random forest, etc.).
Or we can set up a reverse image search type of system by computing
similarity scores between the vectors.
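A reverse image search of this kind reduces to ranking stored vectors by their similarity to a query vector. The sketch below uses cosine similarity, which is one common choice and not necessarily the one used in the talk. The file names and the 4-dimensional feature vectors are made-up stand-ins for what a pretrained network would produce.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(query, index, top_k=3):
    """Return the top_k (name, score) pairs, best match first."""
    scored = [(name, cosine_similarity(query, vector))
              for name, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical feature vectors, as if extracted by a pretrained network.
index = {
    "cat_1.jpg": [0.9, 0.1, 0.0, 0.2],
    "cat_2.jpg": [0.8, 0.2, 0.1, 0.1],
    "car_1.jpg": [0.0, 0.9, 0.8, 0.1],
    "tree_1.jpg": [0.1, 0.0, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05, 0.15]  # features of the image we are searching with

results = most_similar(query, index, top_k=2)
for name, score in results:
    print(name, round(score, 3))
```

With these toy vectors the two cat images rank highest, because their vectors point in nearly the same direction as the query.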
13. Transfer learning – How Well Does it Work?
Remi Cadene’s master’s thesis: Deep Learning for Visual Recognition
20. Representation learning
Deep learning presents us with a unique philosophy for machine learning. Instead
of learning the mapping from our features directly to an output we are learning the
best representation of our features.
Each layer of a neural network introduces feature representations that are
expressed in terms of other, simpler representations.
The process has multiple steps, with each layer building on the representations
created by the previous layer.
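A classic toy case shows why an intermediate representation helps. XOR cannot be computed by a single linear function of the raw inputs, but after one hidden layer the same problem becomes linear. The weights below are hand-picked for illustration (they follow the well-known textbook XOR construction) rather than learned, and the function names are made up for this sketch.

```python
def relu(values):
    return [max(0, v) for v in values]

def hidden_representation(x):
    """One hidden layer: h = relu(W x + c), W = [[1, 1], [1, 1]], c = [0, -1]."""
    total = x[0] + x[1]
    return relu([total, total - 1])

def linear_output(h):
    """Linear readout on the new representation: y = h[0] - 2 * h[1]."""
    return h[0] - 2 * h[1]

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = hidden_representation(x)
    # Raw input -> hidden representation -> XOR of the two inputs.
    print(x, "->", h, "->", linear_output(h))
```

In the hidden space the inputs (0, 1) and (1, 0) land on the same point, so a plain linear readout is enough to separate the two classes.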
22. Representation Learning
Here is a test with natural images: unsupervised clustering is run on the raw
images, and then again on the features extracted from the final layer of a deep
neural network.
33. Representation learning
This is an interesting observation!
The ‘deep’ part of this neural network is essentially working to make the
problem easier.
We will leverage this property to very quickly and effectively make use of the
power of deep learning without the headache of training the networks from
scratch.
35. Applied Deep Learning is Mostly Finetuning
Most applied deep learning work revolves around two strategies, both of which
use neural networks that have already been trained:
1. Take a pretrained network and finetune it on our specific problem.
a. Ex. A computer vision model already knows how to ‘see’; we just need to finetune it to
see whatever our specific problem requires.
2. Use a pretrained network to extract its internal representations of a
dataset, then use them as features for a model.
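The second strategy can be sketched end to end in plain Python: a frozen feature extractor feeds a small logistic regression that is the only part being trained. Everything here is a toy assumption: `extract_features` is a hypothetical stand-in for a pretrained network (it just averages the two halves of an 8-value "image"), and the data, learning rate, and epoch count are made up.

```python
import math
import random

def extract_features(image):
    """Hypothetical stand-in for a frozen pretrained network.

    Returns the mean brightness of the left and right halves of the image.
    """
    half = len(image) // 2
    return [sum(image[:half]) / half, sum(image[half:]) / (len(image) - half)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(features, labels, lr=0.5, epochs=200):
    """Stochastic gradient descent on the logistic loss."""
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias >= 0.0 else 0

rng = random.Random(0)

def make_image(bright_left):
    """Toy 8-pixel image: one half bright, the other half dark."""
    bright = [rng.uniform(0.7, 1.0) for _ in range(4)]
    dark = [rng.uniform(0.0, 0.3) for _ in range(4)]
    return bright + dark if bright_left else dark + bright

train_labels = [i % 2 for i in range(20)]
train_images = [make_image(label == 1) for label in train_labels]
test_labels = [i % 2 for i in range(10)]
test_images = [make_image(label == 1) for label in test_labels]

# The extractor is never updated; only the logistic regression is trained.
train_features = [extract_features(img) for img in train_images]
weights, bias = train_logistic_regression(train_features, train_labels)

correct = sum(predict(weights, bias, extract_features(img)) == label
              for img, label in zip(test_images, test_labels))
accuracy = correct / len(test_labels)
print("test accuracy:", accuracy)
```

Because the extractor already separates the two kinds of image, a very simple model on top of its features is enough.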
36. Fine Tuning a Network
To finetune a network we typically choose a network that was already trained on a
similar problem to our own.
We then train the network in much the same way that it was originally trained, with
a few differences:
1. Use a much smaller learning rate
2. Only train for a handful of iterations
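The effect of these two differences can be seen on a toy model. In the sketch below a one-variable linear model stands in for a network: it is first trained on a "source" task, then finetuned on a small, similar "target" task with a 10x smaller learning rate and only 20 iterations. The tasks, learning rates, and iteration counts are illustrative assumptions, not recommended values.

```python
def mse(weights, data):
    """Mean squared error of the model y = w * x + b on (x, y) pairs."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(weights, data, learning_rate, iterations):
    """Plain full-batch gradient descent on mean squared error."""
    w, b = weights
    n = len(data)
    for _ in range(iterations):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b)

xs = [i / 10 for i in range(-10, 11)]
source_task = [(x, 2.0 * x + 1.0) for x in xs]       # larger original problem
target_task = [(x, 2.2 * x + 0.8) for x in xs[::5]]  # small, similar problem

# "Pretraining": train from scratch with a normal learning rate.
pretrained = train((0.0, 0.0), source_task, learning_rate=0.1, iterations=500)

# Finetuning: start from the pretrained weights, use a much smaller
# learning rate, and run only a handful of iterations.
finetuned = train(pretrained, target_task, learning_rate=0.01, iterations=20)

print("target loss before finetuning:", mse(pretrained, target_task))
print("target loss after finetuning: ", mse(finetuned, target_task))
print("pretrained weights:", pretrained)
print("finetuned weights: ", finetuned)
```

The finetuned weights move only slightly away from the pretrained ones, yet the loss on the target task drops; a small learning rate and short schedule keep most of what was already learned.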
37. Feature Extraction
We can use these networks as feature extractors to represent our data as dense,
information-rich vectors.
We can vectorize the following quite easily:
1. Images
a. Using Convolutional Neural Networks trained to classify images
2. Words
a. Using neural networks trained to predict words given their context
3. Sentences
a. Using neural networks that are trained to reconstruct sentences given their context
39. Further Resources – Theory
How transferable are features in deep neural networks? Jason Yosinski, Jeff
Clune, Yoshua Bengio, Hod Lipson - https://arxiv.org/abs/1411.1792
Deep Learning Book, Ian Goodfellow, Yoshua Bengio, Aaron Courville -
http://www.deeplearningbook.org/
Overfeat: Integrated Recognition, Localization and Detection using Convolutional
Networks. Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob
Fergus, Yann LeCun - https://arxiv.org/pdf/1312.6229.pdf