Convolutional Neural Networks
Computer Vision
CS 543 / ECE 549
University of Illinois
Jia-Bin Huang
05/05/15
Reminder: final project
• Posters on Friday, May 8 at 7pm in SC 2405
– two rounds, 1.25 hr each
• Papers due May 11 by email
• Cannot accept late papers/posters due to
grading deadlines
• Send Derek an email if you can’t present your
poster
Today’s class
• Overview
• Convolutional Neural Network (CNN)
• Understanding and Visualizing CNN
• Training CNN
• Probabilistic Interpretation
Image Categorization: Training phase
[Diagram: Training Images → Image Features; Image Features + Training Labels → Classifier Training → Trained Classifier]
Image Categorization: Testing phase
[Diagram, training: Training Images → Image Features; Image Features + Training Labels → Classifier Training → Trained Classifier]
[Diagram, testing: Test Image → Image Features → Trained Classifier → Prediction: Outdoor]
Features are the Keys
SIFT [Lowe IJCV 04] HOG [Dalal and Triggs CVPR 05]
SPM [Lazebnik et al. CVPR 06] DPM [Felzenszwalb et al. PAMI 10]
Color Descriptor [Van De Sande et al. PAMI 10]
Learning a Hierarchy of Feature Extractors
• Each layer of hierarchy extracts features from output
of previous layer
• All the way from pixels → classifier
• Layers have the (nearly) same structure
[Diagram: Image/Video Pixels → Layer 1 → Layer 2 → Layer 3 → Simple Classifier → Image/Video Labels]
Engineered vs. learned features
Engineered: Image → Feature extraction → Pooling → Classifier → Label
Learned: Image → Convolution/pool (×5) → Dense (×3) → Label
Biological neuron and Perceptrons
A biological neuron
An artificial neuron (Perceptron): a linear classifier
Simple, Complex and Hypercomplex cells
David H. Hubel and Torsten Wiesel
David Hubel's Eye, Brain, and Vision
Suggested a hierarchy of feature detectors
in the visual cortex, with higher level features
responding to patterns of activation in lower
level cells, and propagating activation
upwards to still higher level cells.
Hubel/Wiesel Architecture and Multi-layer Neural Network
Hubel and Wiesel’s architecture
Multi-layer Neural Network: a non-linear classifier
Multi-layer Neural Network
• A non-linear classifier
• Training: find network weights $w$ to minimize the
error between true training labels $y_i$ and
estimated labels $f_w(x_i)$
• Minimization can be done by gradient descent
provided 𝑓 is differentiable
• This training method is called
back-propagation
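The training recipe above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's code: the network size, the tanh non-linearity, the squared-error loss, the learning rate, and the XOR toy data are all assumptions chosen to keep the example small.

```python
import numpy as np

def train_two_layer_net(X, y, hidden=8, lr=0.1, steps=2000, seed=0):
    """Full-batch gradient descent on a tiny tanh network with squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(steps):
        # Forward pass: f_w(x) = W2^T tanh(W1^T x + b1) + b2
        h = np.tanh(X @ W1 + b1)
        pred = (h @ W2 + b2).ravel()
        err = pred - y
        losses.append(0.5 * np.mean(err ** 2))
        # Backward pass: chain rule applied from the output back to the input
        d_pred = err[:, None] / len(y)
        dW2 = h.T @ d_pred
        db2 = d_pred.sum(axis=0)
        d_z = (d_pred @ W2.T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
        dW1 = X.T @ d_z
        db1 = d_z.sum(axis=0)
        # Gradient descent step on every weight
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return losses

# XOR is not linearly separable, so a single perceptron cannot fit it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
losses = train_two_layer_net(X, y)
print(losses[0], losses[-1])
```

The only requirement on $f$ is differentiability: each layer contributes one factor to the chain rule in the backward pass.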
Convolutional Neural Networks
(CNN, ConvNet, DCN)
• CNN = a multi-layer neural network with
– Local connectivity
– Share weight parameters across spatial positions
• One activation map (a depth slice), computed
with one set of weights
Image credit: A. Karpathy
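To see what local connectivity and weight sharing buy, it helps to count weights. The layer sizes below (a 32x32x3 input, 5x5 filters, 16 output maps) are illustrative assumptions, not figures from the slides.

```python
def dense_weights(n_in, n_out):
    """Fully connected: every output sees every input."""
    return n_in * n_out

def local_weights(out_h, out_w, out_c, k, in_c):
    """Locally connected: each output sees a k x k x in_c window, no sharing."""
    return out_h * out_w * out_c * k * k * in_c

def conv_weights(out_c, k, in_c):
    """Convolutional: one k x k x in_c filter per output map, shared over positions."""
    return out_c * k * k * in_c

in_h, in_w, in_c = 32, 32, 3
k, out_c = 5, 16
out_h, out_w = in_h - k + 1, in_w - k + 1  # 28 x 28 with no padding

n_dense = dense_weights(in_h * in_w * in_c, out_h * out_w * out_c)
n_local = local_weights(out_h, out_w, out_c, k, in_c)
n_conv = conv_weights(out_c, k, in_c)
print(n_dense, n_local, n_conv)  # 38535168 940800 1200
```

Biases are left out of the count; they add one value per output unit or per output map and do not change the comparison.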
Neocognitron [Fukushima, Biological Cybernetics 1980]
LeNet [LeCun et al. 1998]
Gradient-based learning applied to document
recognition [LeCun, Bottou, Bengio, Haffner 1998]
What is a Convolution?
• Weighted moving sum
[Figure: Input → Feature Activation Map]
slide credit: S. Lazebnik
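A "weighted moving sum" can be written directly as a pair of loops. The sketch below assumes a single-channel image, no padding, and stride 1; like most CNN libraries it does not flip the kernel, so strictly speaking it computes cross-correlation.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image and take a weighted sum at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))  # all weights 1: a plain moving sum
activation_map = correlate2d_valid(image, kernel)
print(activation_map)
```

The same kernel weights are reused at every position, which is the weight sharing described for CNNs.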
Convolutional Neural Networks
Input Image → Convolution (Learned) → Non-linearity → Spatial pooling → Normalization → Feature maps
slide credit: S. Lazebnik
Convolutional Neural Networks: Convolution (Learned)
[Figure: Input → Feature Map]
slide credit: S. Lazebnik
Convolutional Neural Networks: Non-linearity
Rectified Linear Unit (ReLU)
slide credit: S. Lazebnik
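The rectified linear unit is simply max(0, x) applied element-wise; a one-line NumPy sketch:

```python
import numpy as np

def relu(x):
    """Keep positive activations, set negative ones to zero."""
    return np.maximum(0.0, x)

feature_map = np.array([[-1.5, 0.0], [2.0, -0.3]])
print(relu(feature_map))
```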
Convolutional Neural Networks: Spatial pooling
Max pooling
slide credit: S. Lazebnik
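Max pooling keeps the largest activation in each spatial block. The sketch below assumes non-overlapping square blocks on a single feature map; overlapping windows and other strides are also used in practice.

```python
import numpy as np

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling over size x size blocks."""
    out_h, out_w = fmap.shape[0] // size, fmap.shape[1] // size
    # Drop any rows/columns that do not fill a complete block
    trimmed = fmap[:out_h * size, :out_w * size]
    blocks = trimmed.reshape(out_h, size, out_w, size)
    return blocks.max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(fmap)
print(pooled)
```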
Convolutional Neural Networks: Normalization
[Figure: Feature Maps → Feature Maps After Contrast Normalization]
slide credit: S. Lazebnik
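The slides do not say which normalization scheme is meant, so the following is only one plausible reading: within a single feature map, subtract the mean of a local window and divide by its standard deviation. The window size and the `eps` term are assumptions.

```python
import numpy as np

def local_contrast_normalize(fmap, size=3, eps=1e-5):
    """Subtract the local mean and divide by the local standard deviation."""
    pad = size // 2
    padded = np.pad(fmap, pad, mode="reflect")
    out = np.empty_like(fmap, dtype=float)
    for i in range(fmap.shape[0]):
        for j in range(fmap.shape[1]):
            window = padded[i:i + size, j:j + size]
            out[i, j] = (fmap[i, j] - window.mean()) / (window.std() + eps)
    return out

rng = np.random.default_rng(0)
fmap = rng.normal(size=(6, 6))
normalized = local_contrast_normalize(fmap)
print(normalized.shape)
```

Because both the local mean and the local scale are removed, doubling the contrast of the input leaves the output almost unchanged.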
Convolutional Neural Networks
Convolutional filters are trained in a supervised manner by back-propagating classification error
slide credit: S. Lazebnik
SIFT Descriptor
Image Pixels → Apply gradient filters → Spatial pool (Sum) → Normalize to unit length → Feature Vector
Lowe [IJCV 2004]
SIFT Descriptor
Image Pixels → Apply oriented filters → Spatial pool (Sum) → Normalize to unit length → Feature Vector
Lowe [IJCV 2004]
slide credit: R. Fergus
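Read as a filter, pool, normalize pipeline, a SIFT-style descriptor looks much like one CNN stage with fixed filters. The sketch below is a simplified stand-in, not Lowe's full algorithm: it omits Gaussian weighting, interpolation between bins, and clipping, and the 4x4 cell grid with 8 orientation bins is an assumed default.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8):
    """Gradient filters -> orientation binning -> sum pooling -> unit length."""
    patch = patch.astype(float)
    # 1. Apply gradient filters (finite differences along rows and columns)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)
    bin_idx = np.minimum((angle / (2 * np.pi) * bins).astype(int), bins - 1)
    # 2. Spatial pooling: sum gradient magnitude per orientation in each cell
    ch, cw = patch.shape[0] // cells, patch.shape[1] // cells
    desc = np.zeros((cells, cells, bins))
    for i in range(cells):
        for j in range(cells):
            m = magnitude[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw].ravel()
            b = bin_idx[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw].ravel()
            desc[i, j] = np.bincount(b, weights=m, minlength=bins)
    # 3. Normalize to unit length
    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

ramp = np.tile(np.arange(16.0), (16, 1))  # intensity increases left to right
descriptor = sift_like_descriptor(ramp)
print(descriptor.shape)  # (128,)
```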
Spatial Pyramid Matching
SIFT Features → Filter with Visual Words → Max → Multi-scale spatial pool (Sum) → Classifier
Lazebnik, Schmid, Ponce [CVPR 2006]
slide credit: R. Fergus
Deformable Part Model
Deformable Part Models are Convolutional Neural Networks [Girshick et al. CVPR 15]
AlexNet
• Similar framework to LeCun’98 but:
• Bigger model (7 hidden layers, 650,000 units, 60,000,000 params)
• More data (10^6 vs. 10^3 images)
• GPU implementation (50x speedup over CPU)
• Trained on two GPUs for a week
A. Krizhevsky, I. Sutskever, and G. Hinton,
ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
Using CNN for Image Classification
[Diagram: Input image (fixed input size: 224x224x3) → AlexNet → fully connected layer Fc7 features (d = 4096 each) → Averaging → Softmax Layer → “Jia-Bin”]
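The last two stages of the diagram can be sketched as follows. The diagram does not say what is being averaged, so this assumes several feature vectors (for example from different crops of one image) are averaged before a linear layer and softmax. The weights `W`, bias `b`, and the tiny dimensions are hypothetical; a real Fc7 feature has d = 4096.

```python
import numpy as np

def softmax(scores):
    """Turn class scores into probabilities (shifted for numerical stability)."""
    exp_scores = np.exp(scores - np.max(scores))
    return exp_scores / exp_scores.sum()

def classify_from_features(features, W, b):
    """Average the feature vectors, then apply a linear layer and softmax."""
    pooled = features.mean(axis=0)
    return softmax(W @ pooled + b)

# Two 2-d feature vectors and two classes, purely for illustration
features = np.array([[1.0, 0.0], [3.0, 0.0]])
W = np.eye(2)       # hypothetical classifier weights
b = np.zeros(2)     # hypothetical bias
probs = classify_from_features(features, W, b)
print(probs, probs.argmax())
```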
ImageNet Challenge 2012-2014
Team                                Year  Place  Error (top-5)  External data
SuperVision – Toronto (7 layers)    2012  -      16.4%          no
SuperVision                         2012  1st    15.3%          ImageNet 22k
Clarifai – NYU (7 layers)           2013  -      11.7%          no
Clarifai                            2013  1st    11.2%          ImageNet 22k
VGG – Oxford (16 layers)            2014  2nd    7.32%          no
GoogLeNet (22 layers)               2014  1st    6.67%          no
Human expert*                                    5.1%
Team                              Method                                   Error (top-5)
DeepImage - Baidu                 Data augmentation + multi GPU            5.33%
PReLU-nets - MSRA                 Parametric ReLU + smart initialization   4.94%
BN-Inception ensemble - Google    Reducing internal covariate shift        4.82%
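The tables report top-5 error: a prediction counts as correct if the true label is among the five highest-scoring classes. A small sketch of the metric, using made-up scores for three images and six classes:

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores."""
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()

# Hypothetical scores: rows are images, columns are classes
scores = np.array([
    [6, 5, 4, 3, 2, 1],
    [6, 5, 4, 3, 2, 1],
    [1, 2, 3, 4, 5, 6],
], dtype=float)
labels = np.array([1, 5, 5])

top5 = top_k_error(scores, labels, k=5)  # only the second image misses
top1 = top_k_error(scores, labels, k=1)  # the first two images miss
print(top5, top1)
```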
Beyond classification
• Detection
• Segmentation
• Regression
• Pose estimation
• Matching patches
• Synthesis
and many more…
R-CNN: Regions with CNN features
• Trained on ImageNet classification
• Finetune CNN on PASCAL
RCNN [Girshick et al. CVPR 2014]
Fast R-CNN
Fast R-CNN [Girshick 2015]
https://github.com/rbgirshick/fast-rcnn
Labeling Pixels: Semantic Labels
Fully Convolutional Networks for Semantic Segmentation [Long et al. CVPR 2015]
Labeling Pixels: Edge Detection
DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection
[Bertasius et al. CVPR 2015]
CNN for Regression
DeepPose [Toshev and Szegedy CVPR 2014]
CNN as a Similarity Measure for Matching
FaceNet [Schroff et al. 2015]
Stereo matching [Zbontar and LeCun CVPR 2015]
Compare patch [Zagoruyko and Komodakis 2015]
Match ground and aerial images [Lin et al. CVPR 2015]
FlowNet [Fischer et al. 2015]
CNN for Image Generation
Learning to Generate Chairs with Convolutional Neural Networks [Dosovitskiy et al. CVPR 2015]
Chair Morphing
Learning to Generate Chairs with Convolutional Neural Networks [Dosovitskiy et al. CVPR 2015]
Understanding and Visualizing CNN
• Find images that maximize some class scores
• Individual neuron activation
• Visualize input pattern using deconvnet
• Invert CNN features
• Breaking CNNs
Find images that maximize some class scores
person: HOG template
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
[Simonyan et al. ICLR Workshop 2014]
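The idea is to run gradient ascent on the input image rather than on the weights, maximizing a class score minus an L2 penalty on the image. The sketch below replaces the CNN with a toy linear score $S_c(x) = w_c \cdot x$, an assumption made so the result can be checked by hand; with a real network the gradient would come from back-propagation.

```python
import numpy as np

def maximize_class_score(w_c, lam=0.5, lr=0.1, steps=200):
    """Gradient ascent on x for the objective w_c . x - lam * ||x||^2."""
    x = np.zeros_like(w_c)  # start from a zero image
    for _ in range(steps):
        grad = w_c - 2.0 * lam * x  # d/dx of the regularized score
        x = x + lr * grad
    return x

w_c = np.array([1.0, -2.0, 0.5])  # hypothetical class weights
x_star = maximize_class_score(w_c)
print(x_star)  # approaches w_c / (2 * lam)
```

For a linear score the maximizer is just a scaled copy of the class weights, which is why such visualizations of linear classifiers look like templates.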
Individual Neuron Activation
RCNN [Girshick et al. CVPR 2014]
Visualizing the Input Pattern
• What input pattern originally caused a given
activation in the feature maps?
Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
Layer 1
Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
Layer 2
Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
Layer 3
Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
Layer 4 and 5
Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
Invert CNN features
• Reconstruct an image from CNN features
Understanding deep image representations by inverting them
[Mahendran and Vedaldi CVPR 2015]
CNN Reconstruction
Reconstruction from different layers
Multiple reconstructions
Understanding deep image representations by inverting them
[Mahendran and Vedaldi CVPR 2015]
Breaking CNNs
Intriguing properties of neural networks [Szegedy ICLR 2014]
What is going on?
$x \leftarrow x + \alpha \frac{\partial E}{\partial x}$
http://karpathy.github.io/2015/03/30/breaking-convnets/
Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]
What is going on?
• Recall gradient descent training: modify the
weights to reduce classifier error
• Adversarial examples: modify the image to
increase classifier error
http://karpathy.github.io/2015/03/30/breaking-convnets/
Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]
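Gradient descent on the weights: $w \leftarrow w - \alpha \frac{\partial E}{\partial w}$
Adversarial update on the image: $x \leftarrow x + \alpha \frac{\partial E}{\partial x}$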
Fooling a linear classifier
• Perceptron weight update: add a small
multiple of the example to the weight vector:
$w \leftarrow w + \alpha x$
• To fool a linear classifier, add a small multiple
of the weight vector to the training example:
$x \leftarrow x + \alpha w$
http://karpathy.github.io/2015/03/30/breaking-convnets/
Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]
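A small numeric sketch of the second bullet: moving $x$ along $w$ raises the score $w \cdot x + b$ by $\alpha \|w\|^2$, so a modest $\alpha$ can flip the predicted class. The weights, input, and $\alpha$ below are made-up values.

```python
import numpy as np

def linear_score(x, w, b):
    """Score of a linear classifier; its sign is the predicted class."""
    return float(w @ x + b)

def fool_linear(x, w, alpha):
    """Adversarial step for a linear classifier: x <- x + alpha * w."""
    return x + alpha * w

w = np.array([3.0, 4.0])    # hypothetical weights, ||w||^2 = 25
b = 0.0
x = np.array([-1.0, -1.0])  # classified negative: score = -7

x_adv = fool_linear(x, w, alpha=0.3)
print(linear_score(x, w, b), linear_score(x_adv, w, b))  # -7.0 -> 0.5
```

In a high-dimensional image $\|w\|^2$ sums over many pixels, so the per-pixel change needed to flip the score can be very small.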
Fooling a linear classifier
http://karpathy.github.io/2015/03/30/breaking-convnets/
Breaking CNNs
Deep Neural Networks are Easily Fooled: High Confidence Predictions for
Unrecognizable Images [Nguyen et al. CVPR 2015]
Images that both CNN and Human can recognize
Deep Neural Networks are Easily Fooled: High Confidence Predictions for
Unrecognizable Images [Nguyen et al. CVPR 2015]
Direct Encoding
Deep Neural Networks are Easily Fooled: High Confidence Predictions for
Unrecognizable Images [Nguyen et al. CVPR 2015]
Indirect Encoding
Deep Neural Networks are Easily Fooled: High Confidence Predictions for
Unrecognizable Images [Nguyen et al. CVPR 2015]
Take a break…
Image source: http://mehimandthecats.com/feline-care-guide/
Training Convolutional Neural Networks
• Backpropagation + stochastic gradient descent
with momentum
– Neural Networks: Tricks of the Trade
• Dropout
• Data augmentation
• Batch normalization
• Initialization
– Transfer learning
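The momentum update from the first bullet can be sketched as below. To keep the example checkable, it uses the exact gradient of a toy quadratic rather than a mini-batch gradient; in real training `grad_fn` would return a noisy mini-batch estimate obtained by back-propagation. The learning rate and momentum values are common defaults, not values from the slides.

```python
import numpy as np

def sgd_momentum(grad_fn, w0, lr=0.1, momentum=0.9, steps=500):
    """Gradient descent with a velocity term that accumulates past gradients."""
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = momentum * v - lr * grad_fn(w)  # update the velocity
        w = w + v                           # move along the velocity
    return w

target = np.array([2.0, -1.0])
grad_fn = lambda w: w - target  # gradient of 0.5 * ||w - target||^2
w_final = sgd_momentum(grad_fn, w0=[0.0, 0.0])
print(w_final)
```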
Dropout
Dropout: A simple way to prevent neural networks from overfitting [Srivastava JMLR 2014]
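A minimal sketch of dropout on one layer's activations. It uses the "inverted" variant, which rescales the kept units during training so that nothing changes at test time; the cited paper instead describes scaling the weights at test time. The drop probability of 0.5 is an assumed default.

```python
import numpy as np

def dropout(x, p_drop=0.5, train=True, rng=None):
    """Inverted dropout: zero units at random and rescale the survivors."""
    if not train or p_drop == 0.0:
        return x  # test time: use all units unchanged
    rng = np.random.default_rng() if rng is None else rng
    keep_mask = rng.random(x.shape) >= p_drop
    return x * keep_mask / (1.0 - p_drop)

activations = np.ones(1000)
dropped = dropout(activations, p_drop=0.5, train=True, rng=np.random.default_rng(0))
print(np.unique(dropped), dropped.mean())
```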
Data Augmentation (Jittering)
• Create virtual training
samples
– Horizontal flip
– Random crop
– Color casting
– Geometric distortion
Deep Image [Wu et al. 2015]
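Two of the listed transformations, random crop and horizontal flip, can be sketched with array slicing. The 256x256 source size and 224x224 crop size are assumptions for illustration; color casting and geometric distortion are not covered here.

```python
import numpy as np

def random_crop_and_flip(image, crop_h, crop_w, rng):
    """Return a random crop of an H x W x C image, flipped with probability 0.5."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    crop = image[top:top + crop_h, left:left + crop_w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # reverse the column order: horizontal flip
    return crop

rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))
samples = [random_crop_and_flip(image, 224, 224, rng) for _ in range(4)]
print([s.shape for s in samples])
```

Each call yields a different "virtual" training sample from the same source image.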
Parametric Rectified Linear Unit
Delving Deep into Rectifiers: Surpassing Human-Level Performance on
ImageNet Classification [He et al. 2015]
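The parametric ReLU keeps positive inputs unchanged and scales negative inputs by a learned slope $a$: $f(x) = \max(0, x) + a \min(0, x)$. A sketch with a single shared slope follows; the value 0.25 is only an example.

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for x > 0, slope a for x < 0."""
    return np.maximum(0.0, x) + a * np.minimum(0.0, x)

def prelu_grad_a(x, upstream):
    """Gradient of the loss with respect to a: only negative inputs contribute."""
    return np.sum(upstream * np.minimum(0.0, x))

x = np.array([-2.0, 0.0, 3.0])
print(prelu(x, a=0.25))                 # [-0.5  0.   3. ]
print(prelu_grad_a(x, np.ones_like(x)))  # -2.0
```

With $a = 0$ this reduces to the ordinary ReLU.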
Batch Normalization
Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift [Ioffe and Szegedy 2015]
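A sketch of the training-time forward pass of batch normalization for a batch of feature vectors: normalize each feature with the mini-batch mean and variance, then apply a learned scale `gamma` and shift `beta`. At test time, running averages of the statistics are used instead, which this sketch omits.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=5.0, size=(64, 3))  # batch of 64, 3 features
out = batch_norm_train(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0), out.std(axis=0))
```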
Transfer Learning
• Improvement of learning in a new task through the
transfer of knowledge from a related task that has
already been learned.
• Weight initialization for CNN
Learning and Transferring Mid-Level Image Representations using
Convolutional Neural Networks [Oquab et al. CVPR 2014]
Convolutional activation features
[Donahue et al. ICML 2013]
CNN Features off-the-shelf:
an Astounding Baseline for Recognition
[Razavian et al. 2014]
How transferable are features in CNN?
How transferable are features in deep
neural networks [Yosinski NIPS 2014]
Deep Neural Networks Rival the Representation
of Primate Inferior Temporal Cortex
Deep Neural Networks Rival the Representation of Primate IT Cortex for
Core Visual Object Recognition [Cadieu et al. PLOS 2014]
Deep Rendering Model (DRM)
A Probabilistic Theory of Deep Learning [Patel, Nguyen, and Baraniuk 2015]
CNN as a Max-Sum Inference
A Probabilistic Theory of Deep Learning [Patel, Nguyen, and Baraniuk 2015]
Model
Inference
Learning
Tools
• Caffe
• cuda-convnet2
• Torch
• MatConvNet
• Pylearn2
Resources
• http://deeplearning.net/
• https://github.com/ChristosChristofidis/awesome-deep-learning
Things to remember
• Overview
– Neuroscience, Perceptron, multi-layer neural networks
• Convolutional neural network (CNN)
– Convolution, nonlinearity, max pooling
– CNN for classification and beyond
• Understanding and visualizing CNN
– Find images that maximize some class scores;
visualize individual neuron activation, input pattern and
images; breaking CNNs
• Training CNN
– Dropout, data augmentation; batch normalization; transfer
learning
• Probabilistic interpretation
– Deep rendering model; CNN forward-propagation as
max-sum inference; training as an EM algorithm

More Related Content

What's hot

Object Detection & Tracking
Object Detection & TrackingObject Detection & Tracking
Object Detection & TrackingAkshay Gujarathi
 
Object tracking presentation
Object tracking  presentationObject tracking  presentation
Object tracking presentationMrsShwetaBanait1
 
Real Time Object Tracking
Real Time Object TrackingReal Time Object Tracking
Real Time Object TrackingVanya Valindria
 
Transfer Learning: An overview
Transfer Learning: An overviewTransfer Learning: An overview
Transfer Learning: An overviewjins0618
 
Image segmentation
Image segmentationImage segmentation
Image segmentationDeepak Kumar
 
Multiple object detection
Multiple object detectionMultiple object detection
Multiple object detectionSAURABH KUMAR
 
Computer vision introduction
Computer vision  introduction Computer vision  introduction
Computer vision introduction Wael Badawy
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnnSumeraHangi
 
Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition Intel Nervana
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionBrodmann17
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaPreferred Networks
 
Object tracking survey
Object tracking surveyObject tracking survey
Object tracking surveyRich Nguyen
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnnDebarko De
 
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object DetectionYou Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object DetectionDADAJONJURAKUZIEV
 
Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...
Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...
Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...Analytics India Magazine
 
A Deep Journey into Super-resolution
A Deep Journey into Super-resolutionA Deep Journey into Super-resolution
A Deep Journey into Super-resolutionRonak Mehta
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural NetworksPyData
 

What's hot (20)

Object Detection & Tracking
Object Detection & TrackingObject Detection & Tracking
Object Detection & Tracking
 
Object tracking
Object trackingObject tracking
Object tracking
 
Object tracking presentation
Object tracking  presentationObject tracking  presentation
Object tracking presentation
 
Real Time Object Tracking
Real Time Object TrackingReal Time Object Tracking
Real Time Object Tracking
 
Transfer Learning: An overview
Transfer Learning: An overviewTransfer Learning: An overview
Transfer Learning: An overview
 
Image segmentation
Image segmentationImage segmentation
Image segmentation
 
Multiple object detection
Multiple object detectionMultiple object detection
Multiple object detection
 
Computer vision introduction
Computer vision  introduction Computer vision  introduction
Computer vision introduction
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnn
 
Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition
 
Computer Vision
Computer VisionComputer Vision
Computer Vision
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi Kerola
 
Object tracking survey
Object tracking surveyObject tracking survey
Object tracking survey
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnn
 
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object DetectionYou Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object Detection
 
Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...
Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...
Explainable deep learning with applications in Healthcare By Sunil Kumar Vupp...
 
PPT s08-machine vision-s2
PPT s08-machine vision-s2PPT s08-machine vision-s2
PPT s08-machine vision-s2
 
A Deep Journey into Super-resolution
A Deep Journey into Super-resolutionA Deep Journey into Super-resolution
A Deep Journey into Super-resolution
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural Networks
 

Similar to Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015

Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...Simone Ercoli
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceParn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceNAVER Engineering
 
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)Saimunur Rahman
 
TIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdfTIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdfBoahKim2
 
imageclassification-160206090009.pdf
imageclassification-160206090009.pdfimageclassification-160206090009.pdf
imageclassification-160206090009.pdfKammetaJoshna
 
Lecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptxLecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptxGauravGautam216125
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용홍배 김
 
NetVLAD: CNN architecture for weakly supervised place recognition
NetVLAD:  CNN architecture for weakly supervised place recognitionNetVLAD:  CNN architecture for weakly supervised place recognition
NetVLAD: CNN architecture for weakly supervised place recognitionGeunhee Cho
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural NetworksYogendra Tamang
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsBenjamin Le
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding捷恩 蔡
 
Introduction to computer vision
Introduction to computer visionIntroduction to computer vision
Introduction to computer visionMarcin Jedyk
 
Introduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksIntroduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksMarcinJedyk
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 

Similar to Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015 (20)

Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)
Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)
Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)
 
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceParn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
 
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
 
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
 
TIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdfTIP_TAViT_presentation.pdf
TIP_TAViT_presentation.pdf
 
imageclassification-160206090009.pdf
imageclassification-160206090009.pdfimageclassification-160206090009.pdf
imageclassification-160206090009.pdf
 
Lecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptxLecture_04_Supervised_Pretraining.pptx
Lecture_04_Supervised_Pretraining.pptx
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용Convolutional neural networks 이론과 응용
Convolutional neural networks 이론과 응용
 
NetVLAD: CNN architecture for weakly supervised place recognition
NetVLAD:  CNN architecture for weakly supervised place recognitionNetVLAD:  CNN architecture for weakly supervised place recognition
NetVLAD: CNN architecture for weakly supervised place recognition
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding
 
Introduction to computer vision
Introduction to computer visionIntroduction to computer vision
Introduction to computer vision
 
Fa19_P1.pptx
Fa19_P1.pptxFa19_P1.pptx
Fa19_P1.pptx
 
Introduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksIntroduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural Networks
 
lecture_16_jiajun.pdf
lecture_16_jiajun.pdflecture_16_jiajun.pdf
lecture_16_jiajun.pdf
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 

More from Jia-Bin Huang

How to write a clear paper
How to write a clear paperHow to write a clear paper
How to write a clear paperJia-Bin Huang
 
Research 101 - Paper Writing with LaTeX
Research 101 - Paper Writing with LaTeXResearch 101 - Paper Writing with LaTeX
Research 101 - Paper Writing with LaTeXJia-Bin Huang
 
Computer Vision Crash Course
Computer Vision Crash CourseComputer Vision Crash Course
Computer Vision Crash CourseJia-Bin Huang
 
Applying for Graduate School in S.T.E.M.
Applying for Graduate School in S.T.E.M.Applying for Graduate School in S.T.E.M.
Applying for Graduate School in S.T.E.M.Jia-Bin Huang
 
Linear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialLinear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialJia-Bin Huang
 
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)Jia-Bin Huang
 
Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015Jia-Bin Huang
 
Writing Fast MATLAB Code
Writing Fast MATLAB CodeWriting Fast MATLAB Code
Writing Fast MATLAB CodeJia-Bin Huang
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Jia-Bin Huang
 
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...Jia-Bin Huang
 
Real-time Face Detection and Recognition
Real-time Face Detection and RecognitionReal-time Face Detection and Recognition
Real-time Face Detection and RecognitionJia-Bin Huang
 
Transformation Guided Image Completion ICCP 2013
Transformation Guided Image Completion ICCP 2013Transformation Guided Image Completion ICCP 2013
Transformation Guided Image Completion ICCP 2013Jia-Bin Huang
 
Jia-Bin Huang's Curriculum Vitae
Jia-Bin Huang's Curriculum VitaeJia-Bin Huang's Curriculum Vitae
Jia-Bin Huang's Curriculum VitaeJia-Bin Huang
 
Pose aware online visual tracking
Pose aware online visual trackingPose aware online visual tracking
Pose aware online visual trackingJia-Bin Huang
 
Face Expression Enhancement
Face Expression EnhancementFace Expression Enhancement
Face Expression EnhancementJia-Bin Huang
 
Image Smoothing for Structure Extraction
Image Smoothing for Structure ExtractionImage Smoothing for Structure Extraction
Image Smoothing for Structure ExtractionJia-Bin Huang
 
Three Reasons to Join FVE at uiuc
Three Reasons to Join FVE at uiucThree Reasons to Join FVE at uiuc
Three Reasons to Join FVE at uiucJia-Bin Huang
 
Static and Dynamic Hand Gesture Recognition
Static and Dynamic Hand Gesture RecognitionStatic and Dynamic Hand Gesture Recognition
Static and Dynamic Hand Gesture RecognitionJia-Bin Huang
 
Real-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionReal-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionJia-Bin Huang
 
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Jia-Bin Huang
 

Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015

  • 1. Convolutional Neural Networks Computer Vision CS 543 / ECE 549 University of Illinois Jia-Bin Huang 05/05/15
  • 2. Reminder: final project • Posters on Friday, May 8 at 7pm in SC 2405 – two rounds, 1.25 hr each • Papers due May 11 by email • Cannot accept late papers/posters due to grading deadlines • Send Derek an email if you can’t present your poster
  • 3. Today’s class • Overview • Convolutional Neural Network (CNN) • Understanding and Visualizing CNN • Training CNN • Probabilistic Interpretation
  • 4. Image Categorization: Training phase Training Labels Training Images Classifier Training Training Image Features Trained Classifier
  • 5. Image Categorization: Testing phase Training Labels Training Images Classifier Training Training Image Features Trained Classifier Image Features Testing Test Image Outdoor Prediction Trained Classifier
  • 6. Features are the Keys SIFT [Lowe IJCV 04] HOG [Dalal and Triggs CVPR 05] SPM [Lazebnik et al. CVPR 06] DPM [Felzenszwalb et al. PAMI 10] Color Descriptor [Van De Sande et al. PAMI 10]
  • 7. Learning a Hierarchy of Feature Extractors • Each layer of the hierarchy extracts features from the output of the previous layer • All the way from pixels → classifier • Layers have (nearly) the same structure Image/Video Pixels → Layer 1 → Layer 2 → Layer 3 → Simple Classifier → Image/Video Labels
  • 8. Engineered vs. learned features Image Feature extraction Pooling Classifier Label Image Convolution/pool Convolution/pool Convolution/pool Convolution/pool Convolution/pool Dense Dense Dense Label
  • 9. Biological neuron and Perceptrons A biological neuron An artificial neuron (Perceptron) - a linear classifier
  • 10. Simple, Complex and Hypercomplex cells David H. Hubel and Torsten Wiesel David Hubel's Eye, Brain, and Vision Suggested a hierarchy of feature detectors in the visual cortex, with higher level features responding to patterns of activation in lower level cells, and propagating activation upwards to still higher level cells.
  • 11. Hubel/Wiesel Architecture and Multi-layer Neural Network Hubel and Wiesel’s architecture Multi-layer Neural Network - A non-linear classifier
  • 12. Multi-layer Neural Network • A non-linear classifier • Training: find network weights $\mathbf{w}$ to minimize the error between true training labels $y_i$ and estimated labels $f_{\mathbf{w}}(\mathbf{x}_i)$ • Minimization can be done by gradient descent provided $f$ is differentiable • This training method is called back-propagation
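
The training procedure on this slide can be sketched in a few lines of NumPy: a network with one hidden layer is fitted by gradient descent, with gradients computed by back-propagation. The XOR toy data, the hidden-layer size, the learning rate, and the names `train_mlp` and `sigmoid` are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, y, hidden=8, lr=0.5, steps=3000, seed=0):
    """Fit a one-hidden-layer network by full-batch gradient descent (toy sketch)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(steps):
        # Forward pass: f_w(x) = sigmoid(W2^T tanh(W1^T x + b1) + b2)
        h = np.tanh(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)[:, 0]
        losses.append(np.mean((p - y) ** 2))
        # Backward pass: chain rule from the squared error down to each weight
        dp = 2.0 * (p - y) / len(y)
        dz2 = (dp * p * (1.0 - p))[:, None]
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0)
        dh = dz2 @ W2.T
        dz1 = dh * (1.0 - h ** 2)
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)
        # Gradient descent step
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1, W2, b2), losses

# XOR is not linearly separable, so a single perceptron cannot fit it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
params, losses = train_mlp(X, y)
```

Comparing `losses[0]` with `losses[-1]` shows whether the error went down over the run.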
  • 13. Convolutional Neural Networks (CNN, ConvNet, DCN) • CNN = a multi-layer neural network with – Local connectivity – Shared weight parameters across spatial positions • One activation map (a depth slice) is computed with one set of weights Image credit: A. Karpathy
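
To see why local connectivity and weight sharing matter, the sketch below counts parameters for a fully connected layer and for a convolutional layer that produce the same number of outputs. The layer sizes (a 32x32x3 input, six 5x5 filters, 28x28 output maps) are hypothetical values chosen for illustration, not figures from the slides.

```python
def fully_connected_params(in_h, in_w, in_c, out_units):
    """Every output unit has its own weight for every input value, plus a bias."""
    return in_h * in_w * in_c * out_units + out_units

def conv_params(kernel, in_c, num_filters):
    """Each filter is a small kernel x kernel x in_c window reused at all positions, plus a bias."""
    return kernel * kernel * in_c * num_filters + num_filters

# Hypothetical layer: 32x32x3 input mapped to six 28x28 activation maps
fc = fully_connected_params(32, 32, 3, 28 * 28 * 6)
conv = conv_params(5, 3, 6)
print(fc, conv)
```

The convolutional layer needs only a few hundred parameters where the fully connected one needs millions, because one set of weights is reused for every spatial position.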
  • 15. LeNet [LeCun et al. 1998] Gradient-based learning applied to document recognition [LeCun, Bottou, Bengio, Haffner 1998]
  • 16. What is a Convolution? • Weighted moving sum Input Feature Activation Map . . . slide credit: S. Lazebnik
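
A minimal NumPy sketch of the weighted moving sum: the kernel slides over the input and each output value is the sum of element-wise products in the current window. As in most CNN libraries the kernel is not flipped, so strictly this is cross-correlation. The function name `conv2d_valid` and the toy input are assumptions made for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Weighted moving sum over all positions where the kernel fits entirely."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # pixel (i, j) has value 4*i + j
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])                   # difference along the diagonal
activation_map = conv2d_valid(image, kernel)
```

Each output here is `image[i, j] - image[i + 1, j + 1]`, so on this ramp image the activation map is constant.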
  • 18. Input Image Convolution (Learned) Non-linearity Spatial pooling Normalization Feature maps Input Feature Map . . . Convolutional Neural Networks slide credit: S. Lazebnik
  • 19. Input Image Convolution (Learned) Non-linearity Spatial pooling Normalization Feature maps Convolutional Neural Networks Rectified Linear Unit (ReLU) slide credit: S. Lazebnik
  • 20. Input Image Convolution (Learned) Non-linearity Spatial pooling Normalization Feature maps Max pooling Convolutional Neural Networks slide credit: S. Lazebnik
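
The non-linearity and spatial pooling stages of this pipeline can be sketched as follows, assuming ReLU as the non-linearity and non-overlapping 2x2 max pooling. The feature-map values and the function names are made up for illustration.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fill a window are dropped."""
    H, W = x.shape
    H2, W2 = H // size, W // size
    x = x[:H2 * size, :W2 * size]
    # Split into (H2, size, W2, size) blocks and take the max inside each block
    return x.reshape(H2, size, W2, size).max(axis=(1, 3))

fmap = np.array([[ 1., -2.,  3.,  0.],
                 [-1.,  5., -3.,  2.],
                 [ 0., -4., -1., -2.],
                 [ 2.,  1., -6., -5.]])
pooled = max_pool2d(relu(fmap))
```

Pooling halves each spatial dimension here and keeps only the strongest response in every 2x2 neighbourhood.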
  • 21. Input Image Convolution (Learned) Non-linearity Spatial pooling Normalization Feature maps Feature Maps Feature Maps After Contrast Normalization Convolutional Neural Networks slide credit: S. Lazebnik
  • 22. Input Image Convolution (Learned) Non-linearity Spatial pooling Normalization Feature maps Convolutional filters are trained in a supervised manner by back-propagating classification error Convolutional Neural Networks slide credit: S. Lazebnik
  • 23. SIFT Descriptor Image Pixels Apply gradient filters Spatial pool (Sum) Normalize to unit length Feature Vector Lowe [IJCV 2004]
  • 24. SIFT Descriptor Image Pixels Apply oriented filters Spatial pool (Sum) Normalize to unit length Feature Vector Lowe [IJCV 2004] slide credit: R. Fergus
  • 25. Spatial Pyramid Matching SIFT Features Filter with Visual Words Multi-scale spatial pool (Sum) Max Classifier Lazebnik, Schmid, Ponce [CVPR 2006] slide credit: R. Fergus
  • 26. Deformable Part Model Deformable Part Models are Convolutional Neural Networks [Girshick et al. CVPR 15]
  • 27. AlexNet • Similar framework to LeCun’98 but: • Bigger model (7 hidden layers, 650,000 units, 60,000,000 params) • More data ($10^6$ vs. $10^3$ images) • GPU implementation (50x speedup over CPU) • Trained on two GPUs for a week A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
  • 28. Using CNN for Image Classification AlexNet Fully connected layer Fc7 d = 4096 d = 4096 Averaging Softmax Layer “Jia-Bin” Fixed input size: 224x224x3
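
The softmax layer at the end of the network turns class scores into a probability distribution. Below is a small, numerically stable sketch; the three-class score vector is a made-up example, and the averaging step shown on the slide is not modelled here.

```python
import numpy as np

def softmax(scores):
    """Map scores to probabilities; subtracting the max avoids overflow in exp."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1])   # hypothetical outputs of the last fully connected layer
probs = softmax(scores)
predicted_class = int(np.argmax(probs))
```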
  • 29. ImageNet Challenge 2012-2014
      Team | Year | Place | Error (top-5) | External data
      SuperVision – Toronto (7 layers) | 2012 | - | 16.4% | no
      SuperVision | 2012 | 1st | 15.3% | ImageNet 22k
      Clarifai – NYU (7 layers) | 2013 | - | 11.7% | no
      Clarifai | 2013 | 1st | 11.2% | ImageNet 22k
      VGG – Oxford (16 layers) | 2014 | 2nd | 7.32% | no
      GoogLeNet (19 layers) | 2014 | 1st | 6.67% | no
      Human expert* | | | 5.1% |
      Team | Method | Error (top-5)
      DeepImage - Baidu | Data augmentation + multi GPU | 5.33%
      PReLU-nets - MSRA | Parametric ReLU + smart initialization | 4.94%
      BN-Inception ensemble - Google | Reducing internal covariate shift | 4.82%
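
The top-5 error reported for the ImageNet challenge counts a prediction as correct when the true label is among the five highest-scoring classes. A minimal sketch of that metric follows; the score matrix and labels are toy values, and `top_k_error` is a name chosen here.

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k highest-scoring classes."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    hit = (topk == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()

# Four toy examples with seven classes each
scores = np.array([
    [0.1, 0.9, 0.3, 0.2, 0.5, 0.4, 0.0],   # label 1 has the highest score
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.0],   # label 6 has the lowest score
    [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6],   # label 2 has the fifth-highest score
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0],   # label 5 has the sixth-highest score
])
labels = np.array([1, 6, 2, 5])
top5 = top_k_error(scores, labels, k=5)
top1 = top_k_error(scores, labels, k=1)
```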
  • 30. Beyond classification • Detection • Segmentation • Regression • Pose estimation • Matching patches • Synthesis and many more…
  • 31. R-CNN: Regions with CNN features • Trained on ImageNet classification • Finetune CNN on PASCAL RCNN [Girshick et al. CVPR 2014]
  • 32. Fast R-CNN Fast RCNN [Girshick, R 2015] https://github.com/rbgirshick/fast-rcnn
  • 33. Labeling Pixels: Semantic Labels Fully Convolutional Networks for Semantic Segmentation [Long et al. CVPR 2015]
  • 34. Labeling Pixels: Edge Detection DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection [Bertasius et al. CVPR 2015]
  • 35. CNN for Regression DeepPose [Toshev and Szegedy CVPR 2014]
  • 36. CNN as a Similarity Measure for Matching FaceNet [Schroff et al. 2015] Stereo matching [Zbontar and LeCun CVPR 2015] Compare patch [Zagoruyko and Komodakis 2015] Match ground and aerial images [Lin et al. CVPR 2015] FlowNet [Fischer et al. 2015]
  • 37. CNN for Image Generation Learning to Generate Chairs with Convolutional Neural Networks [Dosovitskiy et al. CVPR 2015]
  • 38. Chair Morphing Learning to Generate Chairs with Convolutional Neural Networks [Dosovitskiy et al. CVPR 2015]
  • 39. Understanding and Visualizing CNN • Find images that maximize some class scores • Individual neuron activation • Visualize input pattern using deconvnet • Invert CNN features • Breaking CNNs
  • 40. Find images that maximize some class scores person: HOG template Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps [Simonyan et al. ICLR Workshop 2014]
  • 41. Individual Neuron Activation RCNN [Girshick et al. CVPR 2014]
  • 42. Individual Neuron Activation RCNN [Girshick et al. CVPR 2014]
  • 43. Individual Neuron Activation RCNN [Girshick et al. CVPR 2014]
  • 44. Visualizing the Input Pattern • What input pattern originally caused a given activation in the feature maps? Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
  • 45. Layer 1 Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
  • 46. Layer 2 Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
  • 47. Layer 3 Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
  • 48. Layer 4 and 5 Visualizing and Understanding Convolutional Networks [Zeiler and Fergus, ECCV 2014]
  • 49. Invert CNN features • Reconstruct an image from CNN features Understanding deep image representations by inverting them [Mahendran and Vedaldi CVPR 2015]
  • 50. CNN Reconstruction Reconstruction from different layers Multiple reconstructions Understanding deep image representations by inverting them [Mahendran and Vedaldi CVPR 2015]
  • 51. Breaking CNNs Intriguing properties of neural networks [Szegedy ICLR 2014]
  • 52. What is going on? $x$, $\frac{\partial E}{\partial x}$, $x \leftarrow x + \alpha \frac{\partial E}{\partial x}$ http://karpathy.github.io/2015/03/30/breaking-convnets/ Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]
  • 53. What is going on? • Recall gradient descent training: modify the weights to reduce classifier error: $w \leftarrow w - \alpha \frac{\partial E}{\partial w}$ • Adversarial examples: modify the image to increase classifier error: $x \leftarrow x + \alpha \frac{\partial E}{\partial x}$ http://karpathy.github.io/2015/03/30/breaking-convnets/ Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]
  • 54. Fooling a linear classifier • Perceptron weight update: add a small multiple of the example to the weight vector: $w \leftarrow w + \alpha x$ • To fool a linear classifier, add a small multiple of the weight vector to the training example: $x \leftarrow x + \alpha w$ http://karpathy.github.io/2015/03/30/breaking-convnets/ Explaining and Harnessing Adversarial Examples [Goodfellow ICLR 2015]
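
The effect of adding a small multiple of the weight vector to an input can be checked numerically. In this sketch the weights, the input, and the step size `alpha` are made-up values; the point is only that the score moves by `alpha` times the squared norm of the weights, which is enough to flip the predicted class.

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5])   # hypothetical weights of a trained linear classifier
b = 0.0

def predict(x):
    """Sign of the linear score w.x + b."""
    return 1 if w @ x + b > 0 else -1

x = np.array([0.5, 0.5, 0.2])    # original example, score = -0.4
alpha = 0.2
x_adv = x + alpha * w            # score rises by alpha * ||w||^2 = 1.05

print(predict(x), predict(x_adv))
```

In high-dimensional inputs such as images, the same score change can be spread over many pixels, so each pixel only needs to move slightly.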
  • 55. Fooling a linear classifier http://karpathy.github.io/2015/03/30/breaking-convnets/
  • 56. Breaking CNNs Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images [Nguyen et al. CVPR 2015]
  • 57. Images that both CNN and Human can recognize Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images [Nguyen et al. CVPR 2015]
  • 58. Direct Encoding Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images [Nguyen et al. CVPR 2015]
  • 59. Indirect Encoding Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images [Nguyen et al. CVPR 2015]
  • 60. Take a break… Image source: http://mehimandthecats.com/feline-care-guide/
  • 61. Training Convolutional Neural Networks • Backpropagation + stochastic gradient descent with momentum – Neural Networks: Tricks of the Trade • Dropout • Data augmentation • Batch normalization • Initialization – Transfer learning
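
A sketch of the classical momentum update used with stochastic gradient descent: a velocity accumulates past gradients and the weights move along it. To keep the example deterministic, `grad_fn` returns the exact gradient of a small quadratic; in real SGD it would return the gradient on a mini-batch. The objective, learning rate, and momentum value are assumptions made for illustration.

```python
import numpy as np

def sgd_momentum(grad_fn, w0, lr=0.05, momentum=0.9, steps=500):
    """Classical momentum: v <- momentum * v - lr * g, then w <- w + v."""
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)            # a mini-batch gradient in real SGD
        v = momentum * v - lr * g
        w = w + v
    return w

# Toy objective E(w) = 0.5 * w^T A w with a poorly conditioned A; its gradient is A w
A = np.diag([1.0, 10.0])
w_final = sgd_momentum(lambda w: A @ w, w0=[5.0, 5.0])
```

The minimum of this toy objective is at the origin, so the norm of `w_final` indicates how far the run converged.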
  • 62. Dropout Dropout: A simple way to prevent neural networks from overfitting [Srivastava JMLR 2014]
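
A minimal sketch of dropout on a layer's activations. It uses the "inverted" variant, which rescales the surviving units during training so that nothing changes at test time; the cited paper instead scales the weights at test time. The function signature and drop probability are assumptions for illustration.

```python
import numpy as np

def dropout(x, p_drop=0.5, train=True, rng=None):
    """Inverted dropout: zero each unit with probability p_drop, rescale the rest."""
    if not train or p_drop == 0.0:
        return x                                  # identity at test time
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) >= p_drop          # True for units that are kept
    return x * mask / (1.0 - p_drop)

activations = np.ones(1000)
dropped = dropout(activations, p_drop=0.5, rng=np.random.default_rng(0))
```

Because of the rescaling, the expected value of each activation is the same with and without dropout.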
  • 63. Data Augmentation (Jittering) • Create virtual training samples – Horizontal flip – Random crop – Color casting – Geometric distortion Deep Image [Wu et al. 2015]
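
Two of the jittering operations listed here, random crop and horizontal flip, can be sketched with plain array slicing. The 256x256 source size, the 224x224 crop, and the 50% flip probability are assumed values for illustration; color casting and geometric distortion are not covered.

```python
import numpy as np

def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a uniformly random position."""
    H, W = img.shape[:2]
    top = rng.integers(0, H - crop_h + 1)
    left = rng.integers(0, W - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

def augment(img, crop_h, crop_w, rng):
    """Create one virtual training sample from an H x W x C image."""
    out = random_crop(img, crop_h, crop_w, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out

rng = np.random.default_rng(0)
image = np.arange(256 * 256 * 3).reshape(256, 256, 3)
sample = augment(image, 224, 224, rng)
```

Calling `augment` repeatedly on the same image yields different training samples that all keep the original label.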
  • 64. Parametric Rectified Linear Unit Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification [He et al. 2015]
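
The parametric ReLU keeps positive inputs unchanged and scales negative inputs by a slope `a` that is learned together with the weights. The sketch below shows the forward function and the gradient with respect to `a`; the input values, the slope 0.25, and the function names are illustrative.

```python
import numpy as np

def prelu(x, a):
    """f(x) = x for x > 0 and a * x otherwise; a = 0 recovers the plain ReLU."""
    return np.maximum(x, 0.0) + a * np.minimum(x, 0.0)

def prelu_grad_a(x, upstream):
    """Gradient of the loss with respect to the shared slope a."""
    return np.sum(upstream * np.minimum(x, 0.0))

x = np.array([-2.0, -0.5, 0.0, 3.0])
y = prelu(x, a=0.25)
grad_a = prelu_grad_a(x, upstream=np.ones_like(x))
```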
  • 65. Batch Normalization Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [Ioffe and Szegedy 2015]
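
A sketch of the batch normalization transform in training mode: each feature is standardized with the mean and variance of the current mini-batch, then scaled and shifted by learned parameters `gamma` and `beta`. At inference time, running averages of the statistics are normally used instead, which this sketch leaves out. The batch size and feature count are made-up values.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize each column of x (batch x features) using mini-batch statistics."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(64, 4))   # activations with shifted statistics
normalized = batch_norm_train(batch, gamma=np.ones(4), beta=np.zeros(4))
```

With `gamma = 1` and `beta = 0`, every feature of the output has approximately zero mean and unit variance regardless of the input scale.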
  • 66. Transfer Learning • Improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned. • Weight initialization for CNN Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks [Oquab et al. CVPR 2014]
  • 67. Convolutional activation features [Donahue et al. ICML 2013] CNN Features off-the-shelf: an Astounding Baseline for Recognition [Razavian et al. 2014]
  • 68. How transferable are features in CNN? How transferable are features in deep neural networks [Yosinski NIPS 2014]
  • 69. Deep Neural Networks Rival the Representation of Primate Inferior Temporal Cortex Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition [Cadieu et al. PLOS 2014]
  • 70. Deep Neural Networks Rival the Representation of Primate Inferior Temporal Cortex Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition [Cadieu et al. PLOS 2014]
  • 71. Deep Rendering Model (DRM) A Probabilistic Theory of Deep Learning [Patel, Nguyen, and Baraniuk 2015]
  • 72. CNN as a Max-Sum Inference A Probabilistic Theory of Deep Learning [Patel, Nguyen, and Baraniuk 2015]
  • 74. Tools • Caffe • cuda-convnet2 • Torch • MatConvNet • Pylearn2
  • 76. Things to remember • Overview – Neuroscience, Perceptron, multi-layer neural networks • Convolutional neural network (CNN) – Convolution, nonlinearity, max pooling – CNN for classification and beyond • Understanding and visualizing CNN – Find images that maximize some class scores; visualize individual neuron activation, input pattern and images; breaking CNNs • Training CNN – Dropout, data augmentation; batch normalization; transfer learning • Probabilistic interpretation – Deep rendering model; CNN forward-propagation as max-sum inference; training as an EM algorithm

Editor's Notes

  1. Afternoon. Image and region categorization. Use one image: what do you see? Can see different things, transfer knowledge. Can know a lot of things by seeing it is a dog. Type of categorization.
  2. At testing time, we are given a new image. We represent it in terms of image features and apply the trained model to obtain the prediction.
  3. At testing time, we are given a new image. We represent it in terms of image features and apply the trained model to obtain the prediction.
  4. Make clear that classic methods, e.g. convnets, are purely supervised.
  5. Two-GPU architecture
  6. Best non-convnet in 2012: 26.2%