Image Classification with Deep Learning | DevFest + GDay, George Town, Malaysia 2017

Ta Virot Chiraphadhanakul, PhD (@tvirot)
Google Developer Expert in Machine Learning  
Managing Director, Skooldio
Image Classification
with Deep Learning

Source: https://xkcd.com/1425/

Source: http://cs231n.github.io/classification/

Not anymore!
Source: https://xkcd.com/1425/

The Google self-driving car project
became Waymo with a mission to make
it easy and safe for people and things to
move around
Waymo
Video: Waymo

A deep learning algorithm capable of
interpreting signs of Diabetic
Retinopathy (DR) in retinal photographs.  
DR — an eye condition that affects
people with diabetes — is the fastest
growing cause of blindness, with nearly
415 million diabetic patients at risk
worldwide.
Detecting Diabetic
Eye Disease
Photo: Google Blog

An artificial intelligence trained to classify images
of skin lesions as benign lesions or malignant skin
cancers achieves the accuracy of board-certified
dermatologists. 
In this work, we pretrain a deep neural network at
general object recognition, then fine-tune it on a
dataset of ~130,000 skin lesion images comprised
of over 2000 diseases.
Identifying Skin
Cancer
Photos: Nature

"Farmers want to focus and spend their
time on growing delicious vegetables."
— Makoto Koike
Cucumber  
Sorter
Photos: Google Cloud Platform / Kaz Sato

Demo
https://deeplearnjs.org/demos/teachable_gaming/
https://teachablemachine.withgoogle.com/

01
Intro to Deep Learning
02
Convolutional Neural Network (CNN)
03
Transfer Learning

Diagram: the inputs x1, x2, and a constant 1 feed a summation node (Σ), which produces the output O.
A Perceptron

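Diagram: the summation node (Σ) computes the weighted sum w1x1 + w2x2 + b, where w1 and w2 are the weights on x1 and x2, and b is the bias (the weight on the constant input 1).
A Perceptron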

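Diagram: an activation function maps the weighted sum w1x1 + w2x2 + b to the output O. The plot shows the output switching between -1 and 1 at 0.
A Perceptron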

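Diagram: the inputs x1, x2, and 1, weighted by w1, w2, and b, feed the summation node (Σ) and the activation function to give the output O:
O = 1 if w1x1 + w2x2 + b > 0
O = -1 if w1x1 + w2x2 + b < 0
A Perceptron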
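The rule on this slide can be sketched in a few lines of Python. The function name and the default weights (w1 = w2 = 1, b = 0, which give the boundary x1 + x2 = 0) are choices made for this illustration. The slide does not say what happens when the weighted sum is exactly 0, so the sketch assigns that case to -1 as an assumption.

```python
def perceptron(x1, x2, w1=1.0, w2=1.0, b=0.0):
    """Return 1 or -1 from the sign of the weighted sum w1*x1 + w2*x2 + b."""
    weighted_sum = w1 * x1 + w2 * x2 + b
    # Assumption: the boundary case (weighted_sum == 0) is not defined on the
    # slide; this sketch arbitrarily assigns it to the -1 class.
    return 1 if weighted_sum > 0 else -1


print(perceptron(2.0, 1.0))    # weighted sum is 3.0, so the output is 1
print(perceptron(-2.0, 0.5))   # weighted sum is -1.5, so the output is -1
```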

Plot (axes x1, x2): the line x1 + x2 = 0 separates the region x1 + x2 > 0 from the region x1 + x2 < 0.

http://playground.tensorflow.org/

Other Activation Functions
Source: http://introtodeeplearning.com/ (Lecture 1)
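The slide does not list the functions by name. As an assumption, the sketch below shows three commonly used alternatives to the step function (sigmoid, tanh, and ReLU), each applied to a weighted sum z.

```python
import math


def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash z into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```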

Diagram: an input layer (x1, x2, x3), hidden layers (units h11, h12, h21, h22), and an output layer (O1).
Deep Neural Network
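A forward pass through a network of this shape could look like the sketch below. It assumes two hidden layers of two units each (h11, h12, then h21, h22), ReLU activations, and random weights. None of these details are stated on the slide, and all names are illustrative.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def forward(x, params):
    """Propagate a 3-value input through two hidden layers to one output."""
    h1 = relu(params["W1"] @ x + params["b1"])   # first hidden layer (2 units)
    h2 = relu(params["W2"] @ h1 + params["b2"])  # second hidden layer (2 units)
    return params["W3"] @ h2 + params["b3"]      # output layer (1 unit)


rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(size=(2, 3)), "b1": np.zeros(2),
    "W2": rng.normal(size=(2, 2)), "b2": np.zeros(2),
    "W3": rng.normal(size=(1, 2)), "b3": np.zeros(1),
}
x = np.array([0.5, -1.0, 2.0])
o1 = forward(x, params)
print(o1.shape)  # (1,)
```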

Plot (axes x1, x2): the circle x1^2 + x2^2 = 9 separates the region x1^2 + x2^2 > 9 from the region x1^2 + x2^2 < 9.
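A single perceptron on the raw inputs x1 and x2 can only draw a straight line, so it cannot produce this circular boundary. The sketch below hand-crafts the squared features that hidden layers would otherwise have to learn, then applies the same perceptron rule with weights 1, 1 and bias -9. The function name and the treatment of points exactly on the circle are assumptions.

```python
def circle_perceptron(x1, x2):
    """Return 1 outside the circle of radius 3 and -1 inside it."""
    z1, z2 = x1 ** 2, x2 ** 2              # hand-crafted non-linear features
    weighted_sum = 1.0 * z1 + 1.0 * z2 - 9.0
    # Assumption: points exactly on the circle are assigned to the -1 class.
    return 1 if weighted_sum > 0 else -1


print(circle_perceptron(0.0, 0.0))  # inside the circle
print(circle_perceptron(4.0, 0.0))  # outside the circle
```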

https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/

https://www.tensorflow.org/get_started/mnist/beginners
Red represents negative weights, while blue represents positive weights.
See more: scs.ryerson.ca/~aharley/vis/fc

Challenges
Doesn’t scale: e.g., a 300 x 300 RGB image would require 270,000 weights (300 x 300 x 3) for
each neuron in the first hidden layer of the neural network
Overfits easily
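The arithmetic behind the 270,000 figure is shown below. The hidden-layer width of 100 neurons is a hypothetical value chosen only to show how quickly the total grows.

```python
def weights_per_neuron(height, width, channels):
    """Each fully connected neuron needs one weight per input value."""
    return height * width * channels


per_neuron = weights_per_neuron(300, 300, 3)
print(per_neuron)  # 270000

hidden_neurons = 100  # hypothetical layer width, not from the slides
print(per_neuron * hidden_neurons)  # 27000000
```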

CNN
Convolutional Neural Network

• Apply filters to one small region at a time to detect certain
features of an object (edge, circle, or certain shapes)
• A feature of an object is translation invariant, so the filters
applied to each small region can share weights/parameters
• Fewer parameters required. More robust.
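To see how much weight sharing saves, the sketch below counts the parameters of a small convolutional layer. The filter size (3x3), input depth (3, for RGB), and filter count (5) are illustrative values, and the helper name is hypothetical.

```python
def conv_layer_params(filter_h, filter_w, in_channels, num_filters):
    """Each filter has filter_h * filter_w * in_channels weights plus one bias,
    and the same filter is reused at every position of the image."""
    return (filter_h * filter_w * in_channels + 1) * num_filters


print(conv_layer_params(3, 3, 3, 5))  # 140
```

The count does not depend on the image size, whereas a single fully connected neuron on a 300 x 300 RGB image already needs 270,000 weights.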

Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE,
vol. 86, no. 11, pp. 2278-2324, Nov 1998.

Visualizing and Understanding Convolutional Networks (Zeiler and Fergus, 2014)

1. Convolutional Layer
2. Pooling Layer
3. Fully Connected Layer
Layers in CNNs
Source: http://cs231n.github.io/convolutional-networks/
Architecture diagram: INPUT 32x32 → (convolutions) C1: feature maps 6@28x28 → (subsampling) S2: feature maps 6@14x14 → (convolutions) C3: feature maps 16@10x10 → (subsampling) S4: feature maps 16@5x5 → (full connection) C5: layer 120 → (full connection) F6: layer 84 → (Gaussian connections) OUTPUT 10
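The spatial sizes in this architecture (32, 28, 14, 10, 5) follow from simple arithmetic. The sketch below assumes 5x5 convolution filters with stride 1 and no padding, and 2x2 subsampling with stride 2. These values come from the LeCun et al. paper rather than from the slide itself, and the helper names are illustrative.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution over a square input."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2, stride=2):
    """Output width/height of a subsampling (pooling) step."""
    return (size - window) // stride + 1


size = 32
sizes = {"INPUT": size}
for name, kind in [("C1", "conv"), ("S2", "pool"), ("C3", "conv"), ("S4", "pool")]:
    if kind == "conv":
        size = conv_output_size(size, kernel=5)
    else:
        size = pool_output_size(size)
    sizes[name] = size

print(sizes)  # {'INPUT': 32, 'C1': 28, 'S2': 14, 'C3': 10, 'S4': 5}
```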

• Apply filters (or kernels) to the input to
produce the output volume
• The filter size (or receptive field) is small
spatially (3x3 or 5x5) but extends through the
full depth of the input volume
• Slide each filter across the width and height
of the input volume and compute the dot
product between the filter and the input at
every position
Convolutional Layer
5 filters applied to the input
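The sliding dot product can be written as a naive NumPy loop. This is a minimal sketch that assumes stride 1 and no padding. The 32x32x3 input and the 5x5 filter size are illustrative, and the 5 filters match the caption above.

```python
import numpy as np


def conv2d(volume, filters, biases):
    """Naive convolutional layer (stride 1, no padding).

    volume:  (H, W, C) input volume
    filters: (K, F, F, C) K filters, each spanning the full input depth C
    biases:  (K,) one bias per filter
    Returns an (H - F + 1, W - F + 1, K) output volume.
    """
    H, W, _ = volume.shape
    K, F, _, _ = filters.shape
    out_h, out_w = H - F + 1, W - F + 1
    out = np.zeros((out_h, out_w, K))
    for k in range(K):
        for i in range(out_h):
            for j in range(out_w):
                patch = volume[i:i + F, j:j + F, :]
                # Dot product between the filter and the patch under it.
                out[i, j, k] = np.sum(patch * filters[k]) + biases[k]
    return out


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
filters = rng.normal(size=(5, 5, 5, 3))  # 5 filters of size 5x5x3
activation_maps = conv2d(image, filters, np.zeros(5))
print(activation_maps.shape)  # (28, 28, 5)
```

Each filter yields one activation map, so the output depth equals the number of filters.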

Convolutional Layer

• Reduce the spatial size of the
representation to limit the number of
parameters and avoid overfitting
• Downsample the input spatially using
the MAX operation
• Operate independently on every depth
slice / feature map of the input (Thus,
the depth remains unchanged)
Pooling Layer
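Max pooling can be sketched the same way. The 2x2 window with stride 2 is an assumption (a common choice, not stated on the slide), and the toy 4x4x2 input is made up for illustration.

```python
import numpy as np


def max_pool(volume, window=2, stride=2):
    """Downsample an (H, W, C) volume, taking the max over each window.

    Every depth slice is pooled independently, so C is unchanged.
    """
    H, W, C = volume.shape
    out_h = (H - window) // stride + 1
    out_w = (W - window) // stride + 1
    out = np.zeros((out_h, out_w, C))
    for i in range(out_h):
        for j in range(out_w):
            rows = slice(i * stride, i * stride + window)
            cols = slice(j * stride, j * stride + window)
            out[i, j, :] = volume[rows, cols, :].max(axis=(0, 1))
    return out


feature_maps = np.arange(32, dtype=float).reshape(4, 4, 2)
pooled = max_pool(feature_maps)
print(pooled.shape)  # (2, 2, 2)
```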

Inception V3
Source: Google Codelabs

Don’t be a hero.
Transfer learning as a shortcut

Transfer Learning
Retraining existing models
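The idea of retraining can be illustrated without a real pretrained model. In the NumPy sketch below, a fixed random projection stands in for the frozen layers of a pretrained network, and only a new logistic-regression head is trained. The data, sizes, learning rate, and all names are hypothetical. A real setup would reuse a network such as Inception V3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pretrained layers (hypothetical): never updated below.
W_frozen = rng.normal(size=(64, 16))
W_frozen_before = W_frozen.copy()


def extract_features(images):
    return np.maximum(0.0, images @ W_frozen)


# Toy dataset: 200 "images" of 64 values each, with two classes.
images = rng.normal(size=(200, 64))
labels = (images[:, 0] > 0).astype(float)

# Features are computed once; the standardization is part of the fixed pipeline.
features = extract_features(images)
features = (features - features.mean(axis=0)) / features.std(axis=0)

# New classification head: the only trainable parameters.
w_head = np.zeros(16)
b_head = 0.0


def loss_and_grads(w, b):
    probs = 1.0 / (1.0 + np.exp(-(features @ w + b)))
    clipped = np.clip(probs, 1e-12, 1.0 - 1e-12)
    loss = -np.mean(labels * np.log(clipped) + (1 - labels) * np.log(1 - clipped))
    error = probs - labels
    return loss, features.T @ error / len(labels), error.mean()


initial_loss, _, _ = loss_and_grads(w_head, b_head)
for _ in range(300):
    _, grad_w, grad_b = loss_and_grads(w_head, b_head)
    w_head -= 0.1 * grad_w
    b_head -= 0.1 * grad_b
final_loss, _, _ = loss_and_grads(w_head, b_head)

print(initial_loss, final_loss)
```

Because only the small head is trained, each step is cheap and far less labeled data is needed than when training the whole network from scratch.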

Machine Learning on Google Cloud Platform
Pre-trained ML models | Custom ML models

Understand the content of images
● Label Detection
● Optical Character Recognition
● Explicit Content Detection
● Face Detection
etc.
Google Cloud  
Vision API
Photos: Google Cloud Platform / Kaz Sato
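A request for the four capabilities listed above could be assembled as in the sketch below. It only builds the JSON body for the REST `images:annotate` method and does not send it. The helper name and the `max_results` default are illustrative, authentication is omitted, and the field names should be checked against the current Vision API reference before use.

```python
import base64
import json

VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

# Feature types corresponding to the four bullets on the slide.
FEATURE_TYPES = (
    "LABEL_DETECTION",
    "TEXT_DETECTION",          # optical character recognition
    "SAFE_SEARCH_DETECTION",   # explicit content detection
    "FACE_DETECTION",
)


def build_annotate_request(image_bytes, feature_types=FEATURE_TYPES, max_results=5):
    """Build the JSON body for one image; the caller POSTs it to VISION_ENDPOINT."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": feature, "maxResults": max_results}
                for feature in feature_types
            ],
        }]
    }


body = build_annotate_request(b"placeholder image bytes")
print(json.dumps(body, indent=2))
```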

Demo
https://cloud.google.com/vision/

Custom ML Model
with TensorFlow

Demo
Based on https://www.tensorflow.org/tutorials/image_retraining

Demo
Codelab: https://goo.gl/qi5kiA

Thank you!
Ta Virot Chiraphadhanakul, PhD (@tvirot)
Google Developer Expert in Machine Learning  
Managing Director, Skooldio
