This document presents research on using convolutional neural networks (CNNs) to detect skin lesions from dermoscopic images. The researchers:
1. Developed a CNN (U-Net) to segment skin lesions from images, achieving a Dice coefficient of 0.8689.
2. Used a fine-tuned VGG-16 network to classify images as benign or malignant. They found that using their automatic segmentations as input improved sensitivity over using unaltered images.
3. Concluded that their deep learning approach can help dermatologists diagnose skin cancer, and that automatic segmentation improves classification sensitivity compared to using whole images, even without perfect segmentation. This verifies their hypothesis that segmentation enhances classification.
5. Background of the problem
▣ Skin cancer: the most common type of cancer
▣ The frequency of melanoma doubles every 20 years
▣ Each year (in USA):
□ 76,380 new cases of melanoma
□ 6,750 deaths
▣ Melanoma is a deadly form of skin cancer, but survival rates
are high if detected and diagnosed early
▣ Melanoma detection: relies on hand-crafted features
□ ABCDE rule (Asymmetry, Border, Color, Dermoscopic
structure, and Evolving)
□ CASH rule (Color, Architecture, Symmetry, and
Homogeneity)
6. Background of the problem
▣ Discriminating between benign and malignant skin lesions is
challenging
▣ Without computer-based assistance: 60~80% detection
accuracy
7. Scope and goals
▣ Scope:
□ Assist physicians in classifying skin lesions (especially in
melanoma detection: 2-class classifier problem)
▣ Goal:
□ Use state-of-the-art Deep Learning techniques to design an
intelligent medical imaging-based skin lesion diagnosis system
□ Achieve (or improve upon) state-of-the-art results for:
■ skin lesion segmentation, and
■ skin lesion classification
□ Evaluate the impact of skin lesion segmentation on the
accuracy of the classifier
8. Hypothesis
Prior segmentation of an image containing a skin lesion
(i.e., isolating the lesion from the background) improves
the accuracy and sensitivity of a Deep Learning
classification model.
9. Challenges
▣ Dermoscopic images may:
■ Contain artifacts, such as: moles, freckles, hair,
patches, shading and noise.
■ Present low contrast between the lesion and the
background
■ Contain multiple skin lesions
13. Deep learning motivation
▣ Image representations for:
□ Image classification
□ Object detection and recognition
□ Semantic Segmentation
Self-driving cars [Goodfellow et al. 2014]
[Ciresan et al. 2013]
[Turaga et al 2010]
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
15. Why deep learning now?
▣ Large datasets [Deng et al. Russakovsky et al.]
▣ GPUs (Graphics Processing Units) [NVIDIA et al.]
▣ Framework
* Not applicable to medical imaging
17. Convolution layer
▣ Input: 32x32x3 image
▣ Filter: 5x5x3
▣ Convolve the filter with the image, i.e. “slide over the image
spatially, computing dot products”
▣ Filters always extend the full depth of the input volume
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
18. Convolution layer
▣ Input: 32x32x3 image
▣ 5x5x3 filter → weights (learnt by the backpropagation algorithm)
▣ Output at each location: 1 number, the result of taking a dot
product between the filter and a small 5x5x3 chunk of the image
(i.e. 5*5*3 = 75-dimensional dot product + bias)
▣ Linear function
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
19. Activation layer
▣ Input: 32x32x3 image
▣ 5x5x3 filter
▣ Convolve (slide) over all spatial locations
▣ ReLU (Rectified Linear Units)
▣ Output: 28x28x1 activation map
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
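The convolution and activation slides can be illustrated with a minimal NumPy sketch. It assumes stride 1 and no padding, which is what turns a 32x32x3 image and a 5x5x3 filter into a 28x28 activation map; the function names and the random inputs are illustrative, not taken from the presentation.

```python
import numpy as np

def conv2d_single_filter(image, kernel, bias=0.0):
    """Slide one filter over an HxWxC image (stride 1, no padding)."""
    h, w, c = image.shape
    kh, kw, kc = kernel.shape
    assert c == kc, "the filter must extend the full depth of the input"
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # One output number: dot product of the filter with a
            # kh x kw x c chunk of the image, plus the bias.
            patch = image[i:i + kh, j:j + kw, :]
            out[i, j] = np.sum(patch * kernel) + bias
    return out

def relu(x):
    """Rectified Linear Unit: negative responses are set to zero."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))   # stand-in for a 32x32x3 image
kernel = rng.standard_normal((5, 5, 3))    # stand-in for learnt weights
activation_map = relu(conv2d_single_filter(image, kernel, bias=0.1))
print(activation_map.shape)  # (28, 28)
```

A real layer applies many such filters and stacks the resulting maps along the depth axis; the explicit loops here are only for readability.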
20. Pooling layer
▣ Downsampling operation
□ Makes the representation smaller and more
manageable
□ Operates over each activation map independently
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
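As a rough illustration of pooling, the sketch below applies 2x2 max pooling with stride 2 to a single activation map. The window size and the choice of max (rather than average) pooling are assumptions made for the example; the slide itself does not fix them.

```python
import numpy as np

def max_pool_2x2(activation_map):
    """2x2 max pooling with stride 2 over one activation map."""
    h, w = activation_map.shape
    h2, w2 = h // 2, w // 2
    # Drop a trailing row/column if the size is odd, then view the map
    # as a grid of 2x2 blocks and keep the maximum of each block.
    trimmed = activation_map[:h2 * 2, :w2 * 2]
    blocks = trimmed.reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))

activation_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(activation_map)
print(pooled)
# [[ 5.  7.]
#  [13. 15.]]
```

Because the function takes one map at a time, a stack of activation maps would be pooled by calling it on each map independently.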
26. ConvNets for classification
▣ Classification → Scoring:
□ The CNN computes a class score {float} for each
image
□ This score is mapped to a class label {integer}
▣ [224x224x3] image → f (learnt in training) → class scores,
indicating class labels
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
27. ConvNets for segmentation
▣ Segmentation → Localization:
□ The CNN assigns a class label to each pixel (classify
all pixels)
■ {0,1} → {absence of object, presence of object}
Slide credit: CS231n
29. Transfer learning
1. Train on ImageNet
2. Small dataset: feature extractor
□ Freeze the lower layers, train the top of the network
3. Medium dataset: fine-tuning
□ Freeze the lower layers, train the top of the network
□ More data = retrain more of the network (or all of it)
Slide credit: Bay Area Deep Learning School Presentation by A. Karpathy
Medical Imaging case
31. Framework
▣ Python environment:
□ Keras - Deep Learning Library for Theano or TensorFlow
□ OpenCV / PIL (Python Imaging Library)
□ SciPy (Library for Mathematics, Science and Engineering)
□ Scikit-learn (Machine Learning Library)
□ CUDA library for the GPUs
32. ISIC Archive dataset
▣ ISBI 2016 Challenge dataset
□ Skin Lesion Analysis towards melanoma detection
□ 1279 RGB images
□ Labeled as either benign or malignant
□ Includes the binary mask for each image
Class distribution:
                    Benign   Malignant   Total images
Training subset     727      173         900
Validation subset   304      75          379
Binary mask:
0 → outside lesion area
255 → inside lesion area
34. Data augmentation
▣ Enlarge our small set of training examples with random transformations:
□ Re-scaling
□ 40-degree rotations
□ Horizontal shifts
□ Zooming
□ Horizontal flips
[Figure: original image and its random transformations]
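Two of the listed transformations, horizontal flips and horizontal shifts, can be sketched in plain NumPy as follows. This is not the pipeline used in the presentation: the function name, the 0.2 shift fraction, and the zero-filling of the exposed border are all assumptions chosen for illustration.

```python
import numpy as np

def random_flip_and_shift(image, rng, max_shift_fraction=0.2):
    """Randomly flip an HxWxC image left-right, then shift it horizontally.

    max_shift_fraction is a hypothetical setting; pixels exposed by the
    shift are filled with zeros.
    """
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]              # horizontal flip
    width = out.shape[1]
    max_shift = int(width * max_shift_fraction)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.zeros_like(out)
    if shift > 0:                          # move content to the right
        shifted[:, shift:, :] = out[:, :width - shift, :]
    elif shift < 0:                        # move content to the left
        shifted[:, :width + shift, :] = out[:, -shift:, :]
    else:
        shifted = out.copy()
    return shifted

rng = np.random.default_rng(42)
image = rng.random((224, 224, 3))          # stand-in for a dermoscopic image
augmented = [random_flip_and_shift(image, rng) for _ in range(4)]
print([a.shape for a in augmented])
```

Each call draws a new random transformation, so the same original image yields several different training examples.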
35. Preprocessing
▣ Mean subtraction: X -= np.mean(X, axis = 0)
▣ Image Normalization: X /= np.std(X, axis = 0)
▣ Image cropping & resizing
□ Segmentation model: 64 x 80 px
□ Classification model: 224 x 224 px
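The two one-liners above can be wrapped into a small runnable sketch. It assumes the images are stacked in an array of shape (N, H, W, C); the conversion to float and the small `eps` term are additions for safety (in-place arithmetic fails on `uint8` arrays, and a constant pixel would otherwise divide by zero), and the helper name is illustrative.

```python
import numpy as np

def standardize(images, eps=1e-8):
    """Per-pixel mean subtraction and normalization over a batch (N, H, W, C)."""
    X = images.astype(np.float64)          # copy as float before in-place math
    mean = np.mean(X, axis=0)              # mean image over the batch
    std = np.std(X, axis=0)                # per-pixel standard deviation
    X -= mean
    X /= std + eps
    return X, mean, std

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(8, 64, 80, 3), dtype=np.uint8)
X, mean, std = standardize(batch)
print(X.shape, float(abs(X.mean(axis=0)).max()) < 1e-6)
```

A common practice is to compute `mean` and `std` on the training subset only and reuse them for the validation images; whether the presentation did so is not stated.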
36. Segmentation model: U-Net architecture
▣ “U-Net: Convolutional Networks for Biomedical Image
Segmentation” by Olaf Ronneberger et al.
▣ Output: binary mask
37. Segmentation model: training parameters
▣ U-Net trained from scratch (small image size)
▣ Weights randomly initialized
▣ Loss function:
□ Dice coefficient
▣ Adam optimizer (Stochastic gradient-based
optimization):
□ Learning rate: 10e-5
▣ Batch size: 32
▣ Training epochs: 500
▣ 13 sec / epoch on NVIDIA GeForce GTX TITAN X GPU
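For reference, the Dice coefficient between a ground-truth mask and a predicted mask can be sketched as below. The `smooth` term and the `1 - Dice` form of the loss are common conventions assumed here; the slide only states that the loss is based on the Dice coefficient.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice = 2 * |A ∩ B| / (|A| + |B|), with a smoothing term (assumed)."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred, smooth=1.0):
    """One possible loss: lower is better, zero for a perfect overlap."""
    return 1.0 - dice_coefficient(y_true, y_pred, smooth)

true_mask = np.array([[1, 1], [0, 0]])
pred_mask = np.array([[1, 0], [0, 0]])
# Without smoothing: 2 * 1 / (2 + 1) = 0.666...
print(dice_coefficient(true_mask, pred_mask, smooth=0.0))
print(dice_loss(true_mask, true_mask))  # 0.0
```

The same function works on soft predictions in [0, 1], which is what makes it usable as a training objective.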
38. Objective
To verify our hypothesis (prior segmentation of the skin lesion
improves the accuracy and sensitivity of a Deep Learning
classification model), three classifiers are compared:
1. Unaltered lesion classification: the original image
2. Perfectly segmented lesion classification: logical AND operation
between the original image and the original (perfect) binary mask
3. Automatically segmented lesion classification: logical AND operation
between the original image and the binary mask obtained with the U-Net
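The logical AND operation between an image and its binary mask can be sketched as follows, assuming an 8-bit RGB image and a mask that uses 0 outside and 255 inside the lesion. The function name and the toy data are illustrative.

```python
import numpy as np

def apply_mask(image, mask):
    """Bitwise AND of an HxWx3 uint8 image with an HxW uint8 mask (0 or 255).

    AND with 255 (all bits set) keeps the pixel; AND with 0 blacks it out.
    """
    assert image.dtype == np.uint8 and mask.dtype == np.uint8
    return np.bitwise_and(image, mask[:, :, np.newaxis])

image = np.full((4, 4, 3), 200, dtype=np.uint8)   # toy "skin" image
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255                              # toy lesion area
segmented = apply_mask(image, mask)
print(segmented[:, :, 0])
```

Passing the ground-truth mask gives the input of classifier (2), and passing the U-Net output (thresholded to 0/255) gives the input of classifier (3).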
40. Classification Model: VGG-16 Architecture
▣ Five Convolutional Blocks (2D conv.)
▣ 3 x 3 receptive field
▣ ReLU as Activation Functions
▣ Max-Pooling
▣ Classifier block:
□ 3 FC Layers at the top of the network
41. Fine-tuning the VGG-16 Architecture
▣ Weights initialized with the VGG-16 pretrained on the
ImageNet dataset
▣ Freeze the bottom of the network
▣ Train only the top of the VGG-16
42. Classification Model: Loss function
▣ Problem: the ISIC dataset classes are not balanced
□ Validation subset:
■ 304 benign images
■ 75 malignant images
▣ Weighted loss function, where ρ is defined as 1 − the frequency
of appearance of the minority class
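The weight ρ can be computed directly from the class counts. The loss formula itself is not reproduced in this transcript, so the weighted binary cross-entropy below is only one plausible form, assumed for illustration: errors on the minority (malignant) class are scaled by ρ and errors on the majority class by 1 − ρ. The function names are hypothetical.

```python
import numpy as np

def minority_weight(n_majority, n_minority):
    """rho = 1 - frequency of appearance of the minority class."""
    return 1.0 - n_minority / (n_majority + n_minority)

def weighted_binary_cross_entropy(y_true, y_prob, rho, eps=1e-7):
    """Assumed form: rho weights the positive (minority) term,
    1 - rho weights the negative (majority) term."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_prob = np.clip(np.asarray(y_prob, dtype=np.float64), eps, 1.0 - eps)
    pos = rho * y_true * np.log(y_prob)
    neg = (1.0 - rho) * (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(pos + neg))

# Counts from the slide: 304 benign and 75 malignant images.
rho = minority_weight(n_majority=304, n_minority=75)
print(round(rho, 4))  # 0.8021

missed_melanoma = weighted_binary_cross_entropy([1], [0.1], rho)
false_alarm = weighted_binary_cross_entropy([0], [0.9], rho)
print(missed_melanoma > false_alarm)  # True
```

With this weighting, an equally confident mistake costs roughly four times more on a malignant image than on a benign one, which pushes the classifier toward higher sensitivity.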
43. Classification Model: Training parameters
▣ VGG-16 fine-tuned
▣ Weights initialized with the VGG-16 pretrained on the
ImageNet dataset
▣ Loss function:
□ Weighted loss function
▣ SGD optimizer (stochastic gradient descent):
□ Learning rate: 10e-5
▣ Batch size: 32
▣ Training epochs: 50
▣ 35 sec / epoch on NVIDIA GeForce GTX TITAN X GPU
44. Overfitting
▣ When a model fits the training data too well
□ Noise in the training data is learned by the model
▣ How to prevent it?
□ Dropout
□ Choosing a reduced network (VGG-16 with 138M
param. rather than VGG-19 with 144M param.)
52. Sensitivity in Medical Settings
▣ Sensitivity is often considered the most
important metric in the medical setting
▣ For early diagnosis
□ A False Negative (a missed true melanoma case) means
the model fails at early diagnosis
□ It is better to raise a False Positive than to produce a
False Negative
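To make the metric concrete, sensitivity, precision, and accuracy can be computed from the confusion counts as sketched below. The toy labels are made up for the example and assume 1 = malignant and 0 = benign; they are not results from the presentation.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels where 1 = malignant."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))      # melanomas correctly flagged
    fn = int(np.sum(y_true & ~y_pred))     # melanomas missed
    fp = int(np.sum(~y_true & y_pred))     # benign lesions flagged
    tn = int(np.sum(~y_true & ~y_pred))    # benign lesions cleared
    return tp, fn, fp, tn

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)

sensitivity = tp / (tp + fn)               # share of melanomas detected
precision = tp / (tp + fp)                 # share of alarms that are real
accuracy = (tp + tn) / (tp + fn + fp + tn)
print(sensitivity, precision, accuracy)    # 0.75 0.75 0.8
```

Only false negatives lower sensitivity, which is why it is the metric to watch when a missed melanoma is the costliest error.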
53. Classification evaluation
Model                                  Accuracy   Loss     Sensitivity   Precision
Unaltered lesion clas.                 0.8469     0.4723   0.8243        0.9523
Perfectly segmented lesion clas.       0.8390     0.4958   0.8648        0.9621
Automatically segmented lesion clas.   0.8174     0.5144   0.8918        0.9681
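▣ The Automatically Segmented model achieves even better
sensitivity (0.8918) and precision (0.9681) than the Perfectly
Segmented one, although its accuracy is lower (0.8174 vs. 0.8390)
□ Physicians can avoid manual segmentation tasks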
57. Conclusions
▣ DL solution for assisting dermatologists with
the diagnosis of skin lesions
□ Specifically, for early melanoma detection
▣ Does a prior semantic segmentation
improve the performance of a fine-tuned CNN
used as a 2-class classifier?
□ Hypothesis verified
▣ Perfect Segmentation was not needed to
obtain the best classification result of the
model
□ DL Segmentation approach obtained the best
sensitivity classification result
58. Conclusions
▣ BioMed 2017 Conference → Paper Accepted
□ Title: “Skin Lesion Classification from Dermoscopic
Images Using Deep Learning Techniques”
▣ SIIM 2017 Meeting → Paper Accepted
□ Title: “The Impact of Segmentation on the Accuracy
and Sensitivity of a Melanoma Classifier Based on Skin
Lesion Images”
▣ MICCAI 2017 Conference → Intention of Paper
▣ MIUA 2017 Conference → Intention of Paper
▣ ISBI 2017 Challenge → Intention of Participation
□ Skin Lesion Analysis Towards Melanoma Detection