Deep Learning for image segmentation
Michael Jamroz & Matthew Opala
AGENDA
1. Segmentation in Computer Vision
2. Deep Learning methods for image segmentation
3. Case study - clothing parsing
1. Segmentation in Computer Vision
Computer Vision tasks
[Figure: the same image with the labels DRESS, HEELS and BAG, shown for three tasks: Classification, Detection, Segmentation]
Semantic Segmentation
◦ Annotate each pixel
◦ Doesn’t differentiate instances
◦ Classic computer vision task
Instance Aware Segmentation
◦ Detect instances
◦ Annotate each pixel
◦ Simultaneous detection and segmentation
◦ Recent challenge in MS-COCO
Traditional methods
Kota Yamaguchi, M Hadi Kiapour, Tamara L Berg, "Paper Doll Parsing:
Retrieving Similar Styles to Parse Clothing Items", ICCV 2013
● Multi-stage pipeline with hand-engineered image features (HOG, MR8, etc.)
● Segmentation -> classification of every pixel with logistic regression
2. Deep Learning methods for image segmentation
Convolutional neural networks
● First used successfully for classification tasks
● Three basic operations: convolution, pooling, nonlinearity function
Semantic segmentation with CNN
Input -> Extract patch -> CNN -> Classify center pixel (e.g. DRESS)
Repeat for each pixel
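The patch-wise scheme above can be sketched in a few lines of plain Python. This is only an illustration: `toy_classifier` is a hypothetical stand-in for the CNN, and the zero padding at the borders and the 3 x 3 patch size are assumptions, not details from the slides.

```python
def extract_patch(image, row, col, half):
    """Return the square patch centred on (row, col), zero-padded at the borders."""
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    return patch


def segment_by_patches(image, classify_patch, half=1):
    """Label every pixel by classifying the patch around it (one call per pixel)."""
    return [
        [classify_patch(extract_patch(image, r, c, half)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def toy_classifier(patch):
    # Hypothetical stand-in for the CNN: label by mean patch intensity.
    values = [v for row in patch for v in row]
    return "DRESS" if sum(values) / len(values) > 0.5 else "BACKGROUND"


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
labels = segment_by_patches(image, toy_classifier)
```

The classifier runs once per pixel, and neighbouring patches overlap almost completely, so most of the work is repeated.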
Running the CNN on the whole image: smaller output due to pooling
Fully Convolutional Neural Networks
Long, Shelhamer and Darrell, “Fully Convolutional Networks For Semantic
Segmentation”, CVPR 2015
Learnable upsampling: deconvolution
Typical 3 x 3 convolution, stride 1 pad 1
Input: 4 x 4 Output: 4 x 4
Dot product between filter and input
Learnable upsampling: deconvolution
Typical 3 x 3 convolution, stride 2 pad 1
Input: 4 x 4 Output: 2 x 2
Dot product between filter and input
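A minimal single-channel sketch of the two convolutions above, in plain Python. The all-ones image and filter are illustrative assumptions; the output sizes follow the usual formula floor((size + 2*pad - kernel) / stride) + 1.

```python
def conv_output_size(size, kernel, stride, pad):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def conv2d(image, kernel, stride=1, pad=0):
    """Single-channel 2D convolution (cross-correlation, as in most frameworks)."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    # Zero-pad the input on every side.
    padded = [[0] * (width + 2 * pad) for _ in range(height + 2 * pad)]
    for r in range(height):
        for c in range(width):
            padded[r + pad][c + pad] = image[r][c]
    out_h = conv_output_size(height, k, stride, pad)
    out_w = conv_output_size(width, k, stride, pad)
    output = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Each output value is the dot product of the filter and one input patch.
            output[i][j] = sum(
                kernel[a][b] * padded[i * stride + a][j * stride + b]
                for a in range(k)
                for b in range(k)
            )
    return output


image = [[1] * 4 for _ in range(4)]
kernel = [[1] * 3 for _ in range(3)]
same_size = conv2d(image, kernel, stride=1, pad=1)    # 4 x 4 output
downsampled = conv2d(image, kernel, stride=2, pad=1)  # 2 x 2 output
```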
Learnable upsampling: deconvolution
3 x 3 “deconvolution”, stride 2 pad 1
Input: 2 x 2 Output: 4 x 4
Each input value gives the weight for a copy of the filter
Sum where copies overlap in the output
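The "deconvolution" (transposed convolution) above can be sketched as a scatter-and-sum. One assumption is needed to reproduce the 2 x 2 -> 4 x 4 sizes: with a 3 x 3 filter, stride 2 and pad 1 the natural output is 3 x 3, so this sketch adds one extra row and column (the `output_padding` parameter, named after the convention used in common frameworks). The input values and all-ones filter are illustrative.

```python
def transposed_conv2d(image, kernel, stride=2, pad=1, output_padding=1):
    """Single-channel transposed convolution: scatter weighted filter copies, sum overlaps."""
    k = len(kernel)
    in_h, in_w = len(image), len(image[0])
    # Canvas large enough for every filter copy plus the optional extra border.
    full_h = (in_h - 1) * stride + k + output_padding
    full_w = (in_w - 1) * stride + k + output_padding
    canvas = [[0] * full_w for _ in range(full_h)]
    for i in range(in_h):
        for j in range(in_w):
            for a in range(k):
                for b in range(k):
                    # The input value weights a copy of the filter placed at (i*stride, j*stride).
                    canvas[i * stride + a][j * stride + b] += image[i][j] * kernel[a][b]
    out_h = (in_h - 1) * stride - 2 * pad + k + output_padding
    out_w = (in_w - 1) * stride - 2 * pad + k + output_padding
    # Padding in a transposed convolution crops the border instead of adding one.
    return [row[pad:pad + out_w] for row in canvas[pad:pad + out_h]]


image = [[1, 2], [3, 4]]
kernel = [[1] * 3 for _ in range(3)]
upsampled = transposed_conv2d(image, kernel)  # 4 x 4 output
```

In a network the filter weights are learned, which is what makes this upsampling "learnable".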
Deconvolution Network for Semantic Segmentation
Normal VGG followed by an “upside down” VGG
Noh, Hong and Han, “Learning Deconvolution Network for Semantic
Segmentation”, arXiv 2015
Deconvolution Network: Pooling
[Figure: input, pooled map, switch variables]
Deconvolution Network: Unpooling
[Figure: input, pooled map, switch variables]
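A rough sketch of pooling with switch variables and the matching unpooling, in plain Python. The function names, the 2 x 2 window and the example values are assumptions for illustration; the idea is that pooling remembers where each maximum came from and unpooling puts the value back at that position.

```python
def max_pool_with_switches(feature_map, size=2):
    """Non-overlapping max pooling that also records the position of each maximum."""
    pooled, switches = [], []
    for top in range(0, len(feature_map) - size + 1, size):
        pooled_row, switch_row = [], []
        for left in range(0, len(feature_map[0]) - size + 1, size):
            positions = [(top + a, left + b) for a in range(size) for b in range(size)]
            best = max(positions, key=lambda pos: feature_map[pos[0]][pos[1]])
            pooled_row.append(feature_map[best[0]][best[1]])
            switch_row.append(best)  # switch variable: where the maximum was
        pooled.append(pooled_row)
        switches.append(switch_row)
    return pooled, switches


def unpool(pooled, switches, height, width):
    """Place each pooled value back at its recorded position; everything else stays zero."""
    output = [[0] * width for _ in range(height)]
    for pooled_row, switch_row in zip(pooled, switches):
        for value, (r, c) in zip(pooled_row, switch_row):
            output[r][c] = value
    return output


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [2, 6, 3, 1],
]
pooled, switches = max_pool_with_switches(feature_map)
restored = unpool(pooled, switches, 4, 4)
```

The unpooled map is sparse; in a deconvolution network the following deconvolution layers fill it in.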
DeconvNet vs. FCN
[Figure: qualitative comparison; panels: Input, Ground truth, FCN, DeconvNet, EDeconvNet, EDeconvNet + CRF]
DeepLab: Atrous Convolution and Fully Connected CRFs
Chen, Papandreou, Kokkinos, Murphy, Yuille “Semantic Image Segmentation with Deep
Convolutional Nets and Fully Connected CRFs”, ICLR 2015
● Conditional random field used as a post-processing step
Conditional Random Field
Atrous convolution
● Convolution “with holes”
● Enlarges the receptive field without adding parameters or lowering the output resolution
● Alternative 1: convolve a downsampled input, then upsample the result to the original resolution
● Alternative 2 (atrous): convolve the original-size input with a filter that has holes
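A small plain-Python sketch of convolution with holes. The helper names, the single-pixel test image and the 3 x 3 filter values are illustrative assumptions. With rate 2 the same nine weights are spread over a 5 x 5 area, so the receptive field grows while the parameter count and the output size stay the same.

```python
def effective_extent(kernel_size, rate):
    """Width of the area a filter covers once holes are inserted between its taps."""
    return (kernel_size - 1) * rate + 1


def atrous_conv2d(image, kernel, rate=1):
    """Single-channel convolution with holes; zero padding keeps the output size (odd filters)."""
    k = len(kernel)
    pad = (effective_extent(k, rate) - 1) // 2
    height, width = len(image), len(image[0])

    def pixel(r, c):
        return image[r][c] if 0 <= r < height and 0 <= c < width else 0

    return [
        [
            sum(
                # Filter taps are `rate` pixels apart instead of adjacent.
                kernel[a][b] * pixel(i - pad + a * rate, j - pad + b * rate)
                for a in range(k)
                for b in range(k)
            )
            for j in range(width)
        ]
        for i in range(height)
    ]


image = [[0] * 5 for _ in range(5)]
image[2][2] = 1  # a single bright pixel in the centre
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
dense = atrous_conv2d(image, kernel, rate=1)
holes = atrous_conv2d(image, kernel, rate=2)
```

With rate 1 the centre pixel only influences its 3 x 3 neighbourhood; with rate 2 it reaches the corners of the 5 x 5 image.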
3. Case study - clothing parsing
Clothing parsing
◦ Goal: detect and segment basic clothing categories (dresses, bags, shoes, trousers, etc.) on people
◦ We need precise clothing masks for further processing (image search, color detection)
◦ The largest publicly available dataset contains 7.7k images
ATR Dataset
◦ Images with ground-truth labels, 7.7k examples
◦ 18 clothing categories
◦ https://github.com/lemondan/HumanParsing-Dataset
Clothing parsing with general segmentation
◦ DeepLab model based on the VGG-16 architecture
◦ Both variants: with and without CRF post-processing
◦ Fine-tuned from VGG-16 trained on the ImageNet classification challenge
◦ Images resized to 513 x 513 resolution
◦ Training details
▫ Batch size: 8
▫ 20k iterations - 10 epochs
▫ Dataset split into train/test at a 0.9/0.1 ratio
Clothing parsing with general segmentation: results
[Figures on two slides; panels: Input, DeepLab, DeepLab + CRF, Ground truth]
Clothing parsing with general segmentation: metrics
Bags:
model           accuracy  precision  recall  f1-score  IoU
DeepLab         0.9903    0.64       0.51    0.54      0.45
DeepLab + CRF   0.9908    0.664      0.525   0.553     0.48

Dresses:
model           accuracy  precision  recall  f1-score  IoU
DeepLab         0.9586    0.481      0.39    0.399     0.349
DeepLab + CRF   0.9558    0.506      0.436   0.438     0.397
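For reference, the five metrics in these tables can be computed for one binary mask as sketched below. This is a generic per-image definition in plain Python; how the scores were aggregated over the test set is not stated in the slides, so the function name and the toy masks are assumptions.

```python
def binary_mask_metrics(predicted, truth):
    """Pixel-wise accuracy, precision, recall, F1 and IoU for one class (masks of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1-score": f1,
        # IoU ignores true negatives, so background pixels cannot inflate it.
        "IoU": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
    }


predicted = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
truth = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
scores = binary_mask_metrics(predicted, truth)
```

In this toy case accuracy is 0.75 while IoU is only about 0.33, because most pixels are background. The same effect is visible in the tables, where accuracy stays above 0.95 while IoU stays below 0.5.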
Clothing parsing with detection and segmentation
● Detecting the category with an object detector such as R-CNN, SSD, YOLO etc.
● Segmenting the object inside the bounding box with models such as DeepLab, DeepCut etc.
● Motivation: it’s much faster to gather bounding-box-level annotations than pixel-wise annotations
● Hypothesis: given a correct bounding box, it’s easier to segment a clothing item than on the whole image
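The two-stage idea can be sketched as below. `toy_detector` and `toy_segmenter` are hypothetical stand-ins for a real detector (such as SSD) and a real segmentation model (such as DeepLab); the box format (top, left, bottom, right, end-exclusive) is also an assumption made for this sketch.

```python
def detect_and_segment(image, detect, segment_crop):
    """Run a detector, segment inside each box, and paste the masks into a full-size mask."""
    height, width = len(image), len(image[0])
    full_mask = [[0] * width for _ in range(height)]
    for top, left, bottom, right in detect(image):
        crop = [row[left:right] for row in image[top:bottom]]
        crop_mask = segment_crop(crop)
        for r, mask_row in enumerate(crop_mask):
            for c, value in enumerate(mask_row):
                if value:
                    full_mask[top + r][left + c] = 1
    return full_mask


def toy_detector(image):
    # Hypothetical detector: always returns one fixed box (top, left, bottom, right).
    return [(1, 1, 3, 4)]


def toy_segmenter(crop):
    # Hypothetical segmenter: threshold pixel intensity inside the crop.
    return [[1 if value > 0.5 else 0 for value in row] for row in crop]


image = [
    [0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.8, 0.9, 0.1, 0.0],
    [0.0, 0.7, 0.2, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
mask = detect_and_segment(image, toy_detector, toy_segmenter)
```

The bright pixel at the top-left lies outside the detected box, so it never reaches the segmenter. Restricting the segmenter to the box is what the hypothesis above relies on.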
Single Shot Multibox Detector (SSD)
Wei Liu et al., "SSD: Single Shot MultiBox Detector", 2016
Bags train/test size: 4135/360
Dresses train/test size: 11740/3990
Bags mAP: 0.93
Dresses mAP: 0.7
Clothing parsing with detection and segmentation: bags metrics
model           accuracy  precision  recall  f1-score  IoU
DeepLab         0.9903    0.64       0.51    0.54      0.45
DeepLab + CRF   0.9908    0.664      0.525   0.553     0.48
D&S             0.993     0.765      0.709   0.731     0.64

Clothing parsing with detection and segmentation: dresses metrics
model           accuracy  precision  recall  f1-score  IoU
DeepLab         0.9586    0.481      0.39    0.399     0.349
DeepLab + CRF   0.9558    0.506      0.436   0.438     0.397
D&S             0.931     0.416      0.409   0.407     0.378
Visualisations of Detection & Segmentation approach
What have we used?
◦ Caffe & Python
◦ https://github.com/weiliu89/caffe/tree/ssd
◦ https://bitbucket.org/aquariusjay/deeplab-public-ver2
Thanks!
Q&A
You can contact us at:
michaljamroz@craftinity.com
mateuszopala@craftinity.com

#6 PyData Warsaw: Deep learning for image segmentation