NeoNet: Object centric training
for image recognition
Daniel Fontijne, Koen E. A. van de Sande, Eren Gölge,
R. Blythe Towal, Anthony Sarah, Cees G. M. Snoek
Qualcomm Technologies, Inc., December 17, 2015
Presented by:
Daniel Fontijne
Senior Staff Engineer
2
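Summary
Key component: object centric training
                Score  Ranking
Classification   4.8   -
Localization    12.6   3
Detection       53.6   2
Places 2        17.6   3
Score: top-5 error (%) for classification, localization and Places 2; mean average precision for detection.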
3
Agenda
1. Foundation
2. Classification
3. Localization
4. Detection
5. Places 2
4
Foundation: Batch-normalized inception
The base network for all our submissions is the Inception network as
introduced in the batch normalization paper by Ioffe & Szegedy.
Ioffe & Szegedy ICML 2015
5
Network in an inception module
Note: the 5x5 path is not used.
Lin et al. ICLR 2014
7
Classification overview
Ensemble of 12 networks.
Train ‘really long’: 350 epochs.
Randomized ReLU.
Test at 14 scales, 10 crops.
Object preserving crops.
Xu et al. ICML workshop 2015
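The multi-view and ensemble results depend on combining many predictions per image. The following is a minimal sketch, assuming that class probabilities are simply averaged over views and over networks; the slides do not state the combination rule, and all function names are illustrative.

```python
def average_predictions(prob_lists):
    """Average several probability vectors (one per view or per network)."""
    n = len(prob_lists)
    num_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n for c in range(num_classes)]


def top_k(probs, k=5):
    """Return the indices of the k highest-scoring classes, best first."""
    return sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)[:k]


def top_k_error(all_probs, labels, k=5):
    """Fraction of images whose true label is not among the top-k predictions."""
    misses = sum(1 for probs, y in zip(all_probs, labels) if y not in top_k(probs, k))
    return misses / len(labels)


# Assumption: every one of the 14 scales contributes 10 crops.
views_per_image = 14 * 10
```

Under that assumption each network scores 140 views per image, and the 12 networks of the ensemble would be averaged in the same way as the views.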
8
Quiz: What is this?
9
Answer: Flower
10
Quiz: In case you got that right, what is this?
11
Answer: Butterfly
12
Object preserving crops
Random crop selection might miss the object of interest.
The network then tries to remember ‘butterfly’ when presented with leaves.
Solution: use the provided boxes to ensure the crop contains the object.
− For images without box annotation, use the best box predicted by the localization system.
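One way to picture an object preserving crop is to restrict the random crop position so that the annotated box stays inside the window. This is a sketch under assumptions, not the authors' implementation: boxes are integer pixel tuples (x0, y0, x1, y1), the crop is no larger than the image, and a box that is larger than the crop is handled by centring the crop on it.

```python
import random


def object_preserving_crop(img_w, img_h, crop_w, crop_h, box, rng=random):
    """Pick a crop_w x crop_h window inside the image that contains `box`.

    If the box is larger than the crop along an axis, the crop is centred
    on the box along that axis instead (an assumption of this sketch).
    """
    def pick(img_len, crop_len, lo, hi):
        max_start = img_len - crop_len
        # Valid starts keep [lo, hi] inside [start, start + crop_len].
        first = max(0, hi - crop_len)
        last = min(max_start, lo)
        if first > last:  # box does not fit: centre the crop on the box
            centre = (lo + hi) // 2 - crop_len // 2
            return min(max(centre, 0), max_start)
        return rng.randint(first, last)

    x0, y0, x1, y1 = box
    left = pick(img_w, crop_w, x0, x1)
    top = pick(img_h, crop_h, y0, y1)
    return left, top, left + crop_w, top + crop_h
```

For images without an annotation, the same function could be fed the best box predicted by the localization system.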
13
Component breakdown
                                           Epochs  Single view  Multi-view
First attempt at inception + batch norm    112     8.63%        6.58%
Train ~325 epochs                          324     8.77%        6.34%
32 images / mini-batch                     130     8.74%        6.68%
Object preserving, 32 images / mini-batch  120     8.59%        6.51%
Object preserving with generated boxes     130     8.47%        6.46%
Ensemble of 12                             -       -            4.84%
14
Final classification results
Top-5 classification error on test set (%)
SuperVision ('12)          16.4
Clarifai ('13)             11.7
GoogLeNet ('14)             6.7
Ioffe & Szegedy, ICML '15   4.9
NeoNet                      4.8
Trimps-Soushen              4.6
ReCeption                   3.6
MSRA                        3.6
NeoNet is competitive on object classification.
16
Localization overview
Foundations.
− Generate box proposals using fast selective search.
− Train box-classification networks on crops.
Object centric training.
− Object pre-training network.
− Object localization network.
− Object alignment network.
Girshick et al. PAMI 2016
Uijlings et al. IJCV 2013
17
Object centric pre-training
Use the bounding box annotations for pre-training.
Increase the number of classes from N to 2*N+1:
− N classes for the object, well-framed.
− N classes for partially framed objects.
− 1 class for ‘background’, i.e., object not visible.
1% – 1.5% improvement compared to standard pre-training.
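The 2*N+1 label space can be sketched as a mapping from a crop and its ground-truth box to a label index. The slides do not say how ‘well-framed’ is decided; the intersection-over-union threshold of 0.7 below is a purely hypothetical choice, as are the function names and the label ordering.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def object_centric_label(class_id, crop, box, num_classes, well_framed_iou=0.7):
    """Map a crop to one of 2*N + 1 labels.

    0 .. N-1   : object of class `class_id`, well framed
    N .. 2N-1  : object of class `class_id`, partially framed
    2N         : background (object not visible in the crop)
    The threshold is an assumption of this sketch, not taken from the slides.
    """
    overlap = iou(crop, box)
    if overlap == 0.0:
        return 2 * num_classes
    if overlap >= well_framed_iou:
        return class_id
    return num_classes + class_id
```

With N = 1000 this yields 2001 distinct labels.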
18
Object centric pre-training
Dual-head network to account for missing bounding boxes.
− One head with 1000 outputs.
− One head with 2001 outputs. No error gradient when the box annotation is missing.
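The effect of ‘no error gradient when box annotation is missing’ can be illustrated with a per-example loss in which the second head only contributes when a box label exists. This is a sketch that assumes softmax cross-entropy on both heads and equal weighting of the two terms; neither detail is given in the slides.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def dual_head_loss(class_logits, object_logits, class_label, object_label=None):
    """Loss of a two-head network for one training example.

    `class_logits` comes from the 1000-output head, `object_logits` from the
    2001-output head. When `object_label` is None (no box annotation), the
    second head adds nothing to the loss and therefore receives no gradient.
    """
    loss = cross_entropy(class_logits, class_label)
    if object_label is not None:
        loss += cross_entropy(object_logits, object_label)
    return loss
```

In a framework with automatic differentiation the same masking is usually expressed by multiplying the second term by a 0/1 indicator per example.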
19
Object localization network
Fully connected layer on top of Inception 4e and 5b.
Re-train Inception 5b and the new head.
Then fine-tune the entire network.
20
Quiz: Is this an entire skyscraper?
21
Bordering the object
A 40% border worked best.
− This leaves a 1 pixel border in the 7x7 resolution of Inception 5b.
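The link between a 40% border and a 1 pixel border at 7x7 resolution can be checked with a little arithmetic. The sketch below reads ‘40% border’ as growing the box so that its width and height increase by 40% in total, split evenly over both sides; that reading is an assumption, chosen because it reproduces the 1 pixel figure (7 / 1.4 = 5 cells for the object, 1 cell on each side).

```python
def add_border(box, border=0.4):
    """Grow (x0, y0, x1, y1) so width and height increase by `border` in total."""
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * border / 2
    dy = (y1 - y0) * border / 2
    return x0 - dx, y0 - dy, x1 + dx, y1 + dy


def border_cells(grid=7, border=0.4):
    """Feature-map cells of border per side when the bordered crop spans `grid` cells."""
    object_cells = grid / (1 + border)
    return (grid - object_cells) / 2
```

For example, a 100 x 50 box grows by 20 pixels left and right and by 10 pixels top and bottom.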
22
Object alignment network
Extra head for object box alignment.
Classification head is also used, but with cross entropy cost.
23
Object alignment border
Object box alignment moves corners by up to 50% of the width and height.
A 100% border allows the network to ‘see’ the full range of possible alignments.
~2% gain.
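Why a 100% border matches a 50% corner shift can be seen by applying the largest outward shift to every corner. The parameterization below (four fractional offsets, clamped to ±0.5) is an assumption made for illustration; the slides do not describe how the alignment head encodes its output.

```python
def align_box(box, offsets, max_shift=0.5):
    """Shift the corners of (x0, y0, x1, y1) by fractional offsets.

    `offsets` is (dx0, dy0, dx1, dy1); x terms are fractions of the box
    width, y terms fractions of the box height. Each offset is clamped
    to [-max_shift, max_shift].
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    dx0, dy0, dx1, dy1 = (max(-max_shift, min(max_shift, o)) for o in offsets)
    return x0 + dx0 * w, y0 + dy0 * h, x1 + dx1 * w, y1 + dy1 * h
```

Moving every corner outward by the full 50% doubles the width and height, which is exactly the area covered by a 100% border.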
24
Component breakdown
                                            Top-5 localization error
First attempt                               24.0%
40% border, FC on top of inception 5b       22.5%
FC on top of inception 5b+4e                21.8%
Object centric pre-training                 20.3%
Ensemble of 8                               17.5%
Object alignment                            15.5%
Final result with ILSVRC blacklist applied  14.5%
25
Final localization results
Top-5 localization error on test set (%)
UvA ('11)          42.5
SuperVision ('12)  34.2
OverFeat ('13)     30.0
VGG ('14)          25.3
NeoNet             12.6
Trimps-Soushen     12.3
MSRA                9.0
NeoNet is competitive on object localization.
27
Improved selective search
                      Fast    Improved
Color spaces          2       3
Segmentations         2       4
Similarity functions  2       4
Average boxes         1,600   5,000
MABO                  77.5    82.6
Time (s)              0.8     2.4
mAP                   41.2    44.0
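MABO in the table is the mean average best overlap used to rate box proposals in the selective search work of Uijlings et al.: for each ground-truth object take the best intersection-over-union with any proposal in its image, average per class, then average over classes. A minimal sketch, with hypothetical data structures:

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mabo(ground_truth, proposals):
    """Mean average best overlap.

    ground_truth: {class_name: [(image_id, box), ...]}
    proposals:    {image_id: [box, ...]}
    Returns a value in [0, 1]; the table reports it as a percentage.
    """
    abos = []
    for objects in ground_truth.values():
        best = [max((iou(box, p) for p in proposals.get(img, [])), default=0.0)
                for img, box in objects]
        abos.append(sum(best) / len(best))
    return sum(abos) / len(abos)
```

More proposals per image can only raise the best overlap, which is consistent with the improved variant trading a longer run time for a higher MABO.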
28
Object detection network
Five inception-style networks for feature extraction
− Two trained on 1,000 object classes, no input border, fine-tuning on detection boxes
− Three trained on 1,000 object windows with input border, no fine-tuning
29
Component breakdown
                             mAP on validation set
Best object class network    44.6
Best object centric network  47.7
Ensemble of 5                51.9
+ context                    53.2
+ object alignment           54.6
Context: four classification networks fine-tuned with 200 detection class labels.
32
Final detection results
Mean average precision on test set
UvA/Euvision ('13)  22.6
GoogLeNet ('14)     43.9
Deep-ID Net         52.7
NeoNet              53.6
MSRA                62.1
NeoNet is competitive on object detection.
34
Places 2 overview
Our best submission: an ensemble of two inception nets.
− Reduce fully connected layer from 1,000 to 401 outputs.
− Use pre-trained weights from ImageNet 1,000 (~325 epochs).
− Train Inception 5b and fully connected layer for two epochs.
− Fine-tune entire network for eight epochs.
Adding other networks reduced the accuracy.
35
Component breakdown (top-5 error)
                                         Single view  Multi view
~325 epochs pre-training                 17.9%        16.8%
First attempt. 112 epochs pre-training.  19.1%        17.9%
512 channel 5b, Alex-style FC head       20.0%        18.4%
32 images / batch                        18.7%        17.6%
Randomized ReLU                          18.2%        17.5%
Ensemble of 7                            -            16.7%
Ensemble of 2                            -            16.5%
36
Final Places 2 results
Top-5 classification error on test set (%)
HiVision        20
MERL            19.4
ntu_rose        19.3
Trimps-Soushen  18.0
NeoNet          17.6
SIAT_MMLAB      17.4
WM              16.9
NeoNet is competitive on scene classification.
37
On device recognition at 18 ms

Orientation, design and principles of polyhouse
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 

Qualcomm research-imagenet2015

  • 1. NeoNet: Object centric training for image recognition. Daniel Fontijne, Koen E. A. van de Sande, Eren Gölge, R. Blythe Towal, Anthony Sarah, Cees G. M. Snoek. Qualcomm Technologies, Inc., December 17, 2015. Presented by: Daniel Fontijne, Senior Staff Engineer.
  • 2. Summary. Key component: object centric training. Results (score; ranking):
      − Classification: 4.8; -
      − Localization: 12.6; 3
      − Detection: 53.6; 2
      − Places 2: 17.6; 3
  • 4. Foundation: Batch-normalized inception. The base network for all our submissions is the inception network as introduced in the batch normalization paper by Ioffe & Szegedy. Reference: Ioffe & Szegedy, ICML 2015.
  • 5. Network in an inception module. Note: the 5x5 path is not used. Reference: Lin et al., ICLR 2014.
  • 7. Classification overview. Reference: Xu et al., ICML workshop 2015.
      − Ensemble of 12 networks.
      − Train ‘really long’, 350 epochs.
      − Randomized ReLU.
      − Test at 14 scales, 10 crops.
      − Object preserving crops.
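The test-time recipe on this slide implies 12 × 14 × 10 = 1,680 network evaluations per image. The slides do not say how the views are combined; the following is a minimal sketch, assuming the class probabilities are simply averaged over all views and networks. The function names are hypothetical.

```python
def average_predictions(view_probs):
    """Average a list of per-view probability vectors into one vector."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    return [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]


def top_k(probs, k=5):
    """Return the indices of the k highest-scoring classes, best first."""
    return sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)[:k]


# Forward passes per image for the configuration named on the slide.
N_NETWORKS, N_SCALES, N_CROPS = 12, 14, 10
VIEWS_PER_IMAGE = N_NETWORKS * N_SCALES * N_CROPS
```

A top-5 prediction counts as correct when the true class index appears in `top_k(average_predictions(views), k=5)`.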
  • 10. Quiz: In case you got that right, what is this?
  • 12. Object preserving crops. Random crop selection might miss the object of interest: the network then tries to remember ‘butterfly’ when presented with leaves. Solution: use the provided boxes to ensure the crop contains the object.
      − For images without box annotation, use the best box predicted by the localization system.
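A minimal sketch of what an object preserving crop sampler could look like. The slides do not give the actual procedure, so everything here is an assumption: integer pixel coordinates, a box that lies inside the image, a crop no larger than the image, and a centring fallback for boxes larger than the crop. The function name is hypothetical.

```python
import random


def object_preserving_crop(img_w, img_h, box, crop_w, crop_h, rng=random):
    """Sample a crop_w x crop_h crop inside the image that keeps `box` in view.

    `box` is (x0, y0, x1, y1) in pixels. Returns the crop as (x0, y0, x1, y1).
    """

    def pick_start(img_len, crop_len, b0, b1):
        if b1 - b0 <= crop_len:
            # Box fits: any start in [b1 - crop_len, b0] covers it fully;
            # intersect with the starts that keep the crop inside the image.
            lo = max(0, b1 - crop_len)
            hi = min(img_len - crop_len, b0)
        else:
            # Assumed fallback: the box is larger than the crop, so centre on it.
            centre = (b0 + b1) // 2 - crop_len // 2
            lo = hi = min(max(centre, 0), img_len - crop_len)
        return rng.randint(lo, hi)

    bx0, by0, bx1, by1 = box
    x0 = pick_start(img_w, crop_w, bx0, bx1)
    y0 = pick_start(img_h, crop_h, by0, by1)
    return (x0, y0, x0 + crop_w, y0 + crop_h)
```

For an image without a box annotation, the same function would be called with the best box predicted by the localization system, as the slide describes.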
  • 13. Component breakdown (columns: Epochs; Single view; Multi-view):
      − First attempt at inception + batch norm: 112; 8.63%; 6.58%
      − Train ~325 epochs: 324; 8.77%; 6.34%
      − 32 images / mini-batch: 130; 8.74%; 6.68%
      − Object preserving, 32 images / mini-batch: 120; 8.59%; 6.51%
      − Object preserving with generated boxes: 130; 8.47%; 6.46%
      − Ensemble of 12: -; -; 4.84%
  • 14. Final classification results: top-5 classification error on test set. NeoNet is competitive on object classification.
      − SuperVision ('12): 16.4
      − Clarifai ('13): 11.7
      − GoogLeNet ('14): 6.7
      − Ioffe & Szegedy, ICML '15: 4.9
      − NeoNet: 4.8
      − Trimps-Soushen: 4.6
      − ReCeption: 3.6
      − MSRA: 3.6
  • 16. Localization overview. References: Girshick et al., PAMI 2016; Uijlings et al., IJCV 2013.
      Foundations:
      − Generate box proposals using fast selective search.
      − Train box-classification networks on crops.
      Object centric training:
      − Object pre-training network.
      − Object localization network.
      − Object alignment network.
  • 17. Object centric pre-training. Use the bounding box annotations for pre-training. Increase the number of classes from N to 2*N+1:
      − N classes for the object, well-framed.
      − N classes for partially framed objects.
      − 1 class for ‘background’, i.e., object not visible.
    1% – 1.5% improvement compared to standard pre-training.
  • 18. Object centric pre-training. Dual-head network to account for missing bounding boxes:
      − One with 1000 outputs.
      − One with 2001 outputs.
    No error gradient when box annotation is missing.
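The 2*N+1 labelling and the dual-head targets can be sketched as follows. The slides do not define when an object counts as ‘well-framed’, so the coverage measure, the 0.9 threshold, the class index layout, and the function names are all assumptions made for illustration.

```python
def box_coverage(crop, box):
    """Fraction of the box area that lies inside the crop (0.0 to 1.0)."""
    ix = max(0, min(crop[2], box[2]) - max(crop[0], box[0]))
    iy = max(0, min(crop[3], box[3]) - max(crop[1], box[1]))
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    return (ix * iy) / box_area


def training_targets(class_id, crop, box, n_classes=1000, well_framed=0.9):
    """Return (target for the N-way head, target for the 2N+1-way head).

    The second target is None when the image has no box annotation, so
    that head contributes no error gradient for this sample.
    """
    if box is None:
        return class_id, None
    coverage = box_coverage(crop, box)
    if coverage == 0:
        object_centric = 2 * n_classes          # 'background': object not visible
    elif coverage >= well_framed:               # assumed threshold
        object_centric = class_id               # well-framed object
    else:
        object_centric = n_classes + class_id   # partially framed object
    return class_id, object_centric
```

With N = 1000 this gives targets in the range 0 to 2000 for the second head, matching the 2001 outputs on the slide.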
  • 19. Object localization network. Fully connected layer on top of Inception 4e and 5b. Re-train Inception 5b and new head. Then fine-tune entire network.
  • 20. Quiz: Is this an entire skyscraper?
  • 21. Bordering the object. A 40% border worked best, such that in the 7x7 resolution of Inception 5b there is a 1 pixel border.
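One reading that is consistent with the 7x7 remark: if ‘40% border’ means the crop is 1.4 times the box width and height, the object spans 7 / 1.4 = 5 cells of the 7x7 map, leaving 1 cell of border on each side. A small sketch under that assumption (the function names are hypothetical, and the slides do not confirm this definition of the border):

```python
def add_border(box, border=0.4):
    """Grow (x0, y0, x1, y1) about its centre so each side is (1 + border) times as long."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * (1 + border) / 2
    half_h = (y1 - y0) * (1 + border) / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def object_cells(grid=7, border=0.4):
    """Number of grid cells per side that the object spans in the bordered crop."""
    return grid / (1 + border)
```

For a 100 x 100 box, `add_border` yields a crop of roughly 140 x 140 pixels centred on the object.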
  • 22. Object alignment network. Extra head for object box alignment. Classification head is also used, but with cross entropy cost.
  • 23. Object alignment border. Object box alignment moves corners up to 50% of the width and height. 100% border allows network to ‘see’ full range of possible alignments. ~2% gain.
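The relation between the 50% corner movement and the 100% border can be checked with a short sketch. The parametrization is an assumption: the alignment head is taken to output four offsets in [-1, 1], one per box edge, each scaled by 50% of the box width or height. The slides do not specify the output format, and the function names are hypothetical.

```python
def align_box(box, offsets, max_shift=0.5):
    """Shift each box edge by offset * max_shift * (box width or height).

    `offsets` holds four values (dx0, dy0, dx1, dy1), clamped to [-1, 1].
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    dx0, dy0, dx1, dy1 = (max(-1.0, min(1.0, o)) for o in offsets)
    return (x0 + dx0 * max_shift * w, y0 + dy0 * max_shift * h,
            x1 + dx1 * max_shift * w, y1 + dy1 * max_shift * h)


def visible_region(box, border=1.0):
    """Region seen by the network: the box grown to (1 + border) times its size."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * (1 + border) / 2
    half_h = (y1 - y0) * (1 + border) / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

Under these assumptions the largest possible aligned box, `align_box(box, (-1, -1, 1, 1))`, coincides with `visible_region(box, 1.0)`, which is why a 100% border covers the full range of alignments.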
  • 24. Component breakdown: top-5 localization error.
      − First attempt: 24.0%
      − 40% border, FC on top of inception 5b: 22.5%
      − FC on top of inception 5b+4e: 21.8%
      − Object centric pre-training: 20.3%
      − Ensemble of 8: 17.5%
      − Object alignment: 15.5%
      − Final result with ILSVRC blacklist applied: 14.5%
  • 25. Final localization results: top-5 localization error on test set. NeoNet is competitive on object localization.
      − UvA ('11): 42.5
      − SuperVision ('12): 34.2
      − OverFeat ('13): 30.0
      − VGG ('14): 25.3
      − NeoNet: 12.6
      − Trimps-Soushen: 12.3
      − MSRA: 9.0
  • 27. Improved selective search (Fast vs. Improved):
      − Color spaces: 2 vs. 3
      − Segmentations: 2 vs. 4
      − Similarity functions: 2 vs. 4
      − Average boxes: 1,600 vs. 5,000
      − MABO: 77.5 vs. 82.6
      − Time (s): 0.8 vs. 2.4
      − mAP: 41.2 vs. 44.0
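MABO (mean average best overlap) is the proposal-quality measure used with selective search: for each class, average the best intersection-over-union that any proposal achieves with each ground-truth box, then average over classes. A minimal sketch; the data layout and function names are assumptions, and the result is a fraction, whereas the slide reports percentages.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mabo(ground_truth, proposals):
    """Mean over classes of the average best overlap.

    ground_truth: {class_name: [(image_id, box), ...]}
    proposals:    {image_id: [box, ...]}
    """
    abo_per_class = []
    for objects in ground_truth.values():
        best = [max((iou(box, p) for p in proposals.get(image_id, [])), default=0.0)
                for image_id, box in objects]
        abo_per_class.append(sum(best) / len(best))
    return sum(abo_per_class) / len(abo_per_class)
```

More proposals per image can only raise the best overlap, which fits the trade-off in the table: the improved variant produces more boxes and a higher MABO at a higher time cost.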
  • 28. Object detection network. Five inception-style networks for feature extraction:
      − Two trained on 1,000 object classes, no input border, fine-tuning on detection boxes.
      − Three trained on 1,000 object windows with input border, no fine-tuning.
  • 29. Component breakdown: mAP on validation set.
      − Best object class network: 44.6
      − Best object centric network: 47.7
      − Ensemble of 5: 51.9
  • 30. Component breakdown: mAP on validation set. Four classification networks fine-tuned with 200 detection class labels.
      − Best object class network: 44.6
      − Best object centric network: 47.7
      − Ensemble of 5: 51.9
      − + context: 53.2
  • 31. Component breakdown: mAP on validation set.
      − Best object class network: 44.6
      − Best object centric network: 47.7
      − Ensemble of 5: 51.9
      − + context: 53.2
      − + object alignment: 54.6
  • 32. Final detection results: mean average precision on test set. NeoNet is competitive on object detection.
      − UvA/Euvision ('13): 22.6
      − GoogLeNet ('14): 43.9
      − Deep-ID Net: 52.7
      − NeoNet: 53.6
      − MSRA: 62.1
  • 34. Places 2 overview. Our best submission: an ensemble of two inception nets.
      − Reduce fully connected layer from 1,000 to 401 outputs.
      − Use pre-trained weights from ImageNet 1,000 (~325 epochs).
      − Train Inception 5b and fully connected layer for two epochs.
      − Fine-tune entire network for eight epochs.
    Adding other networks reduced the accuracy.
  • 35. Component breakdown (top-5 error; columns: Single view; Multi view):
      − ~325 epochs pre-training: 17.9%; 16.8%
      − First attempt. 112 epochs pre-training: 19.1%; 17.9%
      − 512 channel 5b, Alex-style FC head: 20.0%; 18.4%
      − 32 images / batch: 18.7%; 17.6%
      − Randomized ReLU: 18.2%; 17.5%
      − Ensemble of 7: -; 16.7%
      − Ensemble of 2: -; 16.5%
  • 36. Final Places 2 results: top-5 classification error on test set. NeoNet is competitive on scene classification.
      − HiVision: 20
      − MERL: 19.4
      − ntu_rose: 19.3
      − Trimps-Soushen: 18.0
      − NeoNet: 17.6
      − SIAT_MMLAB: 17.4
      − WM: 16.9
  • 38. Summary. Key component: object centric training. Results (score; ranking):
      − Classification: 4.8; -
      − Localization: 12.6; 3
      − Detection: 53.6; 2
      − Places 2: 17.6; 3
  • 39. Thank you. Nothing in these materials is an offer to sell any of the components or devices referenced herein. ©2013-2015 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm and Snapdragon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Zeroth is a trademark of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT. For more information on Qualcomm, visit us at: www.qualcomm.com & www.qualcomm.com/blog