SlideShare ist ein Scribd-Unternehmen logo
1 von 61
Downloaden Sie, um offline zu lesen
Gang Yu
旷 视 研 究 院
Object Detection in Recent 3 Years
Beyond RetinaNet and Mask R-CNN
Schedule of Tutorial
• Lecture 1: Beyond RetinaNet and Mask R-CNN (Gang Yu)
• Lecture 2: AutoML for Object Detection (Xiangyu Zhang)
• Lecture 3: Finegrained Visual Analysis (Xiu-shen Wei)
Outline
• Introduction to Object Detection
• Modern Object detectors
• One Stage detector vs Two-stage detector
• Challenges
• Backbone
• Head
• Pretraining
• Scale
• Batch Size
• Crowd
• NAS
• Fine-Grained
• Conclusion
Outline
• Introduction to Object Detection
• Modern Object detectors
• One Stage detector vs Two-stage detector
• Challenges
• Backbone
• Head
• Pretraining
• Scale
• Batch Size
• Crowd
• NAS
• Fine-Grained
• Conclusion
What is object detection?
What is object detection?
Detection - Evaluation Criteria
Average Precision (AP) and mAP
Figures are from wikipedia
Detection - Evaluation Criteria
mmAP
Figures are from http://cocodataset.org
How to perform a detection?
• Sliding window: enumerate all the windows (up to millions of windows)
• VJ detector: cascade chain
• Fully Convolutional network
• shared computation
Robust Real-time Object Detection; Viola, Jones; IJCV 2001
http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf
General Detection Before Deep Learning
• Feature + classifier
• Feature
• Haar Feature
• HOG (Histogram of Gradient)
• LBP (Local Binary Pattern)
• ACF (Aggregated Channel Feature)
• …
• Classifier
• SVM
• Bootsing
• Random Forest
Traditional Hand-crafted Feature: HoG
Traditional Hand-crafted Feature: HoG
General Detection Before Deep Learning
Traditional Methods
• Pros
• Efficient to compute (e.g., HAAR, ACF) on CPU
• Easy to debug, analyze the bad cases
• reasonable performance on limited training data
• Cons
• Limited performance on large dataset
• Hard to be accelerated by GPU
Deep Learning for Object Detection
Based on the whether following the “proposal and refine”
• One Stage
• Example: Densebox, YOLO (YOLO v2), SSD, Retina Net
• Keyword: Anchor, Divide and conquer, loss sampling
• Two Stage
• Example: RCNN (Fast RCNN, Faster RCNN), RFCN, FPN,
MaskRCNN
• Keyword: speed, performance
A bit of History
Image
Feature
Extractor
classification
localization
(bbox)
One stage detector
Densebox (2015) UnitBox (2016) EAST (2017)
YOLO (2015) Anchor Free
Anchor importedYOLOv2 (2016)
SSD (2015)
RON(2017)
RetinaNet(2017)
DSSD (2017)
two stages detector
Image
Feature
Extractor
classification
localization
(bbox)
Proposal
classification
localization
(bbox)
Refine
RCNN (2014) Fast RCNN(2015)
Faster RCNN (2015)
RFCN (2016)
MultiBox(2014)
RFCN++ (2017)
FPN (2017)
Mask RCNN (2017)
OverFeat(2013)
Outline
• Introduction to Object Detection
• Modern Object detectors
• One Stage detector vs Two-stage detector
• Challenges
• Backbone
• Head
• Pretraining
• Scale
• Batch Size
• Crowd
• NAS
• Fine-Grained
• Conclusion
Modern Object detectors
Backbone Head
• Modern object detectors
• RetinaNet
• f1-f7 for backbone, f3-f7 with 4 convs for head
• FPN with ROIAlign
• f1-f6 for backbone, two fcs for head
• Recall vs localization
• One stage detector: Recall is high but compromising the localization ability
• Two stage detector: Strong localization ability
Postprocess
NMS
One Stage detector: RetinaNet
• FPN Structure
• Focal loss
Focal Loss for Dense Object Detection, Lin etc, ICCV 2017 Best student paper
One Stage detector: RetinaNet
• FPN Structure
• Focal loss
Focal Loss for Dense Object Detection, Lin etc, ICCV 2017 Best student paper
Two-Stage detector: FPN/Mask R-CNN
• FPN Structure
• ROIAlign
Mask R-CNN, He etc, ICCV 2017 Best paper
What is next for object detection?
• The pipeline seems to be mature
• There still exists a large gap between existing state-of-arts and product
requirements
• The devil is in the detail
Outline
• Introduction to Object Detection
• Modern Object detectors
• One Stage detector vs Two-stage detector
• Challenges
• Backbone
• Head
• Pretraining
• Scale
• Batch Size
• Crowd
• NAS
• Fine-Grained
• Conclusion
Challenges Overview
• Backbone
• Head
• Pretraining
• Scale
• Batch Size
• Crowd
• NAS
• Fine-grained
Backbone Head Postprocess
NMS
Challenges - Backbone
• Backbone network is designed for classification task but not for
localization task
• Receptive Field vs Spatial resolution
• Only f1-f5 is pretrained but randomly initializing f6 and f7 (if applicable)
Backbone - DetNet
• DetNet: A Backbone network for Object Detection, Li etc, 2018,
https://arxiv.org/pdf/1804.06215.pdf
Backbone - DetNet
Backbone - DetNet
Backbone - DetNet
Backbone - DetNet
Backbone - DetNet
Challenges - Head
• Speed is significantly improved for the two-stage detector
• RCNN - > Fast RCNN -> Faster RCNN - > RFCN
• How to obtain efficient speed as one stage detector like YOLO, SSD?
• Small Backbone
• Light Head
Head – Light head RCNN
• Light-Head R-CNN: In Defense of Two-Stage Object Detector, 2017,
https://arxiv.org/pdf/1711.07264.pdf
Code: https://github.com/zengarden/light_head_rcnn
Head – Light head RCNN
• Backbone
• L: Resnet101
• S: Xception145
• Thin Feature map
• L:C_{mid} = 256
• S: C_{mid} =64
• C_{out} = 10 * 7 * 7
• R-CNN subnet
• A fc layer is connected to the PS ROI pool/Align
Head – Light head RCNN
Head – Light head RCNN
Head – Light head RCNN
• Mobile Version
• ThunderNet: Towards Real-time Generic Object Detection, Qin etc, Arxiv
2019
• https://arxiv.org/abs/1903.11752
Pretraining – Objects365
• ImageNet pretraining is usually employed for backbone training
• Training from Scratch
• Scratch Det claims GN/BN is important
• Rethinking ImageNet Pretraining validates that training time is important
Pretraining – Objects365
• Objects365 Dataset
Pretraining – Objects365
• Pretraining with Objects365 vs ImageNet vs from Sctratch
Pretraining – Objects365
• Pretraining on Backbone or Pretraining on both backbone and head
Pretraining – Objects365
• Results on VOC Detection & VOC Segmentation
Pretraining – Objects365
• Summary
• Pretraining is important to reduce the training time
• Pretraining with a large dataset is beneficial for the performance
Challenges - Scale
• Scale variations is extremely large for object detection
Challenges - Scale
• Scale variations is extremely large for object detection
• Previous works
• Divide and Conquer: SSD, DSSD, RON, FPN, …
• Limited Scale variation
• Scale Normalization for Image Pyramids, Singh etc, CVPR2018
• Slow inference speed
• How to address extremely large scale variation without compromising
inference speed?
Scale - SFace
• SFace: An Efficient Network for Face Detection in Large Scale Variations,
2018, http://cn.arxiv.org/pdf/1804.06559.pdf
• Anchor-based:
• Good localization for the scales which are covered by anchors
• Difficult to address all the scale ranges of faces
• Anchor-free:
• Able to cover various face scales
• Not good for the localization ability
Scale - SFace
Scale - SFace
Scale - SFace
Scale - SFace
• Summary:
• Integrate anchor-based and anchor-free for the scale issue
• A new benchmark for face detection with large scale variations: 4K Face
Challenges - Batchsize
• Small mini-batchsize for general object detection
• 2 for R-CNN, Faster RCNN
• 16 for RetinaNet, Mask RCNN
• Problem with small mini-batchsize
• Long training time
• Insufficient BN statistics
• Inbalanced pos/neg ratio
Batchsize – MegDet
• MegDet: A Large Mini-Batch Object Detector, CVPR2018,
https://arxiv.org/pdf/1711.07240.pdf
Batchsize – MegDet
• Techniques
• Learning rate warmup
• Cross-GPU Batch Normalization
Challenges - Crowd
• NMS is a post-processing step to eliminate multiple responses on one object
instance
• Reasonable for mild crowdness like COCO and VOC
• Will Fail in the case when the objects are in a crowd
Challenges - Crowd
• A few works have been devoted to this topic
• Softnms, Bodla etc, ICCV 2017, http://www.cs.umd.edu/~bharat/snms.pdf
• Relation Networks, Hu etc, CVPR 2018,
https://arxiv.org/pdf/1711.11575.pdf
• Lacking a good benchmark for evaluation in the literature
Crowd - CrowdHuman
• CrowdHuman: A Benchmark for Detecting Human in a Crowd, 2018,
https://arxiv.org/pdf/1805.00123.pdf, http://www.crowdhuman.org/
• A benchmark with Head, Visible Human, Full body bounding-box
• Generalization ability for other head/pedestrian datasets
• Crowdness
Crowd - CrowdHuman
Crowd-CrowdHuman
Crowd-CrowdHuman
• Generalization
• Head
• Pedestrian
• COCO
Conclusion
• The task of object detection is still far from solved
• Details are important to further improve the performance
• Backbone
• Head
• Pretraining
• Scale
• Batchsize
• Crowd
• The improvement of object detection will be a significantly boost for the
computer vision industry
广告部分
• Megvii Detection 知乎专栏
Email: yugang@megvii.com
Object Detection Beyond Mask R-CNN and RetinaNet I

Weitere ähnliche Inhalte

Was ist angesagt?

Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Edureka!
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetSungminYou
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
Recursive Neural Networks
Recursive Neural NetworksRecursive Neural Networks
Recursive Neural NetworksSangwoo Mo
 
Resnet.pptx
Resnet.pptxResnet.pptx
Resnet.pptxYanhuaSi
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)Fellowship at Vodafone FutureLab
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learningAntonio Rueda-Toicen
 
An introduction to Deep Learning
An introduction to Deep LearningAn introduction to Deep Learning
An introduction to Deep LearningJulien SIMON
 
Quantum neural network
Quantum neural networkQuantum neural network
Quantum neural networksurat murthy
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural networkFerdous ahmed
 
Image Classification using deep learning
Image Classification using deep learning Image Classification using deep learning
Image Classification using deep learning Asma-AH
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Sujit Pal
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Simplilearn
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기NAVER Engineering
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detectionWenjing Chen
 

Was ist angesagt? (20)

Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
 
Convolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNetConvolutional neural network from VGG to DenseNet
Convolutional neural network from VGG to DenseNet
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Recursive Neural Networks
Recursive Neural NetworksRecursive Neural Networks
Recursive Neural Networks
 
Resnet.pptx
Resnet.pptxResnet.pptx
Resnet.pptx
 
Yolo releases gianmaria
Yolo releases gianmariaYolo releases gianmaria
Yolo releases gianmaria
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
 
An introduction to Deep Learning
An introduction to Deep LearningAn introduction to Deep Learning
An introduction to Deep Learning
 
Recurrent neural network
Recurrent neural networkRecurrent neural network
Recurrent neural network
 
Quantum neural network
Quantum neural networkQuantum neural network
Quantum neural network
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
 
Image Classification using deep learning
Image Classification using deep learning Image Classification using deep learning
Image Classification using deep learning
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
 
Yolo
YoloYolo
Yolo
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detection
 

Ähnlich wie Object Detection Beyond Mask R-CNN and RetinaNet I

Computer vision for transportation
Computer vision for transportationComputer vision for transportation
Computer vision for transportationWanjin Yu
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...NAVER Engineering
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNNJunho Cho
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...Edge AI and Vision Alliance
 
Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition Intel Nervana
 
Object extraction from satellite imagery using deep learning
Object extraction from satellite imagery using deep learningObject extraction from satellite imagery using deep learning
Object extraction from satellite imagery using deep learningAly Abdelkareem
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Jihong Kang
 
FPT17: An object detector based on multiscale sliding window search using a f...
FPT17: An object detector based on multiscale sliding window search using a f...FPT17: An object detector based on multiscale sliding window search using a f...
FPT17: An object detector based on multiscale sliding window search using a f...Hiroki Nakahara
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewLEE HOSEONG
 
Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19ExtremeEarth
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with TransformersDatabricks
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesUnited States Air Force Academy
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)DonghyunKang12
 
How static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ codeHow static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ codecppfrug
 
Information Exploitation at BBN
Information Exploitation at BBNInformation Exploitation at BBN
Information Exploitation at BBNPlamen Petrov
 
Week5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxWeek5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxfahmi324663
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012Jinwon Lee
 

Ähnlich wie Object Detection Beyond Mask R-CNN and RetinaNet I (20)

Computer vision for transportation
Computer vision for transportationComputer vision for transportation
Computer vision for transportation
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNN
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
 
Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition
 
Object extraction from satellite imagery using deep learning
Object extraction from satellite imagery using deep learningObject extraction from satellite imagery using deep learning
Object extraction from satellite imagery using deep learning
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
 
FPT17: An object detector based on multiscale sliding window search using a f...
FPT17: An object detector based on multiscale sliding window search using a f...FPT17: An object detector based on multiscale sliding window search using a f...
FPT17: An object detector based on multiscale sliding window search using a f...
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with Transformers
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large Repositories
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
 
OBDPC 2022
OBDPC 2022OBDPC 2022
OBDPC 2022
 
How static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ codeHow static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ code
 
Information Exploitation at BBN
Information Exploitation at BBNInformation Exploitation at BBN
Information Exploitation at BBN
 
Week5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxWeek5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptx
 
Manycores for the Masses
Manycores for the MassesManycores for the Masses
Manycores for the Masses
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012
 

Mehr von Wanjin Yu

Architecture Design for Deep Neural Networks III
Architecture Design for Deep Neural Networks IIIArchitecture Design for Deep Neural Networks III
Architecture Design for Deep Neural Networks IIIWanjin Yu
 
Intelligent Multimedia Recommendation
Intelligent Multimedia RecommendationIntelligent Multimedia Recommendation
Intelligent Multimedia RecommendationWanjin Yu
 
Architecture Design for Deep Neural Networks II
Architecture Design for Deep Neural Networks IIArchitecture Design for Deep Neural Networks II
Architecture Design for Deep Neural Networks IIWanjin Yu
 
Architecture Design for Deep Neural Networks I
Architecture Design for Deep Neural Networks IArchitecture Design for Deep Neural Networks I
Architecture Design for Deep Neural Networks IWanjin Yu
 
Causally regularized machine learning
Causally regularized machine learningCausally regularized machine learning
Causally regularized machine learningWanjin Yu
 
Object Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet IIIObject Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet IIIWanjin Yu
 
Object Detection Beyond Mask R-CNN and RetinaNet II
Object Detection Beyond Mask R-CNN and RetinaNet IIObject Detection Beyond Mask R-CNN and RetinaNet II
Object Detection Beyond Mask R-CNN and RetinaNet IIWanjin Yu
 
Visual Search and Question Answering II
Visual Search and Question Answering IIVisual Search and Question Answering II
Visual Search and Question Answering IIWanjin Yu
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Wanjin Yu
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Wanjin Yu
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Wanjin Yu
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Wanjin Yu
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Wanjin Yu
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Wanjin Yu
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Wanjin Yu
 

Mehr von Wanjin Yu (15)

Architecture Design for Deep Neural Networks III
Architecture Design for Deep Neural Networks IIIArchitecture Design for Deep Neural Networks III
Architecture Design for Deep Neural Networks III
 
Intelligent Multimedia Recommendation
Intelligent Multimedia RecommendationIntelligent Multimedia Recommendation
Intelligent Multimedia Recommendation
 
Architecture Design for Deep Neural Networks II
Architecture Design for Deep Neural Networks IIArchitecture Design for Deep Neural Networks II
Architecture Design for Deep Neural Networks II
 
Architecture Design for Deep Neural Networks I
Architecture Design for Deep Neural Networks IArchitecture Design for Deep Neural Networks I
Architecture Design for Deep Neural Networks I
 
Causally regularized machine learning
Causally regularized machine learningCausally regularized machine learning
Causally regularized machine learning
 
Object Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet IIIObject Detection Beyond Mask R-CNN and RetinaNet III
Object Detection Beyond Mask R-CNN and RetinaNet III
 
Object Detection Beyond Mask R-CNN and RetinaNet II
Object Detection Beyond Mask R-CNN and RetinaNet IIObject Detection Beyond Mask R-CNN and RetinaNet II
Object Detection Beyond Mask R-CNN and RetinaNet II
 
Visual Search and Question Answering II
Visual Search and Question Answering IIVisual Search and Question Answering II
Visual Search and Question Answering II
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
 

Kürzlich hochgeladen

Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.soniya singh
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Servicegwenoracqe6
 
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girlsstephieert
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersDamian Radcliffe
 
SEO Growth Program-Digital optimization Specialist
SEO Growth Program-Digital optimization SpecialistSEO Growth Program-Digital optimization Specialist
SEO Growth Program-Digital optimization SpecialistKHM Anwar
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girladitipandeya
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsstephieert
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxellan12
 
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goahorny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goasexy call girls service in goa
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...Diya Sharma
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLimonikaupta
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Servicesexy call girls service in goa
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort ServiceEnjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort ServiceDelhi Call girls
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirtrahman018755
 

Kürzlich hochgeladen (20)

Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
10.pdfMature Call girls in Dubai +971563133746 Dubai Call girls
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
SEO Growth Program-Digital optimization Specialist
SEO Growth Program-Digital optimization SpecialistSEO Growth Program-Digital optimization Specialist
SEO Growth Program-Digital optimization Specialist
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girls
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goahorny (9316020077 ) Goa  Call Girls Service by VIP Call Girls in Goa
horny (9316020077 ) Goa Call Girls Service by VIP Call Girls in Goa
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort ServiceEnjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
 
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya ShirtChallengers I Told Ya Shirt
Challengers I Told Ya ShirtChallengers I Told Ya Shirt
 

Object Detection Beyond Mask R-CNN and RetinaNet I

  • 1. Gang Yu 旷 视 研 究 院 Object Detection in Recent 3 Years Beyond RetinaNet and Mask R-CNN
  • 2. Schedule of Tutorial • Lecture 1: Beyond RetinaNet and Mask R-CNN (Gang Yu) • Lecture 2: AutoML for Object Detection (Xiangyu Zhang) • Lecture 3: Finegrained Visual Analysis (Xiu-shen Wei)
  • 3. Outline • Introduction to Object Detection • Modern Object detectors • One Stage detector vs Two-stage detector • Challenges • Backbone • Head • Pretraining • Scale • Batch Size • Crowd • NAS • Fine-Grained • Conclusion
  • 4. Outline • Introduction to Object Detection • Modern Object detectors • One Stage detector vs Two-stage detector • Challenges • Backbone • Head • Pretraining • Scale • Batch Size • Crowd • NAS • Fine-Grained • Conclusion
  • 5. What is object detection?
  • 6. What is object detection?
  • 7. Detection - Evaluation Criteria Average Precision (AP) and mAP Figures are from wikipedia
  • 8. Detection - Evaluation Criteria mmAP Figures are from http://cocodataset.org
  • 9. How to perform a detection? • Sliding window: enumerate all the windows (up to millions of windows) • VJ detector: cascade chain • Fully Convolutional network • shared computation Robust Real-time Object Detection; Viola, Jones; IJCV 2001 http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf
  • 10. General Detection Before Deep Learning • Feature + classifier • Feature • Haar Feature • HOG (Histogram of Gradient) • LBP (Local Binary Pattern) • ACF (Aggregated Channel Feature) • … • Classifier • SVM • Bootsing • Random Forest
  • 13. General Detection Before Deep Learning Traditional Methods • Pros • Efficient to compute (e.g., HAAR, ACF) on CPU • Easy to debug, analyze the bad cases • reasonable performance on limited training data • Cons • Limited performance on large dataset • Hard to be accelerated by GPU
  • 14. Deep Learning for Object Detection Based on the whether following the “proposal and refine” • One Stage • Example: Densebox, YOLO (YOLO v2), SSD, Retina Net • Keyword: Anchor, Divide and conquer, loss sampling • Two Stage • Example: RCNN (Fast RCNN, Faster RCNN), RFCN, FPN, MaskRCNN • Keyword: speed, performance
  • 15. A bit of History Image Feature Extractor classification localization (bbox) One stage detector Densebox (2015) UnitBox (2016) EAST (2017) YOLO (2015) Anchor Free Anchor importedYOLOv2 (2016) SSD (2015) RON(2017) RetinaNet(2017) DSSD (2017) two stages detector Image Feature Extractor classification localization (bbox) Proposal classification localization (bbox) Refine RCNN (2014) Fast RCNN(2015) Faster RCNN (2015) RFCN (2016) MultiBox(2014) RFCN++ (2017) FPN (2017) Mask RCNN (2017) OverFeat(2013)
  • 16. Outline • Introduction to Object Detection • Modern Object detectors • One Stage detector vs Two-stage detector • Challenges • Backbone • Head • Pretraining • Scale • Batch Size • Crowd • NAS • Fine-Grained • Conclusion
  • 17. Modern Object detectors Backbone Head • Modern object detectors • RetinaNet • f1-f7 for backbone, f3-f7 with 4 convs for head • FPN with ROIAlign • f1-f6 for backbone, two fcs for head • Recall vs localization • One stage detector: Recall is high but compromising the localization ability • Two stage detector: Strong localization ability Postprocess NMS
  • 18. One Stage detector: RetinaNet • FPN Structure • Focal loss Focal Loss for Dense Object Detection, Lin etc, ICCV 2017 Best student paper
  • 19. One Stage detector: RetinaNet • FPN Structure • Focal loss Focal Loss for Dense Object Detection, Lin etc, ICCV 2017 Best student paper
  • 20. Two-Stage detector: FPN/Mask R-CNN • FPN Structure • ROIAlign Mask R-CNN, He etc, ICCV 2017 Best paper
  • 21. What is next for object detection? • The pipeline seems to be mature • There still exists a large gap between existing state-of-arts and product requirements • The devil is in the detail
  • 22. Outline • Introduction to Object Detection • Modern Object detectors • One Stage detector vs Two-stage detector • Challenges • Backbone • Head • Pretraining • Scale • Batch Size • Crowd • NAS • Fine-Grained • Conclusion
  • 23. Challenges Overview • Backbone • Head • Pretraining • Scale • Batch Size • Crowd • NAS • Fine-grained Backbone Head Postprocess NMS
  • 24. Challenges - Backbone • Backbone network is designed for classification task but not for localization task • Receptive Field vs Spatial resolution • Only f1-f5 is pretrained but randomly initializing f6 and f7 (if applicable)
  • 25. Backbone - DetNet • DetNet: A Backbone network for Object Detection, Li etc, 2018, https://arxiv.org/pdf/1804.06215.pdf
  • 31. Challenges - Head • Speed is significantly improved for the two-stage detector • RCNN - > Fast RCNN -> Faster RCNN - > RFCN • How to obtain efficient speed as one stage detector like YOLO, SSD? • Small Backbone • Light Head
  • 32. Head – Light head RCNN • Light-Head R-CNN: In Defense of Two-Stage Object Detector, 2017, https://arxiv.org/pdf/1711.07264.pdf Code: https://github.com/zengarden/light_head_rcnn
  • 33. Head – Light head RCNN • Backbone • L: Resnet101 • S: Xception145 • Thin Feature map • L:C_{mid} = 256 • S: C_{mid} =64 • C_{out} = 10 * 7 * 7 • R-CNN subnet • A fc layer is connected to the PS ROI pool/Align
  • 34. Head – Light head RCNN
  • 35. Head – Light head RCNN
  • 36. Head – Light head RCNN • Mobile Version • ThunderNet: Towards Real-time Generic Object Detection, Qin etc, Arxiv 2019 • https://arxiv.org/abs/1903.11752
  • 37. Pretraining – Objects365 • ImageNet pretraining is usually employed for backbone training • Training from Scratch • Scratch Det claims GN/BN is important • Rethinking ImageNet Pretraining validates that training time is important
  • 38. Pretraining – Objects365 • Objects365 Dataset
  • 39. Pretraining – Objects365 • Pretraining with Objects365 vs ImageNet vs from Sctratch
  • 40. Pretraining – Objects365 • Pretraining on Backbone or Pretraining on both backbone and head
  • 41. Pretraining – Objects365 • Results on VOC Detection & VOC Segmentation
  • 42. Pretraining – Objects365 • Summary • Pretraining is important to reduce the training time • Pretraining with a large dataset is beneficial for the performance
  • 43. Challenges - Scale • Scale variations is extremely large for object detection
  • 44. Challenges - Scale • Scale variations is extremely large for object detection • Previous works • Divide and Conquer: SSD, DSSD, RON, FPN, … • Limited Scale variation • Scale Normalization for Image Pyramids, Singh etc, CVPR2018 • Slow inference speed • How to address extremely large scale variation without compromising inference speed?
  • 45. Scale - SFace • SFace: An Efficient Network for Face Detection in Large Scale Variations, 2018, http://cn.arxiv.org/pdf/1804.06559.pdf • Anchor-based: • Good localization for the scales which are covered by anchors • Difficult to address all the scale ranges of faces • Anchor-free: • Able to cover various face scales • Not good for the localization ability
  • 49. Scale - SFace • Summary: • Integrate anchor-based and anchor-free for the scale issue • A new benchmark for face detection with large scale variations: 4K Face
  • 50. Challenges - Batchsize • Small mini-batchsize for general object detection • 2 for R-CNN, Faster RCNN • 16 for RetinaNet, Mask RCNN • Problem with small mini-batchsize • Long training time • Insufficient BN statistics • Inbalanced pos/neg ratio
  • 51. Batchsize – MegDet • MegDet: A Large Mini-Batch Object Detector, CVPR2018, https://arxiv.org/pdf/1711.07240.pdf
  • 52. Batchsize – MegDet • Techniques • Learning rate warmup • Cross-GPU Batch Normalization
  • 53. Challenges - Crowd • NMS is a post-processing step to eliminate multiple responses on one object instance • Reasonable for mild crowdness like COCO and VOC • Will Fail in the case when the objects are in a crowd
  • 54. Challenges - Crowd • A few works have been devoted to this topic • Softnms, Bodla etc, ICCV 2017, http://www.cs.umd.edu/~bharat/snms.pdf • Relation Networks, Hu etc, CVPR 2018, https://arxiv.org/pdf/1711.11575.pdf • Lacking a good benchmark for evaluation in the literature
  • 55. Crowd - CrowdHuman • CrowdHuman: A Benchmark for Detecting Human in a Crowd, 2018, https://arxiv.org/pdf/1805.00123.pdf, http://www.crowdhuman.org/ • A benchmark with Head, Visible Human, Full body bounding-box • Generalization ability for other head/pedestrian datasets • Crowdness
  • 59. Conclusion • The task of object detection is still far from solved • Details are important to further improve the performance • Backbone • Head • Pretraining • Scale • Batchsize • Crowd • The improvement of object detection will be a significantly boost for the computer vision industry
  • 60. 广告部分 • Megvii Detection 知乎专栏 Email: yugang@megvii.com