R-FCN: Object Detection via Region-based Fully
Convolutional Networks
Paper : https://arxiv.org/abs/1605.06409
Authors:
Jifeng Dai Microsoft Research
Yi Li∗ Tsinghua University
Kaiming He Microsoft Research
Jian Sun Microsoft Research
Presented By:
Ashish
R-CNN based detectors
● Fast R-CNN or Faster R-CNN,
○ Fast R-CNN computes the feature maps from the whole image once.
○ It then derives the region proposals (ROIs) from the feature maps directly.
○ No further feature extraction is needed for the individual ROIs (about 2000 ROIs).
● Both detectors process object detection in 2 stages:
○ Generate region proposals (ROIs), and
○ Make classification and localization (bounding box) predictions from the ROIs.
R-FCN
● R-FCN improves speed by reducing the amount of work needed for each ROI.
● The region-based feature maps are independent of the ROIs, so they can be computed
once for the whole image, outside each ROI.
● As a result, R-FCN is faster than Fast R-CNN or Faster R-CNN.
● R-FCN stands for Region-based Fully Convolutional Network.
Backbone Architecture
1. R-FCN in this paper is based on ResNet-101.
2. ResNet-101 has 100 convolutional layers followed by global average pooling and a
1000-class fc layer.
3. We remove the average pooling layer and the fc layer and only use the convolutional
layers to compute feature maps.
4. We use ResNet-101 pre-trained on ImageNet.
5. The last convolutional block in ResNet-101 is 2048-d, and we attach a randomly
initialized 1024-d 1×1 convolutional layer for reducing dimension.
6. Then we apply the k²(C + 1)-channel convolutional layer to generate score maps, as
introduced next.
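The channel counts in the steps above can be traced with a few lines of Python. This is only an illustrative sketch: the function name and dictionary keys are made up here, and the example values (C = 20 classes, k = 3) are assumptions chosen to match the 3 × 3 grid used later in these slides.

```python
def rfcn_head_channels(num_classes, k=3, backbone_dim=2048, reduced_dim=1024):
    """Channel counts along the R-FCN head described in the list above.

    num_classes is C, the number of object categories (background excluded).
    The function name and the dictionary keys are illustrative only.
    """
    return {
        "backbone_out": backbone_dim,                  # last ResNet-101 conv block
        "after_1x1_reduce": reduced_dim,               # randomly initialized 1x1 conv
        "cls_score_maps": k * k * (num_classes + 1),   # k^2 (C + 1) score maps
    }


# Example values are assumptions: 20 object classes and a 3 x 3 grid.
channels = rfcn_head_channels(num_classes=20, k=3)
print(channels)
```

With these example values the last layer produces 9 × 21 = 189 score maps.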
R-FCN
Data Flow R-FCN
Region based, fully convolutional network for accurate and efficient object
detection.
Position Sensitive Score Maps
● Each map detects (scores) a sub-region of the object.
Source: https://medium.com/@jonathan_hui/understanding-region-based-fully-convolutional-networks-r-fcn-for-object-detection-828316f07c99
position-sensitive ROI-pool
Source: https://medium.com/@jonathan_hui/understanding-region-based-fully-convolutional-networks-r-fcn-for-object-detection-828316f07c99
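A minimal pure-Python sketch of position-sensitive ROI pooling for a single class may help here. It assumes the ROI is given in feature-map coordinates and lies fully inside the maps, that map index i·k + j belongs to bin (i, j), and that each bin is average-pooled. The function and variable names are hypothetical, not taken from the paper's code.

```python
import math


def ps_roi_pool(score_maps, roi, k):
    """Position-sensitive ROI pooling for one class (illustrative sketch).

    score_maps: list of k*k 2-D maps (lists of rows); map i*k + j is assumed
                to be the map dedicated to bin (i, j) = (row, column).
    roi:        (x0, y0, w, h) in integer feature-map coordinates,
                assumed to lie inside the maps.
    Returns a k x k grid holding the average score of each bin.
    """
    x0, y0, w, h = roi
    pooled = [[0.0] * k for _ in range(k)]
    for i in range(k):                                   # bin row
        y_start = y0 + math.floor(i * h / k)
        y_end = y0 + math.ceil((i + 1) * h / k)
        for j in range(k):                               # bin column
            x_start = x0 + math.floor(j * w / k)
            x_end = x0 + math.ceil((j + 1) * w / k)
            fmap = score_maps[i * k + j]                 # each bin reads only its own map
            values = [fmap[y][x]
                      for y in range(y_start, y_end)
                      for x in range(x_start, x_end)]
            pooled[i][j] = sum(values) / len(values)
    return pooled


# Toy input: four 4x4 maps, map m filled with the constant m + 1.
maps = [[[float(m + 1)] * 4 for _ in range(4)] for m in range(4)]
pooled = ps_roi_pool(maps, roi=(0, 0, 4, 4), k=2)
print(pooled)
```

Because each bin looks only at its own map, the toy input yields one distinct value per bin, which makes the position sensitivity easy to see.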
Class Score
Let’s say we have C classes to detect.
We expand it to C + 1 classes so we include a new class for the background (non-object).
Each class will have its own 3 × 3 score maps and therefore a total of (C+1) × 3 × 3 score maps.
Source: https://medium.com/@jonathan_hui/understanding-region-based-fully-convolutional-networks-r-fcn-for-object-detection-828316f07c99
Classification
Using its own set of score maps, we predict a class score for each class.
Then we apply a softmax on those scores to compute the probability for each class.
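The two steps above (one score per class, then a softmax) can be sketched as follows. The sketch assumes the per-class score is obtained by averaging the pooled k × k bin values, and that index 0 is the background class; both the function name and that ordering are assumptions made for illustration.

```python
import math


def class_probabilities(pooled_per_class):
    """Turn pooled k x k grids (one per class) into class probabilities.

    pooled_per_class: list of C + 1 grids; index 0 is assumed to be background.
    Each class score is the average of its grid (voting by averaging),
    followed by a numerically stable softmax over the classes.
    """
    scores = [sum(sum(row) for row in grid) / (len(grid) * len(grid[0]))
              for grid in pooled_per_class]
    top = max(scores)                          # subtract the max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example with two classes and 2 x 2 grids.
probs = class_probabilities([
    [[0.0, 0.0], [0.0, 0.0]],   # background: every bin scores 0
    [[1.0, 1.0], [1.0, 1.0]],   # object class: every bin scores 1
])
print(probs)
```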
Position Sensitive Score map (i,j):
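The pooled response of bin (i, j) for class c can be written as below. This is a reconstruction that follows the paper's definition of position-sensitive RoI pooling; the symbol names are assumptions wherever the slide text does not spell them out.

```latex
r_c(i, j \mid \Theta) \;=\; \frac{1}{n} \sum_{(x, y) \in \mathrm{bin}(i, j)} z_{i, j, c}\,(x + x_0,\; y + y_0 \mid \Theta)
```

Here z_{i,j,c} is the one score map out of the k²(C + 1) maps that belongs to bin (i, j) and class c, (x_0, y_0) is the top-left corner of the RoI, n is the number of pixels in the bin, and Θ denotes the learnable parameters of the network.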
Training
1. With pre-computed region proposals, it is easy to train the R-FCN
architecture end-to-end.
2. The loss function defined on each RoI is the sum of the cross-entropy loss
and the box regression loss.
3. Positive RoIs are those that have an intersection-over-union (IoU) overlap with a
ground-truth box of at least 0.5; all other RoIs are negative.
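The IoU rule in item 3 can be illustrated with a short sketch. Boxes are assumed to be (x1, y1, x2, y2) tuples with x2 > x1 and y2 > y1; the helper names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_positive(roi, gt_boxes, threshold=0.5):
    """An RoI is positive if it overlaps some ground-truth box with IoU >= threshold."""
    return any(iou(roi, gt) >= threshold for gt in gt_boxes)


gt = [(0, 0, 10, 20)]
print(is_positive((0, 0, 10, 10), gt))    # IoU = 100 / 200 = 0.5
print(is_positive((5, 0, 15, 10), gt))    # IoU = 50 / 250 = 0.2
```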
R-FCN uses online hard example mining (OHEM) [22] during training. OHEM is an
online bootstrapping algorithm for training region-based ConvNet object
detectors like Fast R-CNN.
Because the per-RoI computation in R-FCN is negligible, example mining is nearly cost-free.
Training
1. Assuming N proposals per image, in the forward pass, we evaluate the loss of
all N proposals.
2. Then we sort all RoIs (positive and negative) by loss and select B RoIs that
have the highest loss.
3. Backpropagation is performed on the selected RoIs.
4. We use a weight decay of 0.0005 and a momentum of 0.9.
5. By default we use single-scale training: images are resized such that the scale (shorter
side of image) is 600 pixels.
6. Each GPU holds 1 image and selects B = 128 RoIs for backprop.
7. We train the model with 8 GPUs.
8. We fine-tune R-FCN using a learning rate of 0.001 for 20k mini-batches and
0.0001 for 10k mini-batches on VOC.
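The hard-example selection in steps 1 to 3 amounts to a sort followed by a cut. A minimal sketch, assuming the per-RoI losses from the forward pass are already available as a plain list (the function name is hypothetical):

```python
def select_hard_rois(losses, batch_size):
    """Return the indices of the batch_size RoIs with the highest loss.

    losses:     per-RoI loss values from the forward pass (all N proposals).
    batch_size: B, the number of RoIs kept for backpropagation.
    Indices are returned hardest first; ties keep their original order.
    """
    order = sorted(range(len(losses)), key=lambda idx: losses[idx], reverse=True)
    return order[:batch_size]


# Toy example: N = 4 proposals, B = 2 kept for backprop.
losses = [0.1, 2.0, 0.5, 1.5]
hard = select_hard_rois(losses, batch_size=2)
print(hard)
```

In the setting described above, B would be 128 per image; only the selected indices contribute gradients.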
Loss Function
● Cross entropy for classification
● Bounding Box Regression loss
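Written out, the per-RoI loss combines the two terms above. The form below follows the paper's definition; the symbol names are assumptions where the slide does not give them.

```latex
L(s, t_{x,y,w,h}) \;=\; L_{cls}(s_{c^*}) \;+\; \lambda \, [c^* > 0] \, L_{reg}(t, t^*),
\qquad L_{cls}(s_{c^*}) = -\log s_{c^*}
```

Here c* is the ground-truth label of the RoI (c* = 0 means background), [c* > 0] is an indicator that switches off the regression term for background RoIs, t* is the ground-truth box, L_reg is the bounding box regression loss, and λ is a balance weight.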
Inference
1. The feature maps shared between RPN and R-FCN are computed (on an
image with a single scale of 600).
2. Then the RPN part proposes RoIs, on which the R-FCN part evaluates
category-wise scores and regresses bounding boxes.
3. During inference we evaluate 300 RoIs.
4. The results are post-processed by non-maximum suppression (NMS) using a
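The NMS post-processing step can be sketched as a greedy loop. This is a generic greedy NMS, not necessarily the exact variant used by the authors. The IoU threshold is a required argument with no default, and the value in the example is purely illustrative.

```python
def nms(boxes, scores, iou_threshold):
    """Greedy non-maximum suppression (generic sketch).

    boxes:         list of (x1, y1, x2, y2) tuples.
    scores:        one confidence score per box.
    iou_threshold: a box is dropped if its IoU with an already kept,
                   higher-scoring box exceeds this value.
    Returns the indices of the kept boxes, highest score first.
    """
    def iou(a, b):
        inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = inter_w * inter_h
        union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / union if union > 0 else 0.0

    order = sorted(range(len(boxes)), key=lambda idx: scores[idx], reverse=True)
    keep = []
    for idx in order:
        if all(iou(boxes[idx], boxes[kept]) <= iou_threshold for kept in keep):
            keep.append(idx)
    return keep


# Illustrative values only; the threshold 0.5 is an arbitrary choice for this example.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)
```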
Impact of Depth
Impact of Region Proposals
Results
R-FCN is reported to run 2.5-20× faster than its Faster R-CNN counterpart at test time.
Conclusion
