This document summarizes a presentation on Deformable ConvNets v2, which improves on the original Deformable ConvNets model. It introduces modulated deformable convolution modules that adjust not only sampling offsets but also the amplitudes of input features. It also presents a technique called R-CNN feature mimicking that focuses the spatial support of predictions on the region of interest, reducing the influence of irrelevant image content. Visualization of spatial support shows that predictions can be affected by content outside the region of interest. Through this increased modeling power and stronger training of Deformable ConvNets, the paper achieves significant gains on COCO for object detection and instance segmentation.
7. Spatial Support Visualization
• Effective sampling / bin locations
• Effective receptive fields
• Error-bounded saliency regions
• We can determine a node’s support region as the
smallest image region giving the same response as the
full image, within a small error bound. We refer to this
as the error-bounded saliency region
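The definition above suggests a simple greedy search: start from the full image and repeatedly shrink the kept region from each side, as long as the node's response stays within the error bound of the full-image response. The toy sketch below illustrates this idea on a single linear "node"; the node definition, the border-shrinking order, and the function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def node_response(image, weights):
    # Toy "node": a single linear unit over the whole image.
    return float(np.sum(image * weights))

def error_bounded_saliency(image, weights, eps=1e-3):
    """Greedily shrink the kept region (zeroing everything outside it) while
    the node response stays within eps of the full-image response.
    Returns the smallest (top, bottom, left, right) crop found."""
    full = node_response(image, weights)
    t, b, l, r = 0, image.shape[0], 0, image.shape[1]
    shrunk = True
    while shrunk:
        shrunk = False
        for side in ("t", "b", "l", "r"):
            nt, nb, nl, nr = t, b, l, r
            if side == "t": nt += 1
            elif side == "b": nb -= 1
            elif side == "l": nl += 1
            else: nr -= 1
            if nt >= nb or nl >= nr:
                continue  # region would become empty
            masked = np.zeros_like(image)
            masked[nt:nb, nl:nr] = image[nt:nb, nl:nr]
            if abs(node_response(masked, weights) - full) <= eps:
                t, b, l, r = nt, nb, nl, nr
                shrunk = True
    return t, b, l, r
```

For a node whose weights are concentrated on a single pixel, the search collapses to exactly that pixel, matching the intuition that the saliency region covers only the content that actually influences the response.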
8. Observations from these visualizations
• Regular ConvNets can model geometric variations to some extent; their
weights are learned to accommodate some degree of geometric
transformation.
• By introducing deformable convolution, the network’s ability to model geometric
transformation is considerably enhanced. However, while nodes on the
background have expanded support that encompasses greater context, the range
of spatial support may be inexact.
• In Deformable ConvNets, predictions are jointly affected by learned offsets and network
weights. Examining sampling locations alone can result in misleading conclusions
about Deformable ConvNets.
10. • The spatial support of the 2fc node in the per-RoI detection head is examined.
• The error-bounded saliency regions in both aligned RoIpooling and
Deformable RoIpooling are not fully focused on the object foreground,
which suggests that image content outside of the RoI affects the
prediction result.
11. More Deformable ConvNets
• Stacking More Deformable Conv Layers
• In this paper, deformable convolutions are applied in all the 3×3 conv layers in stages conv3, conv4, and conv5 in ResNet-50.
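As a quick check on what "all the 3×3 conv layers in conv3–conv5" amounts to, recall the standard ResNet-50 bottleneck counts (3, 4, 6, 3 blocks for conv2–conv5), with exactly one 3×3 conv per bottleneck. The helper below is an illustrative sketch (the function name is my own):

```python
# Standard ResNet-50 bottleneck counts per stage; each bottleneck contains
# exactly one 3x3 conv, which is the layer replaced by a deformable conv.
RESNET50_BLOCKS = {"conv2": 3, "conv3": 4, "conv4": 6, "conv5": 3}

def deformable_layer_count(stages=("conv3", "conv4", "conv5")):
    """Count the 3x3 conv layers that become deformable when the given
    ResNet-50 stages are converted."""
    return sum(RESNET50_BLOCKS[s] for s in stages)
```

Converting conv3 through conv5 thus replaces 13 layers, compared with only 3 when deformable convolution is restricted to conv5.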
12. Modulated Deformable Modules
• With modulation, the Deformable ConvNets modules can not only adjust offsets
in perceiving input features, but also modulate the input feature
amplitudes from different spatial locations / bins.
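Per output location p, the modulated deformable convolution computes y(p) = Σ_k w_k · x(p + p_k + Δp_k) · Δm_k, where p_k ranges over the regular grid, Δp_k are learned offsets (sampled bilinearly since they are fractional), and Δm_k ∈ [0, 1] are the new modulation scalars. A minimal single-channel, single-location numpy sketch (function names and the toy setup are illustrative):

```python
import numpy as np

def bilinear(x, py, px):
    """Bilinearly sample the single-channel map x at fractional (py, px);
    samples falling outside the map contribute zero."""
    h, w = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    val = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                val += (1 - abs(py - yy)) * (1 - abs(px - xx)) * x[yy, xx]
    return val

def modulated_deform_conv_at(x, w, offsets, mask, p):
    """y(p) = sum_k w_k * x(p + p_k + dp_k) * dm_k over a 3x3 grid, where
    dp_k = offsets[k] are learned offsets and dm_k = mask[k] in [0, 1] are
    the modulation scalars introduced in this paper."""
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for k, (gy, gx) in enumerate(grid):
        sy = p[0] + gy + offsets[k][0]
        sx = p[1] + gx + offsets[k][1]
        out += w[k] * mask[k] * bilinear(x, sy, sx)
    return out
```

With zero offsets and modulation fixed at 1 this reduces to a regular 3×3 convolution, and driving a modulation scalar toward 0 suppresses the corresponding sample entirely, which is how the module can ignore irrelevant spatial locations.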
13. R-CNN Feature Mimicking
• Image content outside of the RoI
may thus affect the extracted
features and consequently
degrade the final results of
object detection.
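Feature mimicking addresses this by training the per-RoI feature of the main network to resemble the feature an auxiliary R-CNN branch extracts from the cropped RoI content alone; the paper uses a cosine-similarity loss of the form 1 − cos(f_RCNN(b), f_FRCNN(b)) per RoI b. A minimal numpy sketch of that loss (variable names are my own):

```python
import numpy as np

def mimic_loss(f_frcnn, f_rcnn):
    """Per-RoI feature-mimicking loss: 1 - cosine similarity between the
    feature from the Faster R-CNN head (f_frcnn) and the feature the R-CNN
    branch extracts from the cropped RoI content (f_rcnn)."""
    cos = float(np.dot(f_frcnn, f_rcnn) /
                (np.linalg.norm(f_frcnn) * np.linalg.norm(f_rcnn)))
    return 1.0 - cos
```

The loss is zero when the two features point in the same direction and grows as they diverge, pushing the main network's RoI features toward representations that depend only on content inside the RoI.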
16. Conclusion and Inspiration
• Despite the superior performance of Deformable ConvNets in modeling
geometric variations, its spatial support extends well beyond the region of
interest, causing features to be influenced by irrelevant image content.
• In this paper, the author presents a reformulation of Deformable ConvNets which
improves its ability to focus on pertinent image regions, through increased
modeling power and stronger training. Significant performance gains are
obtained on the COCO benchmark for object detection and instance
segmentation.
• Small objects: is context necessary?
• Re-exploring R-CNN
Editor's notes
The response of a network node will not change if we remove image regions that do not influence it, as demonstrated in recent research on image saliency.
Together with other motivations (e.g., to share fewer features between the classification and bounding box regression branches), the authors propose to combine the classification scores of Faster R-CNN and R-CNN to obtain the final detection score. Since R-CNN classification scores are focused on cropped image content from the input RoI, incorporating them would help to alleviate the redundant context problem and improve detection accuracy. However, the combined system is slow because both the Faster R-CNN and R-CNN branches need to be applied in both training and inference.