Mobility Technologies Co., Ltd.
Tackling Open Images Challenge
- presented at the 26th Symposium on Sensing via Image Information
June 12, 2020
Hiroto Honda, Mobility Technologies Co., Ltd.
1 About Me
About Me
Hiroto Honda
https://hirotomusiker.github.io/
Kaggle name: Schwert (‘Schwert’ means ‘sword’ in German)
Career: R&D of imaging devices at a Japanese electronics company
→ DeNA computer vision team → Mobility Technologies
Check out my Blog Series!
https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd
Digging into Detectron 2 (object detection)
2 Kaggle and Open Images Challenge
How to Try Kaggle
How can you maximize your model’s score on the HIDDEN test data?
Cross Validation and Test data
- Train data: split into train data and val data, with a different val split in each fold
- Test data → public leaderboard and private leaderboard
Evaluation metrics are described in each competition’s ‘Evaluation’ section: mean average precision, Dice coefficient, and so on. Sometimes non-standard metrics are employed and discussed in the ‘Discussion’ threads.
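The cross-validation split above can be sketched in a few lines of plain Python. This is only an illustration: the function name `k_fold_splits`, the fold count of 3, and the seed are assumptions, not part of the talk.

```python
import random


def k_fold_splits(num_items, num_folds=3, seed=0):
    """Yield (train_indices, val_indices) pairs, one per fold."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every num_folds-th shuffled index, starting at offset i.
    folds = [indices[i::num_folds] for i in range(num_folds)]
    for i, val in enumerate(folds):
        # Everything outside the current val fold becomes train data.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


if __name__ == "__main__":
    for fold_id, (train, val) in enumerate(k_fold_splits(num_items=10)):
        print(fold_id, len(train), len(val))
```

Each item lands in the val data of exactly one fold, so the val scores averaged over folds use every training item once, while the hidden test data stays untouched.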
Open Images Challenge
Open Images Dataset (v5):
9 million images collected from Flickr
・16M bounding box annotations of 600 classes on 1.9M images
・Segmentation polygons for instances of 350 classes
・329 inter-object relationships
https://storage.googleapis.com/openimages/web/challenge.html
https://www.kaggle.com/c/open-images-2019-object-detection/
How Huge Is the Open Images Dataset?
1 GB of bounding box data!! (on 500 GB of image data)
3 How to Tackle Object Detection Challenges
Object Detection
- detects object positions, sizes and classes from an image
- tremendous success of deep-learning-based approaches
(e.g. Faster R-CNN, YOLO, and EfficientDet)
Okay, Why Not Code Object Detectors Yourself?
NOT RECOMMENDED!
What an Object Detector Looks Like
https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd
[Figure: detector architecture with three components: Backbone Network, Region Proposal Network, ROI Head]
The accuracy reported in papers is achieved by managing more than 100 config parameters.
How Hard It Was to Reproduce YOLOv3 in PyTorch
- It took months to fully reproduce the original repo’s accuracy.
- Implementation details such as weight initialization, loss definition, and learning-rate schedule are critical.
https://github.com/DeNA/PyTorch_YOLOv3
blog: https://medium.com/@hirotoschwert/reproducing-training-performance-of-yolov3-in-pytorch-part-0-a792e15ac90d
You Should Care About Tiny Accuracy Differences
| Model Name | Published | AP |
|---|---|---|
| A: Faster R-CNN Res50 | NIPS’15 | 34.8 |
| B: Faster R-CNN Res50 + Feature Pyramid Network | CVPR’17 | 36.7 |
| C: RetinaNet (single-shot) Res50 Feature Pyramid Network + Focal Loss | ICCV’17 | 35.7 |
Model B from a non-official repo, with AP = 33.0, is less accurate than the official model A.
Popular and Reliable Detection Frameworks
- MMDetection (CUHK): https://github.com/open-mmlab/mmdetection
- Detectron 2 (Facebook): https://github.com/facebookresearch/detectron2
- automl/efficientdet (Google): https://github.com/google/automl/tree/master/efficientdet
- tpu/models (Google): https://github.com/tensorflow/tpu/tree/master/models/official
- R. Wightman repos (tf->pytorch, non-official): https://github.com/rwightman
Authors’ official repos are generally recommended.
Schwert used maskrcnn-benchmark for the competition.
How to Choose Approaches for Large-scale Detection Competition
It takes 1 GPU-month to train one model! One attempt is so costly...
Good resources for finding “An Exclusive Feature that Apparently Contributes to the score” (EFAC):
1: Last year’s solutions
2: Detection papers (CVPR, ICCV…)
3: Benchmark websites such as Papers with Code
Example of Bad Experiment
“Looks like ResNet50 works... OK, let’s try ResNeXt101... and why not add Random Cropping?”
model 1 (baseline) + new feature A + new feature B → model 2
It is important to add / remove one exclusive feature at a time!
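One way to enforce the one-feature-at-a-time rule is to generate every experiment config from the baseline by changing a single key. The sketch below is hypothetical: the config keys, the values, and the helper names are made up for illustration and do not come from any detection framework.

```python
# Hypothetical baseline configuration; keys and values are illustrative only.
BASELINE = {"backbone": "ResNet50", "random_crop": False}

# Candidate changes, each touching exactly one key.
CANDIDATES = [("backbone", "ResNeXt101"), ("random_crop", True)]


def one_change_experiments(baseline, candidates):
    """Return one config per candidate, each differing from the baseline in one key."""
    experiments = []
    for key, value in candidates:
        config = dict(baseline)  # copy so the baseline stays untouched
        config[key] = value
        experiments.append(config)
    return experiments


def changed_keys(baseline, config):
    """List the keys whose values differ from the baseline."""
    return [k for k in baseline if baseline[k] != config[k]]


if __name__ == "__main__":
    for cfg in one_change_experiments(BASELINE, CANDIDATES):
        print(changed_keys(BASELINE, cfg), cfg)
```

With configs built this way, any score difference against the baseline can be attributed to a single feature, which matters when each run costs a GPU-month.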
4 Schwert’s Solution
Results of Open Images Competition (2019)
Schwert’s ranks:
・Detection Track: 6th / 558 (Gold) [1] [2]
・Segmentation Track: 11th / 193 (Silver) [3]
・Relationship Track: 30th / 201 (Silver)
Detection Track leaderboard (top 8):
| # | Team Name | # of members | score |
|---|---|---|---|
| 1 | MMfruit | 5 | 0.65887 |
| 2 | imagesearch | 7 | 0.65337 |
| 3 | Prisms | 6 | 0.64214 |
| 4 | PFDet | 6 | 0.62221 |
| 5 | Omni-Detection | 3 | 0.60406 |
| 6 | Schwert | 1 (solo) | 0.60231 |
| 7 | Team 5 | 5 | 0.60210 |
| 8 | pudae | 1 (solo) | 0.59727 |
Got a solo gold medal in my first Kaggle competition!
“An Exclusive Feature that Apparently Contributes to the score” (EFAC)
EFAC examples from the solution writeups of Open Images 2018 [4][5][6]:
・class balancing (3rd place, +5 pts)
・ensemble (1st / 3rd place, +5 pts)
・voting NMS (1st / 3rd place)
・long cosine annealing (2nd place)
・parent class expansion
・ResNeXt152 + SE (1st, 2nd, 3rd place)
Class balancing and model ensembling are essential.
Evaluation Metrics
Mean Average Precision (mAP) at IoU > 0.5, averaged over 500 classes
1: EVERY class is equal, even if it’s extremely rare.
      images including ‘person’ instances: 250,000
      ‘torch’ instances: 18
2: Strict localization is not required; classification matters...
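To make the metric concrete, here is a simplified sketch of IoU, per-class AP, and mAP. It is an assumption-laden illustration, not the official challenge evaluator: it handles a single image, matches each detection to the best still-unmatched ground-truth box, and ignores Open Images specifics such as group-of boxes and the class hierarchy. All function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, iou_thresh=0.5):
    """AP for one class. detections: list of (score, box); gt_boxes: list of boxes."""
    if not gt_boxes:
        return 0.0
    matched = [False] * len(gt_boxes)
    hits = []  # 1 = true positive, 0 = false positive, in descending score order
    for _, box in sorted(detections, key=lambda d: -d[0]):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gt_boxes):
            overlap = iou(box, gt)
            if not matched[j] and overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_iou > iou_thresh:
            matched[best_j] = True
            hits.append(1)
        else:
            hits.append(0)
    # Precision and recall after each detection.
    tp, precisions, recalls = 0, [], []
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(gt_boxes))
    # Make precision non-increasing in recall, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_by_class):
    """Unweighted mean over classes: a rare class counts as much as a common one."""
    return sum(ap_by_class.values()) / len(ap_by_class)
```

Because the mean is unweighted, a class with 18 instances moves the final score exactly as much as a class with hundreds of thousands of instances, which is why rare classes deserve attention.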
Method 1: Class Balancing [1]
- The model encounters every class with equal probability.
- Rare classes: increase the sampling rate.
- Non-rare classes: limit the number of images.
- Total number of images: 4k x 500 (2M) → efficient training
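A minimal sketch of this kind of balancing, assuming a hypothetical mapping from class name to image IDs. The function name, the repeat-then-sample strategy for rare classes, and the seed are illustrative choices; the exact sampling procedure used in [1] may differ.

```python
import random


def balance_classes(images_by_class, images_per_class=4000, seed=0):
    """Return {class: image_ids} with exactly images_per_class entries per class."""
    rng = random.Random(seed)
    balanced = {}
    for cls, image_ids in images_by_class.items():
        if not image_ids:
            continue  # nothing to sample from
        if len(image_ids) >= images_per_class:
            # Non-rare class: cap the number of images.
            balanced[cls] = rng.sample(image_ids, images_per_class)
        else:
            # Rare class: repeat every image, then top up with a random subset.
            repeats, remainder = divmod(images_per_class, len(image_ids))
            balanced[cls] = image_ids * repeats + rng.sample(image_ids, remainder)
    return balanced


if __name__ == "__main__":
    data = {"person": list(range(100)), "torch": [1000, 1001, 1002]}
    out = balance_classes(data, images_per_class=10)
    print({cls: len(ids) for cls, ids in out.items()})
```

With 500 classes and 4k images per class this yields the 2M-image training set mentioned above, regardless of how skewed the raw class counts are.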
Method 2: Ensembling Pipeline of Multiple Models [1]
・Baseline model: ResNeXt152 [7] + Deformable ConvNets v2 [8] + Feature Pyramid Network [9]
・Train different types of models on training data with different seeds
・8 models are ensembled
Ablation Study
Contribution of each exclusive feature on val and leaderboard accuracies
| Backbone | Deformable Convolutions | Parent Expansion | Data Size | val AP | private LB |
|---|---|---|---|---|---|
| ResNeXt101 | None | Inference Time | 4k per class | 69.8 | 54.0 |
| ResNeXt101 | DCN v2 | Inference Time | 4k per class | 72.2 (+2.4) | - |
| ResNeXt152 | None | Inference Time | 4k per class | 72.2 (+2.4) | - |
| ResNeXt152 | None | Inference Time | 16k per class | 72.4 (+2.6) | - |
| ResNeXt152 | DCN v2 | Inference Time | 4k per class | 73.2 (+3.4) | 56.4 (best single model) |
| ResNeXt152 | None | Training Time | 4k per class | 72.4 (+2.6)* | - |
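The "Parent Expansion" column refers to the Open Images class hierarchy: a box for a child class also counts for its parent classes. A rough sketch of expansion at inference time follows, under the assumption that each class has at most one parent. The hierarchy entries and function names here are hypothetical placeholders, not the real Open Images label tree.

```python
# Hypothetical child -> parent mapping; the real hierarchy is much larger.
PARENT = {"Jaguar": "Carnivore", "Carnivore": "Mammal", "Mammal": "Animal"}


def ancestors(label, parent=PARENT):
    """Return all ancestors of a label, nearest first."""
    chain = []
    while label in parent:
        label = parent[label]
        chain.append(label)
    return chain


def expand_detections(detections, parent=PARENT):
    """detections: list of (label, score, box). Adds a copy of each box per ancestor."""
    expanded = []
    for label, score, box in detections:
        expanded.append((label, score, box))
        for ancestor in ancestors(label, parent):
            # Same box and score, relabelled with the parent class.
            expanded.append((ancestor, score, box))
    return expanded


if __name__ == "__main__":
    print(expand_detections([("Jaguar", 0.9, (0, 0, 1, 1))]))
```

Doing the same expansion on the annotations before training would correspond to the "Training Time" row of the table.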
Method 3: Enhanced (Voting) NMS [6]
Non-Maximum Suppression for Model Ensembling
When boxes from different models overlap, the resulting box earns the added confidence scores.
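The idea can be sketched as a small variation on greedy NMS. This is an illustrative reading of the description above, not the implementation from [6]: the rule that each other model votes at most once per kept box, the plain sum of scores, and all names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def voting_nms(detections, iou_thresh=0.5):
    """detections: list of (model_id, score, box) for one class in one image.

    Returns a list of (score, box). Overlapping boxes are suppressed as in
    greedy NMS, but boxes from other models add their score to the kept box.
    """
    remaining = sorted(detections, key=lambda d: -d[1])
    kept = []
    while remaining:
        model_id, score, box = remaining.pop(0)
        voters = {model_id}
        survivors = []
        for other_model, other_score, other_box in remaining:
            if iou(box, other_box) > iou_thresh:
                # Suppressed; a model that has not voted yet adds its score.
                if other_model not in voters:
                    voters.add(other_model)
                    score += other_score
            else:
                survivors.append((other_model, other_score, other_box))
        remaining = survivors
        kept.append((score, box))
    return kept
```

Boxes that several models agree on end up ranked above boxes proposed by a single model, which helps a metric that rewards correct classification more than tight localization.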
Result of 8 Model Ensembling
| Backbone | Deformable Convolutions | Parent Expansion | Data Size | val AP | private LB |
|---|---|---|---|---|---|
| ResNeXt152 | DCN v2 | Inference Time | 4k per class | 73.2 (+3.4) | 56.4 (best single model, ~13th place) |
| Ensemble of 8 models + NMS tuned | - | - | - | - | 60.23 (6th place!) |
Visualization Demo of the Best Single Model
Schwert’s Approach on Segmentation Track (11th Place) [3]
・Independently train detection and segmentation
・Inference results using detection model
5 Take-Home Messages
・Kaggle is a wonderful platform where you can learn cutting-edge computer vision methods and implementations. Discussion with great Kagglers is always fun.
・Like research, it’s a tough but fun job to develop (or surpass) the state-of-the-art methods.
・Choosing a reliable framework is a must for object detection competitions.
・Understand the past solutions and pick an Exclusive Feature that Apparently Contributes to the score (EFAC).
References
[1] Hiroto Honda, “The 6th Place Solution for the Open Images 2019 Object Detection Track,” presented at ICCVW 2019, https://hirotomusiker.github.io/files/schwert_open_images_6th_solution_v1.pdf
[2] Hiroto Honda, “6th place solution,” discussion in Open Images 2019 Object Detection Track, https://www.kaggle.com/c/open-images-2019-object-detection/discussion/110953
[3] Hiroto Honda, “11th place solution,” discussion in Open Images 2019 Instance Segmentation Track, https://www.kaggle.com/c/open-images-2019-instance-segmentation/discussion/111351
[4] kivajok, 1st place writeup, https://storage.googleapis.com/openimages/web/challenge.html
[5] Takuya Akiba et al., “PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track,” arXiv:1809.00778
[6] Yuan Gao et al., “Solution for Large-Scale Hierarchical Object Detection Datasets with Incomplete Annotation and Data Imbalance,” arXiv:1810.06208
[7] Saining Xie et al., “Aggregated Residual Transformations for Deep Neural Networks,” CVPR 2017
[8] Xizhou Zhu et al., “Deformable ConvNets v2: More Deformable, Better Results,” CVPR 2019
[9] Tsung-Yi Lin et al., “Feature Pyramid Networks for Object Detection,” CVPR 2017
* All the photos used in this presentation were taken by Hiroto Honda.
Please refrain from reproducing or copying the text, images, or other content without permission.

Tackling Open Images Challenge (2019)

  • 1. Mobility Technologies Co., Ltd. Tackling Open Images Challenge - presented at the 26th Symposium on Sensing via Image Information June 12, 2020 Hiroto Honda, Mobility Technologies Co., Ltd.
  • 2. Mobility Technologies Co., Ltd.2 1 About Me
  • 3. Mobility Technologies Co., Ltd.3 About Me Hiroto Honda https://hirotomusiker.github.io/ kaggle name : Schwert ‘Schwert’ = sword in German R&D of Imaging devices in a Japanese Electronics company → DeNA computer vision team →Mobility Technologies  
  • 4. Mobility Technologies Co., Ltd.4 Check out my Blog Series! https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd Digging into Detectron 2 (object detection)
  • 5. Mobility Technologies Co., Ltd.5 2 Kaggle and Open Images Challenge
  • 6. Mobility Technologies Co., Ltd. Val Data 6 How to Try Kaggle Test data →private leaderboard →public leaderboard Train Data How can you maximize your model’s score on the HIDDEN test data? Evaluation metrics are described in the ‘Evaluation’ section - mean average precision、Dice Coefficient, and so on. Sometimes non-standard metrics are employed and discussed in the ‘Discussion’ threads. Cross Validation and Test data Val Data Train Data Val Data Train Data
  • 7. Mobility Technologies Co., Ltd.7 Open Images Dataset (v5) : 900 million images collected from Flickr ・16M Bounding box annotations of 600 classes on 1.9M images ・Segmentation polygons on 350-class instances ・329 inter-object relationship Open Images Challenge https://storage.googleapis.com/openimages/web/challenge.html https://www.kaggle.com/c/open-images-2019-object-detection/
  • 8. Mobility Technologies Co., Ltd.8 1GB of bounding box data!! (on 500GB of image data) How Huge is Open Images Dataset ?
  • 9. Mobility Technologies Co., Ltd.9 3 How to Tackle Object Detection Challenges
  • 10. Mobility Technologies Co., Ltd.10 Object Detection - detects object positions, sizes and classes from an image - tremendous success of deep-learning-based approaches (e.g. Faster R-CNN, YOLO, and EfficientDet)
  • 11. Mobility Technologies Co., Ltd.11 NOT RECOMMENDED! Okay, Why Not Code Object Detectors
  • 12. Mobility Technologies Co., Ltd.12 What an Object Detector Looks Like https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd
  • 13. Mobility Technologies Co., Ltd.13 Backbone Network Region Proposal Network ROI Head accuracy written in papers is achieved by managing more than 100 config parameters https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd What an Object Detector Looks Like
  • 14. Mobility Technologies Co., Ltd.14 How It Was Hard to Reproduce YOLOv3 in PyTorch took months to perfectly reproduce the original repo’s accuracy. implementation details such as weight init, loss definition, and lr schedule are critical https://github.com/DeNA/PyTorch_YOLOv3 blog: https://medium.com/@hirotoschwert/reproducing-training-performance-of-yolov3-in-pytorch-part-0-a792e15ac90d
  • 15. Mobility Technologies Co., Ltd.15 You Should Care Tiny Accuracy Differences Model Name AP A: Faster R-CNN Res50 34.8 B: Faster R-CNN Res50 + Feature Pyramid Network 36.7 C: RetinaNet (single-shot) Res50 Feature Pyramid Network + Focal Loss 35.7 NIPS’15 CVPR’17 ICCV’17 model B from a non-official repo with AP=33.0 is less accurate than the official model A
  • 16. Mobility Technologies Co., Ltd.16 MMDetection (CUHK)  https://github.com/open-mmlab/mmdetection Detectron 2 (Facebook) https://github.com/facebookresearch/detectron2 automl/efficientdet (Google) https://github.com/google/automl/tree/master/efficientdet tpu/models (Google) https://github.com/tensorflow/tpu/tree/master/models/official R. Wightman repos (tf->pytorch, non-official) https://github.com/rwightman Popular and Reliable Detection Frameworks Authors’ official repos are basically recommended Schwert used maskrcnn-benchmark for the competition
  • 17. Mobility Technologies Co., Ltd. 17 takes 1 GPU month to train one model! How to Choose Approaches for Large-scale Detection Competition 1month one attempt is so costly...
  • 18. Mobility Technologies Co., Ltd.18 1:Last Year’s solutions 2:Detection papers (CVPR, ICCV…) 3:Benchmark website such as papers with code are good resources to find: “An Exclusive Feature that Apparently Contributes to the score” (EFAC) How to Choose Approaches for Large-scale Detection Competition
  • 19. Mobility Technologies Co., Ltd.19 Looks like ResNet50 works.. OK, let’s try ResNeXt101 ...and why not adding Random Cropping_ Example of Bad Experiment model 1 (baseline) new feature A new feature B model 2 Important to add / remove one exclusive feature at a time!
  • 20. Mobility Technologies Co., Ltd.20 4 Schwert’s Solution
  • 21. Mobility Technologies Co., Ltd.21 Schwert’s ranks: Detection Track: 6th / 558 (Gold) [1] [2] Segmentation Track: 11th / 193 (Silver) [3] Relationship Track: 30th / 201 (Silver) Results of Open Images Competition (2019) # Team Name # of members score 1 MMfruit 5 0.65887 2 imagesearch 7 0.65337 3 Prisms 6 0.64214 4 PFDet 6 0.62221 5 Omni-Detection 3 0.60406 6 Schwert 1 (solo) 0.60231 7 Team 5 5 0.60210 8 pudae 1 (solo) 0.59727 Got a solo gold medal at the first kaggle competition!
  • 22. Mobility Technologies Co., Ltd.22 “An Exclusive Feature that Apparently Contributes to the score” (EFAC) EFAC examples from the solution writeups of Open Images 2018 [4][5][6] ・class balancing (3rd、5pts↑) ・Ensemble (1st / 3rd、5pts↑) ・voting NMS (1st / 3rd) ・long cosine annealing (2nd) ・parent class expansion ・ResNext 152 + SE (1st, 2nd, 3rd) class balancing and model ensemble are essential
  • 23. Mobility Technologies Co., Ltd.23 mean Average Precision (mAP) at IoU > 0.5 , avg of 500 classes 1: EVERY class is equal, even if it’s extremely rare.       images including ‘person’ instances:250,000        ‘torch’ instances : 18 2: Strict localization is not required. classification matters... Evaluation Metrics
  • 24. Method 1: Class Balancing [1]
    - Give the model an equal probability of encountering each class.
    - Rare classes: increase the sampling rate.
    - Non-rare classes: limit the number of images.
    - Total number of images: 4k x 500 (2M) → efficient training
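One way to picture the class balancing on this slide is the sketch below. It is an assumption-laden illustration, not the sampler from [1]: `build_balanced_epoch`, its parameters, and the choice to oversample rare classes by drawing with replacement are all hypothetical.

```python
import random


def build_balanced_epoch(images_by_class, per_class=4000, seed=0):
    """Return a shuffled list of (class_name, image_id) with per_class entries per class.

    images_by_class maps a class name to the list of image ids containing it.
    """
    rng = random.Random(seed)
    epoch = []
    for cls, image_ids in sorted(images_by_class.items()):
        if not image_ids:
            continue  # nothing to sample for this class
        if len(image_ids) >= per_class:
            # Non-rare class: cap the number of images (no repeats).
            picked = rng.sample(image_ids, per_class)
        else:
            # Rare class: oversample by drawing with replacement (assumption).
            picked = [rng.choice(image_ids) for _ in range(per_class)]
        epoch.extend((cls, image_id) for image_id in picked)
    rng.shuffle(epoch)
    return epoch


# Tiny hypothetical dataset: one frequent class, one rare class.
toy = {"person": list(range(100)), "torch": [1000, 1001]}
toy_epoch = build_balanced_epoch(toy, per_class=10)
```

With `per_class=4000` and 500 classes this would give the 2M images per epoch quoted on the slide.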
  • 25. Method 2: Ensembling Pipeline of Multiple Models [1]
    ・Baseline model: ResNeXt152 [7] + Deformable ConvNets v2 [8] + Feature Pyramid Network [9]
    ・Train different types of models on the training data with different seeds
    ・8 models are ensembled
  • 26. Ablation Study: contribution of each exclusive feature to val and leaderboard accuracies
    Backbone    Deformable Convolutions  Parent Expansion  Data Size      val AP        private LB
    ResNeXt101  None                     Inference Time    4k per class   69.8          54.0
    ResNeXt101  DCN v2                   Inference Time    4k per class   72.2 (+2.4)
    ResNeXt152  None                     Inference Time    4k per class   72.2 (+2.4)
    ResNeXt152  None                     Inference Time    16k per class  72.4 (+2.6)
    ResNeXt152  DCN v2                   Inference Time    4k per class   73.2 (+3.4)   56.4 (best single model)
    ResNeXt152  None                     Training Time     4k per class   72.4 (+2.6)*
  • 27. Method 3: Enhanced (Voting) NMS [6]: Non-Maximum Suppression for Model Ensembling. When multiple boxes from different models overlap, the resulting box earns added confidence scores.
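The idea on this slide can be sketched as a greedy NMS in which suppressed boxes "vote" for the box that survives. This is a minimal sketch under stated assumptions, not the exact rule from [6] or from the author's pipeline: `voting_nms`, `bonus_weight`, and the choice to add each other model's score at most once per kept box are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def voting_nms(detections, iou_threshold=0.5, bonus_weight=1.0):
    """Greedy NMS over pooled detections of one class in one image.

    detections: list of (box, score, model_id).
    Returns a list of (box, score); a kept box gains bonus_weight * score
    from overlapping boxes of other models (assumed voting rule).
    """
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        box, score, model_id = remaining.pop(0)
        voters = {model_id}
        survivors = []
        for other_box, other_score, other_model in remaining:
            if iou(box, other_box) > iou_threshold:
                # Overlapping box is suppressed; it votes once per model.
                if other_model not in voters:
                    score += bonus_weight * other_score
                    voters.add(other_model)
            else:
                survivors.append((other_box, other_score, other_model))
        remaining = survivors
        kept.append((box, score))
    return kept


# Hypothetical detections from two models, "a" and "b".
pooled = [
    ((0, 0, 10, 10), 0.9, "a"),
    ((1, 0, 11, 10), 0.8, "b"),
    ((50, 50, 60, 60), 0.7, "a"),
]
merged = voting_nms(pooled)
```

Because the summed score can exceed 1 in this sketch, it is only meaningful for ranking detections, which is all that average precision needs.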
  • 28. Result of 8-Model Ensembling
    Backbone    Deformable Convolutions  Parent Expansion  Data Size     val AP       private LB
    ResNeXt152  DCN v2                   Inference Time    4k per class  73.2 (+3.4)  56.4 (best single model)
    Ensemble of 8 models + NMS tuned                                                  60.23
    Leaderboard rank: ~13th place (best single model) → 6th place (ensemble)!
  • 29. Visualization Demo of the Best Single Model
  • 30. Visualization Demo of the Best Single Model
  • 31. Schwert’s Approach on Segmentation Track (11th Place) [3]: Train detection and segmentation independently. Figure: inference results using the detection model.
  • 32. 5 Take-Home Messages
  • 33. Take-Home Messages
    ・Kaggle is a wonderful platform where you can learn cutting-edge computer vision methods and implementations. Discussion with great Kagglers is always fun.
    ・Like research, developing (or surpassing) state-of-the-art methods is a tough but fun job.
    ・Choosing a reliable framework is a must for object detection competitions.
    ・Understand the past solutions and pick an Exclusive Feature that Apparently Contributes to the score (EFAC).
  • 34. References
    [1] Hiroto Honda, “The 6th Place Solution for the Open Images 2019 Object Detection Track,” presented at ICCVW 2019, https://hirotomusiker.github.io/files/schwert_open_images_6th_solution_v1.pdf
    [2] Hiroto Honda, “6th place solution,” discussion in Open Images 2019 Object Detection Track, https://www.kaggle.com/c/open-images-2019-object-detection/discussion/110953
    [3] Hiroto Honda, “11th place solution,” discussion in Open Images 2019 Instance Segmentation Track, https://www.kaggle.com/c/open-images-2019-instance-segmentation/discussion/111351
    [4] kivajok, 1st place writeup, https://storage.googleapis.com/openimages/web/challenge.html
    [5] Takuya Akiba et al., “PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track,” arXiv:1809.00778
    [6] Yuan Gao et al., “Solution for Large-Scale Hierarchical Object Detection Datasets with Incomplete Annotation and Data Imbalance,” arXiv:1810.06208
    [7] Saining Xie et al., “Aggregated Residual Transformations for Deep Neural Networks,” CVPR 2017
    [8] Xizhou Zhu et al., “Deformable ConvNets v2: More Deformable, Better Results,” CVPR 2019
    [9] Tsung-Yi Lin et al., “Feature Pyramid Networks for Object Detection,” CVPR 2017
    * All the photos used in this presentation were taken by Hiroto Honda.