PR-157: Best of both worlds: Human-machine collaboration for object annotation (CVPR 2015)
Olga Russakovsky, Li-Jia Li, Li Fei-Fei
Stanford University, Snapchat
visionNoob (이재원)
[paper] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15.pdf
[CVPR’15 poster] http://ai.stanford.edu/~olga/posters/cvpr15-poster.pdf
[supplements] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15_supp.pdf
[slides made by first author] http://ai.stanford.edu/~olga/slides/best_of_both_worlds_slides.pdf
Goal
Efficiently and accurately detect all objects in an image.
- Green boxes: R-CNN results (with NMS)
- Yellow boxes: ILSVRC dataset classes (but R-CNN fails)
- Pink boxes: objects outside the range of capabilities of object detectors
Related Works
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
Related Works
2. Better object detection
- Weakly supervised learning [42, 23, 52, 8, 24, 15]
- Active learning [32, 56] (see also [PR-119])
- Mining the web for object detection [8, 11, 15]
-> all aim to minimize human annotation
http://mpawankumar.info/tutorials/cvpr2013/index.html
Related Works
3. Cheaper manual annotation
Crowdsourcing techniques:
- Annotation games [57, 12, 30]
- Tricks to reduce the annotation search space [13, 4]
- Effective user interface design [50, 58]
- Making use of existing annotations [5]
- Making use of weak human supervision [26, 7]
- Accurately computing the number of required workers [46]
System Overview
input
1. image to label
2. Constraints
- utility
- precision
- and/or budget
output
- B_i: bounding box
- C_i: class label
- p_i: confidence (probability of the detection being correct)
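The output format above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' code: `Detection`, `expected_utility`, `expected_precision`, and `select_under_precision` are hypothetical names, and the sketch assumes every correct detection has utility 1, so the expected utility of a labeling is the sum of its confidences and the expected precision is their mean.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    box: Tuple[float, float, float, float]  # B_i as (x_min, y_min, x_max, y_max)
    label: str                              # C_i
    confidence: float                       # p_i, probability the detection is correct


def expected_utility(dets: List[Detection]) -> float:
    """Expected number of correct detections (assumes utility 1 per correct box)."""
    return sum(d.confidence for d in dets)


def expected_precision(dets: List[Detection]) -> float:
    """Expected fraction of correct detections; 0.0 for an empty set (a convention chosen here)."""
    return expected_utility(dets) / len(dets) if dets else 0.0


def select_under_precision(dets: List[Detection], min_precision: float) -> List[Detection]:
    """Keep the most confident detections while expected precision stays >= min_precision."""
    ranked = sorted(dets, key=lambda d: d.confidence, reverse=True)
    kept: List[Detection] = []
    total = 0.0
    for d in ranked:
        # Confidences are sorted in descending order, so the running mean never
        # increases; once it drops below the target we can stop.
        if (total + d.confidence) / (len(kept) + 1) < min_precision:
            break
        kept.append(d)
        total += d.confidence
    return kept
```

Under these assumptions, a precision constraint caps how many low-confidence boxes can be reported, while each reported box adds its confidence to the expected utility.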
System Overview
Constraints
- utility (U*)
- precision (P*)
- budget (B*): cost of human time
= 1 (in this paper)
Choose only two of the three.
Method
Model : Markov Decision Process (MDP)
State
Action
Transition probability
Reward
Optimization
Method
Model : Markov Decision Process (MDP)
State : set of object detections, with probabilities
cls(C | I, U)
det(B, C | I, U)
moreinst(B, X | I, U)
obj(B | I , U)
morecls(C | I, U)
Method
Model : Markov Decision Process (MDP)
State : set of object detections, with probabilities
Action : a question to ask humans
Transition probability : probability distribution over user responses
Reward : increase in estimated quality of labeling divided by the cost of actions
Optimization : 2-step lookahead search
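The MDP ingredients listed above can be illustrated with a toy 2-step lookahead. This is a sketch under stated assumptions, not the paper's formulation: the state is reduced to per-class presence probabilities, the only action is a yes/no question per class, the worker error rate `ERR`, the `quality` measure, and the per-question costs are all hypothetical, and the lookahead simply sums the per-step rewards over the horizon.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, float]  # class name -> P(class is present in the image)

ERR = 0.1  # hypothetical probability that a worker answers a yes/no question wrongly


def quality(state: State) -> float:
    """Assumed quality estimate: expected number of classes decided correctly at threshold 0.5."""
    return sum(max(p, 1.0 - p) for p in state.values())


def outcomes(state: State, cls: str) -> List[Tuple[float, State]]:
    """Transition model: (probability, next state) for each answer to 'is cls present?'."""
    p = state[cls]
    p_yes = p * (1 - ERR) + (1 - p) * ERR   # total probability of a "yes" answer
    post_yes = p * (1 - ERR) / p_yes        # Bayes' rule after "yes"
    post_no = p * ERR / (1 - p_yes)         # Bayes' rule after "no"
    return [(p_yes, {**state, cls: post_yes}),
            (1 - p_yes, {**state, cls: post_no})]


def reward(state: State, cls: str, cost: Dict[str, float]) -> float:
    """Expected increase in estimated quality divided by the cost of the question."""
    expected = sum(pr * quality(nxt) for pr, nxt in outcomes(state, cls))
    return (expected - quality(state)) / cost[cls]


def best_action(state: State, cost: Dict[str, float], depth: int = 2) -> Tuple[Optional[str], float]:
    """Depth-limited lookahead: return the question with the highest summed expected reward."""
    best, best_val = None, 0.0
    for cls in state:
        val = reward(state, cls, cost)
        if depth > 1:
            # Add the expected value of the best follow-up question in each outcome.
            val += sum(pr * best_action(nxt, cost, depth - 1)[1]
                       for pr, nxt in outcomes(state, cls))
        if best is None or val > best_val:
            best, best_val = cls, val
    return best, best_val
```

In this toy model the questions are independent, so different orderings of the same two questions can tie at depth 2; a richer state, such as the detection probabilities in the slides, would break such ties.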
Method
Computing the transition probability
[Equations not recovered. The slide annotations indicate that step t is related to step t-1 through the law of total probability, and that Bayes' rule (∝) is applied with a prior.]
Examples:
P(C | I)     // classifier
P(B, C | I)  // object detector
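The Bayes' rule step can be sketched on a single yes/no question. This is a minimal illustration and not the paper's actual equations: it assumes the classifier output P(C | I) serves as the prior, that workers answer independently, and that a worker is wrong with a hypothetical probability of 0.1; `bayes_update` is a name introduced here.

```python
from typing import Dict


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Posterior is proportional to likelihood times prior, normalized over all hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnormalized.values())
    return {h: v / z for h, v in unnormalized.items()}


# Prior from a classifier: P(C | I) = 0.3 that the class is present (made-up value).
prior = {"present": 0.3, "absent": 0.7}

# Likelihood of a worker answering "yes" under each hypothesis (assumed 10% error rate).
yes_likelihood = {"present": 0.9, "absent": 0.1}

after_one_yes = bayes_update(prior, yes_likelihood)
after_two_yes = bayes_update(after_one_yes, yes_likelihood)
```

Each additional independent "yes" answer reuses the previous posterior as the new prior, which is how beliefs at step t-1 carry over to step t in this sketch.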
Method
Multiple computer vision models
Method
Pre-computed human error rates
Experimental Setup
Dataset
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) detection dataset
- train set: 400K images
- validation set: 20K images (split into two sets, val1 and val2, for testing)
Computer vision models
1. Image classifier: 200-class CNN classifiers [Hoffman NIPS14]
2. Object detector: 200-class R-CNN [Girshick CVPR14]
3. Probability of object region: objectness measure [Alexe PAMI2012]
4. Probability of another instance of the same class: statistics from ILSVRC2014 val-DET data
5. Probability of another class in the image: statistics from ILSVRC2014 val-DET data
Experimental Results
The ILSVRC detection system:
Step 1: determining what object classes are present in the images
Step 2: asking users to draw bounding boxes
Conclusions
We presented a principled approach to unifying multiple inputs
from both computer vision and humans to label objects in images.
Discussion
