Taskonomy: Disentangling Task Transfer Learning
Feb. 14, 2019

Tatsuya Shirakawa (ABEJA, Inc.)
Self Introduction
ABEJA, Inc. (Researcher)
- Deep Learning (CV, Graph, NLP)
- Machine Learning
- Mathematical Optimization
- https://github.com/TatsuyaShirakawa
Tech blog: http://tech-blog.abeja.asia/
Blog topics: Poincaré Embeddings, Graph Convolution, Annotation, Hyperbolic
Today’s Paper
Exploring the Structure among Visual Tasks by Measuring Transferability
(Taskonomy = Task + Taxonomy)
http://taskonomy.stanford.edu/ http://taskonomy.vision/
+ Super Thorough Analysis
+ Potentially Promising Research Direction
+ Super Large Dataset with 26 Task Annotations
= Super Interesting CVPR 2018 Best Paper!
Paper Introduction
• Considering transferability among visual tasks
• Analysis of the transferability by means of AHP (Analytic Hierarchy Process)
• Combinatorial optimization for extracting the visual Taskonomy
• Massive dataset & experiments (4.5M images, 26 tasks, 47,886 GPU hours)
http://taskonomy.stanford.edu/
http://taskonomy.vision/
Disclaimer
The paper, slides, live demos, and web pages are great already.
So, in this talk, let's focus on understanding
- the motivation,
- the task,
- the method, and
- some experimental results
of Taskonomy.
In the following, I quote extensively from the slides at
https://storage.googleapis.com/taskonomy_slides/taskonomy_slides.html
Contents
• Motivation & Task
• Dataset

• Method

• Experiments
Zamir et al. Taskonomy 2018
Question: Vision problems - related or independent?
[Figure: an Image feeding four tasks (Depth, Normals, Layout, Objects); question marks between the tasks are then replaced by example relationships labeled "derivative" and "spatial prior".]
• Task relationships exist
• Can be computationally measured
• Tasks belong to a structured space
• Unified model for transfer learning
Goal — Task Transferability Structure
[Figure: transferability graph between two copies of the task dictionary. Tasks: Autoencoding, Object Class., Scene Class., Curvature, Denoising, Occlusion Edges, Egomotion, Cam. Pose (fix), 2D Keypoint, 3D Keypoint, Cam. Pose (nonfix), Matching, Reshading, Distance, Z-Depth, Normals, Layout, 2.5D Segm., 2D Segm., Semantic Segm., Vanishing Pts., plus Novel Task 1, Novel Task 2, Novel Task 3.]
Contents
• Motivation & Task

• Dataset
• Method

• Experiments
[Figure: outputs of the task-specific networks for one query image. Panels: Autoencoding, In-painting, Object Class., Scene Class., Jigsaw puzzle, Colorization, 2D Segm., 2.5D Segm., Semantic Segm., Vanishing Points, 2D Edges, 3D Edges, 2D Keypoints, 3D Keypoints, 3D Curvature, Image Reshading, Denoising, Cam. Pose (non-fixated), Cam. Pose (fixated), Triplet Cam. Pose, Room Layout, Point Matching, Eucl. Distance, Surface Normals.]
Object Class. top 5 predictions: sliding door; home theater, home theatre; studio couch, day bed; china cabinet, china closet; entertainment center
Scene Class. top 2 predictions: living room; television room
• Task Bank: 26 semantic, 2D, and 3D tasks
• Dataset: 4 million real images; each image has the GT label for all tasks
• Task-Specific Networks: 26 (one per task)
https://storage.googleapis.com/taskonomy_slides/taskonomy_slides.html
Dataset Creation
• Semantic tasks (e.g. scene classification)
=> "Knowledge distillation" from known methods
= predictions of trained models are used as labels
• Non-semantic tasks
=> Labels are computed programmatically from images captured by multiple RGB-D cameras
Contents
• Motivation & Task

• Dataset

• Method
• Experiments
Modeling
[Figure: overview of the four-step pipeline over the task dictionary. I: Task-Specific Modeling, II: Transfer Modeling, III: Normalization (AHP), IV: Taxonomy Extraction (BIP).]
I: Task-Specific Modeling
[Figure: step I of the pipeline. A task-specific network is trained on the training data to map the Image to the Source Output (normals).]
Same Image Resolution
Same Network Architecture
=> Same Latent Representation
II: Transfer Modeling
[Figure: step II of the pipeline, shown as a three-slide build. A transfer from a source task to a target task: Image, Source Output (normals) with its training data, and Target Output (curvature) with its training data.]
+ Higher Order Transfers (Beam Search)
III: Normalization
[Figure: Adjacency Matrix (pre-normalization)]
Adjacency Matrix W
The (i, j)-th element is the raw loss/evaluation when the i-th task is taken as the source task and the j-th task as the target task.
• Problematic (scale and space mismatch)
=> a proper normalization is needed
III: Normalization
[Figure: Adjacency Matrix (pre-normalization)]
Adjacency Matrix W_t (t: target task)
The (i, j)-th element is the ratio (a) / (b):
(a) number of images on which the i-th task transferred to target task t better than the j-th task did
(b) number of images on which the j-th task transferred to target task t better than the i-th task did
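The ratio (a) / (b) can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name `win_ratio_matrix`, the toy `losses` array, and the clipping guard `eps` (added so that no ratio divides by zero) are all hypothetical and not taken from the slides.

```python
import numpy as np

def win_ratio_matrix(losses, eps=1e-3):
    """Pairwise win-ratio matrix W_t for a single target task t.

    losses[i, k] is the target-task loss on image k when transferring
    from source i (lower is better). Hypothetical input layout.
    """
    losses = np.asarray(losses, dtype=float)
    # wins[i, j] = fraction of images on which source i beat source j
    wins = (losses[:, None, :] < losses[None, :, :]).mean(axis=2)
    # Guard against division by zero (an assumption of this sketch)
    wins = np.clip(wins, eps, 1.0 - eps)
    # Element (i, j) is (a) / (b) from the slide
    return wins / wins.T

# Toy data: 3 sources evaluated on 4 held-out images
losses = np.array([
    [0.1, 0.2, 0.1, 0.3],  # source 0
    [0.2, 0.1, 0.3, 0.4],  # source 1
    [0.5, 0.6, 0.7, 0.2],  # source 2
])
W_t = win_ratio_matrix(losses)
print(W_t)
```

Because only wins and losses are counted, the result does not depend on the scale of each task's loss, which is the point of the ordinal normalization.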
III: Normalization
[Figure: Adjacency Matrix (pre-normalization) next to Adjacency Matrix (post-normalization)]
Ordinal Normalization: Analytic Hierarchy Process (AHP)
AHP (Analytic Hierarchy Process)
Mathematical Background
Let us consider the ranking of n items {1, 2, …, n}.
Let A = (a_ij), where a_ij measures how superior the i-th item is to the j-th item.
Assume that A has the form a_ij = u_i / u_j for some positive vector u = (u_1, …, u_n).
Then,
(1) A has rank 1
(2) Au = nu, since (Au)_i = Σ_j (u_i / u_j) u_j = n u_i (u is, up to scaling, the unique eigenvector with a non-zero eigenvalue)
=> u: importance vector
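Both properties are easy to check numerically. The snippet below is a small sanity check, not part of the original talk; the importance vector `u` is a made-up example.

```python
import numpy as np

u = np.array([0.5, 0.3, 0.2])   # hypothetical importance vector
n = len(u)
A = u[:, None] / u[None, :]     # a_ij = u_i / u_j

rank = np.linalg.matrix_rank(A)           # property (1)
is_eigen = np.allclose(A @ u, n * u)      # property (2): Au = nu
print(rank, is_eigen)
```

In practice the measured matrix is only approximately of this form, so AHP takes the eigenvector of the largest eigenvalue as the importance vector.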
AHP for Taskonomy
1. For each target task t, take the win-lose ratio between (a) the transfer s_i -> t and (b) the transfer s_j -> t
2. Take the principal eigenvector (normalized to sum to 1) of the resulting matrix
3. Create the final matrix by stacking the principal eigenvectors
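Steps 2 and 3 might look roughly like the sketch below. It assumes the per-target win-ratio matrices are already available; the helper names `principal_eigenvector` and `affinity_matrix`, and the toy inputs, are hypothetical.

```python
import numpy as np

def principal_eigenvector(W):
    """Eigenvector of W for the eigenvalue of largest magnitude, scaled to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(W)
    v = np.real(eigvecs[:, np.argmax(np.abs(eigvals))])
    # Dividing by the sum also fixes the arbitrary sign returned by eig
    return v / v.sum()

def affinity_matrix(win_ratio_by_target):
    """Stack one normalized eigenvector per target as the columns of P,
    so that P[i, t] scores source i for target t."""
    return np.stack([principal_eigenvector(W) for W in win_ratio_by_target], axis=1)

# Toy win-ratio matrices for two targets and three sources, built from
# made-up importance vectors so the expected answer is known.
u0 = np.array([0.6, 0.3, 0.1])
u1 = np.array([0.2, 0.2, 0.6])
W0 = u0[:, None] / u0[None, :]
W1 = u1[:, None] / u1[None, :]

P = affinity_matrix([W0, W1])
print(P)
```

Each column of `P` sums to 1, so scores for different target tasks live on a comparable scale.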

IV: Taxonomy Extraction
• Taxonomical structure:
• Sparsified
• What are the best source tasks?
• What sources for each target?
• Out-of-dictionary tasks
• Maximize performance while constrained by some budget
[Figure: step IV of the pipeline, Taxonomy Extraction (BIP), over the task dictionary and Novel Tasks 1-3.]
Zamir et al. Taskonomy 2018
IV: Taxonomy Extraction
Source
tasks
Target
tasks
Dictionary= Sources ∪Targets
target-only (small data)source/targetsource-only
Introduction Method Results Summary
Autoencoding
Object Class.
Scene Class.
Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
Reshading
Distance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Autoencoding
Object Class.
Scene Class.
Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
Reshading
Distance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Zamir et al. Taskonomy 2018
IV: Taxonomy Extraction
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Source
tasks
Target
tasks
Dictionary= Sources ∪Targets
Introduction Method Results Summary
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
Autoencoding
Object Class.
Scene Class.Curvature
Denoising
2D Edges
Occlusion Edges
Egomotion
Cam. Pose (fix)
2D Keypoint
3D Keypoint
Cam. Pose (nonfix)
Matching
ReshadingDistance
Z-Depth
Normals
Layout
2.5D Segm.
2D Segm.
Semantic Segm.
Vanishing Pts.
Novel Task 1
Novel Task 2
Novel Task 3
target-only (small data)source/targetsource-only
Zamir et al. Taskonomy 2018
IV: Taxonomy Extraction
[Figure: candidate transfers from source tasks to target tasks across the task dictionary]
Dictionary = Sources ∪ Targets
Constraint I: only transfer from sources.
Constraint II: all targets are transferred to.
Constraint III: do not exceed the budget.
Binary Integer Program (BIP)
[Figure: extracted taxonomy over the task dictionary. Legend: target-only (small data), source/target, source-only]
Taxonomy Extraction
• Boolean Integer Programming (BIP)
  - Find the subgraph composed of tasks (nodes) and transfers (edges) that solves all tasks at minimum cost
Constraint I: if a transfer is in the subgraph, all of its source nodes (tasks) must be included too
Constraint II: each target task has exactly one incoming transfer
Constraint III: the supervision budget is not exceeded
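The three constraints can be illustrated on a toy instance. The sketch below is not the BIP solver used in the paper: it simply enumerates every source set within the budget, which is only feasible for a handful of tasks. The function name `extract_taxonomy`, the task names, and the integer scores are all hypothetical, and it assumes each transfer carries a score where higher is better (so "minimum cost" corresponds to maximum total score).

```python
from itertools import combinations

def extract_taxonomy(sources, targets, transfers, budget):
    """Exhaustive search over source sets (a stand-in for a real BIP solver).

    transfers maps (frozenset_of_sources, target) -> score, higher is better.
    Returns (best_total_score, chosen_sources, {target: sources_used}).
    """
    best_score, best_sources, best_plan = float("-inf"), None, None
    # Constraint III: never select more sources than the supervision budget.
    for k in range(1, budget + 1):
        for chosen in combinations(sorted(sources), k):
            chosen = frozenset(chosen)
            plan, total = {}, 0
            for target in targets:
                # Constraint I: a transfer is usable only if all of its
                # source tasks are in the selected set.
                usable = [(score, srcs)
                          for (srcs, tgt), score in transfers.items()
                          if tgt == target and srcs <= chosen]
                if not usable:
                    break  # this target cannot be reached: infeasible set
                # Constraint II: exactly one incoming transfer per target.
                score, srcs = max(usable, key=lambda pair: pair[0])
                plan[target] = srcs
                total += score
            else:
                if total > best_score:
                    best_score, best_sources, best_plan = total, chosen, plan
    return best_score, best_sources, best_plan

# Hypothetical scores, chosen only to make the example easy to follow.
transfers = {
    (frozenset({"normals"}), "layout"): 8,
    (frozenset({"depth"}), "layout"): 5,
    (frozenset({"normals", "depth"}), "layout"): 10,  # higher-order transfer
    (frozenset({"normals"}), "reshading"): 7,
    (frozenset({"depth"}), "reshading"): 9,
    (frozenset({"edges"}), "reshading"): 2,
}
sources = {"normals", "depth", "edges"}
targets = ["layout", "reshading"]

score1, sources1, plan1 = extract_taxonomy(sources, targets, transfers, budget=1)
score2, sources2, plan2 = extract_taxonomy(sources, targets, transfers, budget=2)
print(score1, sorted(sources1))
print(score2, sorted(sources2))
```

With a budget of one source, a single task has to serve both targets; raising the budget to two lets each target pick a better-matched transfer, which is the trade-off the budget constraint controls.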
Contents
• Motivation & Task

• Dataset

• Method

• Experiments
Zamir et al. Taskonomy 2018
Experimental Results
• 26 Task-Specific Networks
• 3000 Transfer Networks
• 47,829 GPU hours
• Transfer training data: 8x-120x less than for task-specific networks
(The “Normals” = derivative of “Depth” relationship looks quite strong, but many of these tasks can be computed directly once a 3D reconstruction is done …)
[Figures: transfer results reported as Gain and Quality]
 Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., 14th
Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., 14th

  • 1. Taskonomy: Disentangling Task Transfer Learning 2019 Feb. 14th Tatsuya Shirakawa (ABEJA, Inc.)
  • 2. Self Introduction ABEJA, Inc. (Researcher) - Deep Learning (CV, Graph, NLP, ) - Machine Learning - Mathematical Optimization - https://github.com/TatsuyaShirakawa Tech blog http://tech-blog.abeja.asia/ Poincaré Embeddings Graph Convolution Annotation Hyperbolic
  • 3. Today’s Paper: Exploring the Structure among Visual Tasks by Measuring Transferability (Taskonomy = Task + Taxonomy). http://taskonomy.stanford.edu/ http://taskonomy.vision/ Super Large Dataset with 26 Task Annotations + Super Thorough Analysis + Potentially Promising Research Direction = Super Interesting CVPR 2018 Best Paper!
  • 4. Paper Introduction • Considering transferability among visual tasks • Analysis of the transferability by means of AHP (Analytic Hierarchy Process) • Combinatorial Optimization for extracting visual Taskonomy • Massive Dataset & Experiments 
 (4.5M images, 26 tasks, 47,886 GPU hours) http://taskonomy.stanford.edu/
  • 6. Disclaimer The paper, slides, live demos, and web pages are great already.
 So, in this talk, let’s focus on understanding
 - the motivation,
 - the task,
 - the method, and
 - some experimental results
 of Taskonomy.
 
 In the following, I extensively quote some slides from
 https://storage.googleapis.com/taskonomy_slides/taskonomy_slides.html
  • 7. Contents • Motivation & Task • Dataset • Method • Experiments
  • 8. Zamir et al. Taskonomy 2018. Question: Vision problems - related or independent? [Figure: Image, Depth, Normals, Layout, Objects, with the relationships between them marked "?"]
  • 9. Zamir et al. Taskonomy 2018. Question: Vision problems - related or independent? • Can be computationally measured • Unified model for transfer learning • Task relationships exist • Tasks belonging to a structured space [Figure: Image, Depth, Normals, Layout, Objects; edge labels: derivative, spatial prior]
  • 10. Goal: Task Transferability Structure [Figure: task graph over Autoencoding, Object Class., Scene Class., Curvature, Denoising, Occlusion Edges, Egomotion, Cam. Pose (fix), 2D Keypoint, 3D Keypoint, Cam. Pose (nonfix), Matching, Reshading, Distance, Z-Depth, Normals, Layout, 2.5D Segm., 2D Segm., Semantic Segm., Vanishing Pts., and Novel Tasks 1-3]
  • 11. Contents • Motivation & Task • Dataset • Method • Experiments
  • 12. Zamir et al. Taskonomy 2018. • Task Bank: 26 semantic, 2D, and 3D tasks • Dataset: 4 million real images; each image has the ground-truth (GT) label for all tasks • Task-Specific Networks: 26 x [Figure: a query image with the output of each task: Autoencoding, In-painting, Object Class. (top 5 predictions: sliding door; home theater, home theatre; studio couch, day bed; china cabinet, china closet; entertainment center), Scene Class. (top 2 predictions: living room; television room), Jigsaw puzzle, Colorization, 2D Segm., 2.5D Segm., Semantic Segm., Vanishing Points, 2D Edges, 3D Edges, 2D Keypoints, 3D Keypoints, 3D Curvature, Image Reshading, Denoising, Cam. Pose (non-fixated), Cam. Pose (fixated), Triplet Cam. Pose, Room Layout, Point Matching, Eucl. Distance, Surface Normals] https://storage.googleapis.com/taskonomy_slides/taskonomy_slides.html
  • 13. Dataset Creation • Semantic tasks (e.g. scene classification)
 => “Knowledge distillation” from known methods
 = predictions of trained models are used as labels • Non-semantic labels
 => Programmatically computed from images captured by multiple RGB-D cameras
  • 14. Contents • Motivation & Task • Dataset • Method • Experiments
  • 15. Zamir et al. Taskonomy 2018. Modeling, in four steps: I: Task-Specific Modeling, II: Transfer Modeling, III: Normalization (AHP), IV: Taxonomy Extraction (BIP) [Figure: task graphs over the task dictionary at each of the four steps]
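Step III, the AHP (Analytic Hierarchy Process) normalization, can be sketched as follows, assuming the formulation in the Taskonomy paper: for one target task, build a pairwise matrix of win rates between sources, clip it, turn it into "how many times better" ratios, and take the principal eigenvector as the normalized affinity of each source. The function name `ahp_affinities`, the use of plain power iteration, and the example win rates are illustrative assumptions, not taken from the slides.

```python
def ahp_affinities(win_rate, iters=200):
    """Normalize pairwise win rates into one affinity score per source.

    win_rate[i][j] is the fraction of test images on which source i
    transfers to the target better than source j (hypothetical input).
    """
    n = len(win_rate)

    def clip(value):
        # Clipping avoids division by zero for one-sided comparisons.
        return min(max(value, 0.001), 0.999)

    w = [[clip(win_rate[i][j]) for j in range(n)] for i in range(n)]
    # ratio[i][j]: how many times better source i is than source j.
    ratio = [[w[i][j] / w[j][i] for j in range(n)] for i in range(n)]

    # Power iteration: for a positive matrix this converges to the
    # principal eigenvector, normalized here to sum to 1.
    v = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(ratio[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        v = [x / total for x in nxt]
    return v

# Hypothetical win rates for three sources on one target task.
win_rate = [
    [0.5, 0.8, 0.9],
    [0.2, 0.5, 0.7],
    [0.1, 0.3, 0.5],
]
affinities = ahp_affinities(win_rate)
print(affinities)
```

Working with win rates rather than raw losses is what makes the scores comparable across tasks whose losses live on different scales.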
  • 16. Zamir et al. Taskonomy 2018. I: Task-Specific Modeling [Figure: Image, Source task network, Output (normals), Training data]
 Same Image Resolution
 Same Network Architecture
 => Same Latent Representation
  • 17. Zamir et al. Taskonomy 2018 II: Transfer Modeling
 [Figure: transfer from a source task to a target task over the same task dictionary. Panel labels: Image, Training data, Source Output (normals), Target Output (Curvature).]
  • 18. Zamir et al. Taskonomy 2018 II: Transfer Modeling
 [Figure: transfers to a target task over the same task dictionary. Panel labels: Image, Training data, Target Output (Curvature).]
  • 19. Zamir et al. Taskonomy 2018 II: Transfer Modeling
 [Figure: transfers to a target task over the same task dictionary. Panel labels: Image, Training data, Target Output (Curvature).]
 + Higher Order Transfers (Beam Search)
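One way to read "Higher Order Transfers (Beam Search)" is that, instead of training a transfer for every combination of k sources, only the sources with the best first-order transfers are kept and combined. The sketch below illustrates that reading only. The function name `candidate_transfers`, the beam width, and the loss values are hypothetical and do not come from the slides.

```python
from itertools import combinations

def candidate_transfers(first_order_loss, order, beam=5):
    """Return source sets of size `order` built from the `beam` best
    first-order sources for one fixed target (lower loss = better)."""
    ranked = sorted(first_order_loss, key=first_order_loss.get)
    kept = ranked[:beam]  # prune weak sources before combining
    return [tuple(c) for c in combinations(kept, order)]

# Made-up first-order losses for a single target task (e.g. Curvature).
losses = {
    "Normals": 0.10, "Reshading": 0.15, "Z-Depth": 0.20,
    "2D Edges": 0.40, "Autoencoding": 0.55, "Denoising": 0.60,
}
pairs = candidate_transfers(losses, order=2, beam=3)
print(pairs)
```

With a beam of 3, only three second-order candidates are generated instead of the fifteen pairs that exhaustive enumeration over six sources would give.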
  • 20. Zamir et al. Taskonomy 2018 III: Normalization
 [Figure: Adjacency Matrix (pre-normalization)]
 Adjacency Matrix W
 The (i, j)-th element is the raw loss/evaluation obtained when the i-th task is the source and the j-th task is the target.
 • Problematic: the raw values of different target tasks are not comparable (scale and space mismatch)
 => a proper normalization is needed
  • 21. Zamir et al. Taskonomy 2018 III: Normalization
 [Figure: Adjacency Matrix (pre-normalization)]
 Adjacency Matrix W_t (t: target task)
 The (i, j)-th element is the ratio (a) / (b), where
 (a) = number of images on which the i-th task transferred to target task t better than the j-th task did
 (b) = number of images on which the j-th task transferred to target task t better than the i-th task did
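The definition of W_t above can be sketched as follows, assuming per-image losses are available for every transfer into the target t. The function name `win_ratio_matrix`, the `eps` guard against division by zero, and the toy numbers are assumptions of this sketch, not taken from the slides.

```python
def win_ratio_matrix(per_image_loss, eps=1e-3):
    """per_image_loss[i][k]: loss of the transfer source_i -> t on image k
    (lower is better). Returns W_t with
    W_t[i][j] = (#images where i beats j) / (#images where j beats i)."""
    n = len(per_image_loss)
    W = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            pairs = list(zip(per_image_loss[i], per_image_loss[j]))
            wins_i = sum(a < b for a, b in pairs)
            wins_j = sum(b < a for a, b in pairs)
            # eps avoids dividing by zero when one source never wins
            # (an assumption of this sketch).
            W[i][j] = max(wins_i, eps) / max(wins_j, eps)
    return W

# Made-up losses: 3 sources, 4 images.
losses = [
    [0.1, 0.2, 0.3, 0.1],  # source 0
    [0.2, 0.1, 0.4, 0.3],  # source 1
    [0.5, 0.6, 0.2, 0.9],  # source 2
]
W_t = win_ratio_matrix(losses)
print(W_t[0][1], W_t[1][0])
```

Because only wins and losses are counted, the entries do not depend on the scale of the target task's loss, which is what makes matrices for different targets comparable.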
  • 22. Zamir et al. Taskonomy 2018 III: Normalization
 [Figure: Adjacency Matrix (pre-normalization) next to Adjacency Matrix (post-normalization)]
 Ordinal Normalization: Analytic Hierarchy Process (AHP)
  • 23. AHP (Analytic Hierarchy Process): Mathematical Background
 Consider the ranking of n items {1, 2, …, n}. Let A = (a_ij), where a_ij measures how much the i-th item is superior to the j-th item.
 Assume the matrix A = (a_ij) has the form a_ij = u_i / u_j for some positive vector u = (u_1, …, u_n). Then,
 
 (1) A has rank 1
 (2) Au = nu, i.e. u is an eigenvector with eigenvalue n, the only non-zero eigenvalue of A (u is unique up to scale) => u: importance vector
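Both statements follow directly from the assumed form a_ij = u_i / u_j. A short derivation, using only the symbols defined on this slide (v is introduced here only as a helper):

```latex
A = u\,v^{\top} \quad\text{with}\quad v_j = \frac{1}{u_j}
\qquad\Longrightarrow\qquad \operatorname{rank}(A) = 1,
\\[4pt]
(Au)_i \;=\; \sum_{j=1}^{n} a_{ij}\,u_j
       \;=\; \sum_{j=1}^{n} \frac{u_i}{u_j}\,u_j
       \;=\; n\,u_i
\qquad\Longrightarrow\qquad Au = n\,u .
```

Since a rank-1 matrix has at most one non-zero eigenvalue, n is that eigenvalue and u spans its eigenspace.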
  • 24. AHP for Taskonomy
 1. For each target task t, take the win-lose ratio between
 (a) transfer s_i -> t and (b) transfer s_j -> t
 
 2. Take the principal eigenvector (the "1st principal component" of the slide), normalized to sum to 1, of the resulting matrix
 3. Create the final matrix by stacking these vectors over all target tasks
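The three steps can be sketched in a few lines. This is a minimal illustration that uses plain power iteration to get the leading eigenvector of each win-lose ratio matrix. The function names, the row-per-target orientation of the stacked matrix, and the toy matrices are assumptions of this sketch.

```python
def principal_eigenvector(A, iters=200):
    """Power iteration for a positive square matrix A.
    The returned vector is normalized to sum to 1."""
    n = len(A)
    v = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        v = [x / s for x in w]
    return v

def ahp_affinity(win_ratio_by_target):
    """Stack one importance vector per target:
    result[t][i] = normalized score of source i for target t."""
    return [principal_eigenvector(W_t) for W_t in win_ratio_by_target]

# Made-up win-lose ratio matrices for two targets and three sources.
W_a = [[1.0, 3.0, 3.0],
       [1 / 3, 1.0, 3.0],
       [1 / 3, 1 / 3, 1.0]]
u = [0.6, 0.3, 0.1]
W_b = [[u[i] / u[j] for j in range(3)] for i in range(3)]  # a_ij = u_i / u_j

P = ahp_affinity([W_a, W_b])
print(P)
```

For the perfectly consistent matrix `W_b` the procedure recovers `u` itself, which matches the rank-1 argument on the previous slide. For `W_a` it returns a ranking in which source 0 scores highest.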

  • 25. Zamir et al. Taskonomy 2018 IV: Taxonomy Extraction
 • Taxonomical structure:
 • Sparsified
 • What are best source tasks
 • What sources for each target
 • Out-of-dictionary tasks
 • Maximize performance while constrained by some budget
 [Figure: pipeline overview over the task dictionary, including Novel Tasks 1-3]
  • 26. Zamir et al. Taskonomy 2018 IV: Taxonomy Extraction
 [Figure: Source tasks and Target tasks drawn over the task dictionary, including Novel Tasks 1-3; Dictionary = Sources ∪ Targets. Legend: source-only, source/target, target-only (small data).]
  • 27. Zamir et al. Taskonomy 2018 IV: Taxonomy Extraction
 [Figure: Source tasks and Target tasks drawn over the task dictionary, including Novel Tasks 1-3; Dictionary = Sources ∪ Targets. Legend: source-only, source/target, target-only (small data).]
  • 28. Zamir et al. Taskonomy 2018 IV: Taxonomy Extraction
 [Figure: Source tasks and Target tasks drawn over the task dictionary, including Novel Tasks 1-3; Dictionary = Sources ∪ Targets. Legend: source-only, source/target, target-only (small data).]
 Binary Integer Program (BIP)
 Constraint I: only transfer from sources.
 Constraint II: all targets are transferred to.
 Constraint III: not exceed budget.
  • 29. Taxonomy Extraction
 • Boolean Integer Programming (BIP)
 Find the subgraph, composed of tasks (nodes) and transfers (edges), that solves all target tasks at minimum cost
 Constraint I
 if a transfer is in the subgraph, all of its source nodes/tasks must be included too
 Constraint II
 each target task has exactly one incoming transfer
 Constraint III
 the supervision budget is not exceeded
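To make the three constraints concrete, here is a toy version of the selection problem solved by exhaustive enumeration rather than by a BIP solver. It assumes first-order transfers only, a budget expressed as the maximum number of source tasks, and an objective of maximizing total normalized affinity. The function name and all scores are made up for illustration.

```python
from itertools import combinations

def extract_taxonomy(affinity, targets, budget):
    """affinity[s][t]: normalized score of the transfer s -> t (higher is better).
    Returns (best_total, chosen_sources, {target: source})."""
    sources = list(affinity)
    best = (float("-inf"), (), {})
    for k in range(1, budget + 1):               # Constraint III: at most `budget` sources
        for chosen in combinations(sources, k):  # Constraint I: transfer only from chosen sources
            # Constraint II: every target gets exactly one incoming transfer.
            assign = {t: max(chosen, key=lambda s: affinity[s][t]) for t in targets}
            total = sum(affinity[assign[t]][t] for t in targets)
            if total > best[0]:
                best = (total, chosen, assign)
    return best

# Made-up affinities for three candidate sources and three targets.
affinity = {
    "Normals":       {"Curvature": 0.50, "Layout": 0.30, "Novel Task 1": 0.20},
    "Object Class.": {"Curvature": 0.10, "Layout": 0.25, "Novel Task 1": 0.60},
    "Autoencoding":  {"Curvature": 0.15, "Layout": 0.10, "Novel Task 1": 0.15},
}
targets = ["Curvature", "Layout", "Novel Task 1"]
total, chosen, assign = extract_taxonomy(affinity, targets, budget=2)
print(chosen, assign)
```

Enumeration is only feasible for a handful of tasks. The integer-programming formulation on the slide expresses the same choice with binary variables for nodes and edges, so that a solver can handle the full dictionary and higher-order transfers.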
  • 30. Contents • Motivation & Task • Dataset • Method • Experiments
  • 31. Zamir et al. Taskonomy 2018 Experimental Results
 • 26 Task-Specific Networks
 • 3000 Transfer Networks
 • 47,829 GPU hours
 • Transfers training data: 8x-120x less than task-specific
  • 32.
  • 33. (The relation “Normals” = derivative of “Depth” looks quite strong, but many tasks can be computed once 3D reconstruction is done …)