Monocular Depth Cues in
Computer Vision Applications
Diego Cheda
Thesis Advisors:
Dr. Daniel Ponsa
Dr. Antonio López
December 14, 2012
We don’t need two eyes to perceive depth.
[Edgar Muller]
Motivation
Human depth cues
There are different sources of information supporting depth
perception.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 3/64
Motivation
Depth estimation from a single image
Prior information
Our world is structured / In an abstract world
[Paintings by René Magritte: Golconda, The Blank Check, The Listening Room, Personal Values]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 4/64
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Objectives
• Coarse depth map estimation
simple and low-cost
low-level features based on pictorial cues
• Increasing the performance of many applications
Egomotion estimation
Background estimation
Pedestrian candidate generation
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 7/64
Objectives
Segmenting an image into depth categories
• Near
Depth is usually estimated by using a stereo configuration.
• Very-far
The effect of camera translation at faraway distances is
imperceptible.
• Medium and Far
Interesting for potential applications.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 8/64
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Coarse depth map estimation
Method
Pipeline of our approach
• Multiclass classification problem
• Supervised learning approach
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 10/64
Coarse depth map estimation
Method
Ground truth dataset
• Set of urban outdoor images
Saxena et al.: 400 images for training and 134 for testing.
• Each image has an associated depth map acquired by a laser
scanner.
Thresholding the depth map yields the ground-truth depth categories (a sketch follows below).
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 11/64
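As a worked illustration of the thresholding step, here is a minimal sketch; the 30/50/70 m cut points are taken from the learning approach described later, and the function name is hypothetical:

import numpy as np

def depth_to_categories(depth_m, cuts=(30.0, 50.0, 70.0)):
    # Quantize a metric depth map (meters) into the four labels:
    # 0 = near, 1 = medium, 2 = far, 3 = very-far.
    return np.digitize(depth_m, bins=cuts)

# Toy 2x2 laser depth map in meters.
depth = np.array([[12.0, 45.0],
                  [66.0, 95.0]])
print(depth_to_categories(depth))  # [[0 1]
                                   #  [2 3]]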
Coarse depth map estimation
Method
Regions
Superpixels vs. regular grid
✓ Superpixels preserve intra-region similarities.
× Superpixels are time consuming.
× Regular grids merge information from different regions.
✓ Regular grids are computed only once per camera configuration.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 12/64
Coarse depth map estimation
Method
Features
• Monocular pictorial cues are the predominant depth cues beyond
30 m.
• Low-level visual features represent texture, relative height, and
atmospheric scattering.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 13/64
Coarse depth map estimation
Method
Features - Texture
Paris Street; Rainy Day - Gustave Caillebotte
At greater distances, texture patterns become finer and appear
smoother.
To capture textures we use
• Weibull distribution
• Gabor filters
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 14/64
Coarse depth map estimation
Method
Features - Texture: Weibull distribution
• Compact representation: the β (scale) and γ (shape) parameters of
the fitted distribution
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 15/64
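As an aside, a minimal sketch of extracting the two Weibull parameters as a texture feature. It assumes the distribution is fitted to the region's gradient-magnitude histogram (a common choice for this cue; the slide does not specify the estimator), with Sobel gradients and scipy's fitter as illustrative stand-ins:

import numpy as np
from scipy.ndimage import sobel
from scipy.stats import weibull_min

def weibull_texture(gray_region):
    # Gradient magnitudes of the (grayscale, float) region.
    gx = sobel(gray_region.astype(float), axis=1)
    gy = sobel(gray_region.astype(float), axis=0)
    mag = np.hypot(gx, gy).ravel()
    mag = mag[mag > 0]  # Weibull support is (0, inf)
    # Two-parameter fit: gamma (shape) and beta (scale), location fixed at 0.
    gamma, _, beta = weibull_min.fit(mag, floc=0)
    return beta, gamma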
Coarse depth map estimation
Method
Features - Texture: Gabor filter
Images
Gabor filter responses
• Distinguish smooth from textured regions
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 16/64
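A sketch of a Gabor feature bank in this spirit, using scikit-image's gabor filter; the particular frequencies and orientations are assumptions, not necessarily the thesis's configuration:

import numpy as np
from skimage.filters import gabor

def gabor_energies(gray, freqs=(0.1, 0.2, 0.4),
                   thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    # Mean magnitude of the complex Gabor response per (frequency,
    # orientation): low energy in smooth regions, high in textured ones.
    feats = []
    for f in freqs:
        for t in thetas:
            real, imag = gabor(gray, frequency=f, theta=t)
            feats.append(np.hypot(real, imag).mean())
    return np.array(feats)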
Coarse depth map estimation
Method
Features - Relative height
When an object is near the
horizon, it is perceived as distant.
To capture relative height we use
• Location: x and y coordinates
in the image
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 17/64
Coarse depth map estimation
Method
Features - Location
[Average depth over the ground truth as a function of image location,
for near / medium / far]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 18/64
Coarse depth map estimation
Method
Features - Atmospheric scattering
The Virgin and Child with St. Anne - Leonardo da Vinci
The farther away objects are, the hazier and less detailed they
appear compared with closer ones.
To capture atmospheric
scattering we use
• RGB histogram
• HSV histogram
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 19/64
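A minimal sketch of these per-region color histograms; the bin count and normalization are assumptions:

import numpy as np
from matplotlib.colors import rgb_to_hsv

def color_histograms(rgb_region, bins=8):
    # rgb_region: H x W x 3 array with values in [0, 1].
    feats = []
    for img in (rgb_region, rgb_to_hsv(rgb_region)):
        for c in range(3):  # per-channel 1-D histograms
            h, _ = np.histogram(img[..., c], bins=bins,
                                range=(0.0, 1.0), density=True)
            feats.append(h)
    # 2 color spaces x 3 channels x `bins` bins.
    return np.concatenate(feats)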
Coarse depth map estimation
Method
Learning approach
One-vs-All
• Binary classifiers
• Training one classifier per class (near, medium, far, and very-far)
• Low performance due to the small number of positive examples for
the medium and far classes.
Our approach
• Training three classifiers: > 30, > 50, > 70 m (a sketch follows
below).
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 20/64
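A sketch of this three-classifier cascade; the linear SVM is an assumed choice (the slide does not name the classifier), and the vote count is one simple way to map the three binary outputs to the four categories:

import numpy as np
from sklearn.svm import LinearSVC

class DepthCascade:
    # Three binary classifiers (depth > 30, > 50, > 70 m); summing their
    # positive votes yields 0 = near, 1 = medium, 2 = far, 3 = very-far.
    def __init__(self, cuts=(30.0, 50.0, 70.0)):
        self.cuts = cuts
        self.clfs = [LinearSVC() for _ in cuts]

    def fit(self, X, depth_m):
        for clf, c in zip(self.clfs, self.cuts):
            clf.fit(X, depth_m > c)  # binary target: beyond this cut or not
        return self

    def predict(self, X):
        votes = np.stack([clf.predict(X) for clf in self.clfs])
        return votes.sum(axis=0)  # count of exceeded thresholds

Counting votes keeps the label well defined even when the three classifiers disagree non-monotonically.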
Coarse depth map estimation
Method
Training
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 21/64
Coarse depth map estimation
Method
Testing
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 22/64
Coarse depth map estimation
Method
Testing
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 23/64
Coarse depth map estimation
Method
Inference
• CRF
• Combining probabilities obtained from classifiers
• Associating neighboring regions belonging to the same depth
category.
• Graph cuts guarantee a global maximum-likelihood result.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 24/64
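The thesis minimizes this labeling energy with graph cuts; as a lightweight stand-in for illustration, the sketch below minimizes the same kind of unary-plus-Potts energy with iterated conditional modes (ICM), which reaches only a local optimum:

import numpy as np

def icm_potts(unary, lam=1.0, iters=5):
    # unary: H x W x K per-region costs (e.g. negative log-probabilities
    # from the classifiers); Potts smoothness adds `lam` per unequal
    # 4-neighbor pair.
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)
    for _ in range(iters):
        for y in range(H):
            for x in range(W):
                costs = unary[y, x].copy()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        costs += lam * (np.arange(K) != labels[ny, nx])
                labels[y, x] = costs.argmin()
    return labels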
Coarse depth map estimation
Experimental results
Performance measurement
• Measure of performance: Jaccard index,
J = TP / (TP + FP + FN),
which measures the level of agreement with respect to an ideal
classification result.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 25/64
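Computing the per-class Jaccard index from label maps is direct; a minimal sketch:

import numpy as np

def jaccard(pred, gt, k):
    # TP / (TP + FP + FN) for class k, i.e. intersection over union
    # of the predicted and ground-truth masks.
    p, g = (pred == k), (gt == k)
    union = np.logical_or(p, g).sum()
    return np.logical_and(p, g).sum() / union if union else 1.0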
Coarse depth map estimation
Experimental results
Different region groupings
Performance of our method using different oversegmentation
configurations: a regular grid (10×10, 15×15, 20×20) versus
TurboPixels superpixels (∼200, ∼400, ∼800 regions).

Jaccard index:
Algorithm      20×20    15×15    10×10
Superpixels    0.3623   0.3567   0.3561
Grid           0.3586   0.3602   0.3570

• The best performing configuration uses superpixels.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 26/64
Coarse depth map estimation
Experimental results
Comparison w.r.t. state-of-the-art
Saxena et al.
• A more challenging goal: a photo-realistic 3D model.
• For each superpixel and its neighbors: features for occlusions;
geometric, statistical, and spatial information; and textures, at
multiple spatial scales.
• Inference methods with a high computational cost.
• MRF
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 27/64
Coarse depth map estimation
Experimental results
Comparison w.r.t. state-of-the-art
Using a remarkably smaller number of low-level features (64 vs. 646).
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 28/64
Coarse depth map estimation
Experimental results
Relevance of visual features
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 29/64
Coarse depth map estimation
Experimental results
[Qualitative results: image, laser depth map, Saxena et al., ours]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 30/64
Coarse depth map estimation
Conclusions
We have presented
• A supervised learning approach to segment an image
according to depth categories.
• Our algorithm uses a reduced number of low-level visual
features, which are based on monocular pictorial cues.
Our results show
• Monocular cues are useful for depth estimation.
• Close and distant regions are well segmented by our approach.
• Regions at medium distances are more difficult to segment.
• On average, our method outperforms Saxena et al.'s method.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 31/64
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Egomotion estimation
Motivation
Egomotion estimation
Estimating the vehicle's position is a key component of many ADAS
applications:
Autonomous navigation
Adaptive cruise control
Lane change assistance
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 33/64
Egomotion estimation
Problem definition
Egomotion problem
Determining the changes in the 3D rigid camera position and
orientation.
• Camera motion is described as a 3D rigid motion:
p_t = R_t p_0 + t_t
• Six degrees of freedom (DOF).
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 34/64
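A tiny numeric illustration of this motion model; the yaw-only rotation and the numbers are arbitrary (in general there are 3 rotational plus 3 translational DOF):

import numpy as np

def yaw(a):
    # Rotation about the camera's vertical axis, one of the rotational DOF.
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

R_t = yaw(np.deg2rad(5.0))        # rotation at time t
t_t = np.array([0.0, 0.0, 1.5])   # translation: 1.5 m forward
p0 = np.array([2.0, 0.0, 40.0])   # a 3D point in the initial camera frame
p_t = R_t @ p0 + t_t              # p_t = R_t p_0 + t_t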
Egomotion estimation
Goal
Distant regions behave as a plane at infinity
Properties
• They remain at the same image coordinates under camera translation.
• They are affected only by camera rotation.
Goal
• Identify distant regions in the image to estimate vehicle rotation
decoupled from vehicle translation.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 35/64
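The geometric fact behind this goal: for points at (or near) the plane at infinity, image motion reduces to x' ∝ K R K⁻¹ x, so matched viewing rays satisfy r' ≈ R r and the rotation can be recovered without knowing the translation. A minimal sketch via orthogonal Procrustes, which is not necessarily the thesis's exact estimator:

import numpy as np

def rotation_from_distant_rays(rays0, rays1):
    # rays0, rays1: N x 3 unit viewing directions (K^-1 x, normalized) of
    # matched very distant points, so rays1[i] ~ R @ rays0[i].
    H = rays1.T @ rays0
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # det(R) = +1
    return U @ D @ Vt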
Egomotion estimation
Algorithm overview
Egomotion estimation based on distant points / regions
× Distant points are hard to track, since they lie in low-textured
regions.
The distant-region algorithm makes maximal use of distant information.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 36/64
Egomotion estimation
Experimental results
Datasets
• Karlsruhe dataset: 8 sequences
• More than 8000 frames (∼3 km).
• GT: INS Sensor.
• Stereo depth maps
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 37/64
Egomotion estimation
Experimental results
Evaluation of our distant regions segmentation
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 38/64
Egomotion estimation
Experimental results
Comparison with other approaches
• The five-point algorithm (5pts) by Nister.
• The Burschka et al. method (RANSAC).
• The stereo-based algorithm by Kitt et al. (as a baseline).
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 39/64
Egomotion estimation
Experimental results
[Plots: rotation estimation performance and trajectory estimation performance]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 40/64
Egomotion estimation
Experimental results
Yaw angle comparison
GT (INS sensor)  DR (distant regions)  DP (distant points)
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 41/64
Egomotion estimation
Experimental results
Trajectory results
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 42/64
Egomotion estimation
Conclusions
In this section, we have
• Proposed two novel monocular egomotion methods based on
tracking distant points and distant regions.
Our results show
• Rotations are accurately estimated, since distant regions
provide strong indicators of camera rotation.
• Our approach outperforms the other state-of-the-art monocular
methods considered.
• Performance is comparable to that of the considered stereo
algorithm.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 43/64
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Background estimation
Problem definition
Background estimation
Automatically remove transient and moving objects from a set of
images with the aim of obtaining an occlusion-free background
image of the scene.
Background model
• Represents objects whose distance to the camera is maximal.
• Background objects are stationary.
Goal
• Identify close regions to penalize deviations from our background
model.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 45/64
Background estimation
Experimental results
Example of labeling
[Figure: original images; close/distant region labeling; our background result]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 46/64
Background estimation
Method
Energy function
E(f) = Σ_{p∈P} D_p(f_p) + Σ_{(p,q)∈N} V_{p,q}(f_p, f_q)
       (data term)         (smoothness term)
Data term Penalizes deviations from our background model, taking
into account color, motion, and depth:
D_p(f_p) = α D^S_p(f_p) + β D^M_p(f_p) + γ D^P_p(f_p)
• D^S: color variations over short time intervals
• D^M: moving objects, detected via motion boundaries
• D^P: close objects, identified using our approach
Smoothness term Penalizes the intensity differences between
neighboring regions, giving a higher cost when images do not
match well.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 47/64
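A sketch of how the data term could be assembled and used; the three inputs are assumed to be precomputed per-pixel costs in [0, 1], and the weights are placeholders:

import numpy as np

def data_term(color_dev, motion_bdry, close_prob,
              alpha=1.0, beta=1.0, gamma=1.0):
    # D_p(f_p) = alpha*D^S + beta*D^M + gamma*D^P per pixel,
    # for one candidate source frame f_p.
    return alpha * color_dev + beta * motion_bdry + gamma * close_prob

def background_labels(costs):
    # costs: F x H x W data terms for F candidate frames; ignoring the
    # smoothness term, each pixel takes the cheapest source frame.
    return np.asarray(costs).argmin(axis=0)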
Background estimation
Experimental results
Datasets
Towers (11 frames)   City (7 frames)   Train (3 frames)   Market (8 frames)
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 48/64
Background estimation
Experimental results
Agarwala et al.
• State-of-the-art method.
• Requires user intervention to refine results.
• Refined results used as ground truth.
Norm of absolute difference in RGB channels:
Towers   City     Train    Market
0.0551   0.0804   0.0479   0.0603
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 49/64
Background estimation
Experimental results
Independent moving objects
[Figure: original images; our method vs. Agarwala et al.]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 50/64
Background estimation
Conclusions
In this section,
• We have presented a method for background estimation from image
sets containing moving/transient objects.
• The method uses depth information by penalizing close regions in
a cost function.
Our results show that
• Our method significantly outperforms the median filter.
• Our approach is comparable to Agarwala et al.'s method, without
requiring any user intervention.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 51/64
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Pedestrian candidate generation
Problem definition
Pedestrian candidate generation Generating hypotheses to be
evaluated by a pedestrian classifier.
[Gerónimo 2010]
Goal
Exploiting geometric and depth information available on single images
to reduce the number of windows to be further processed.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 53/64
Pedestrian candidate generation
Method
Overview
[Pipeline: (a) original image → (b) geometric information + (c) depth
information → fusion → (d) pedestrian candidate windows]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 54/64
Pedestrian candidate generation
Method
Agglomerative clustering schema
• Regions over the ground surface
• Agglomerating regions while maintaining size coherence w.r.t. depth
(see the sketch after this slide)
[Pipeline: (a) geometric, depth, and spatial information computed on
superpixels of the original image; (b) superpixels merged by
hierarchical clustering using gravity, depth, and size; (c) bounding
boxes surrounding the resulting regions]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 55/64
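A sketch of the clustering and of the depth-size coherence check; the linkage weights, thresholds, and pinhole size model below are illustrative assumptions:

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_superpixels(centroids_xy, depths, depth_weight=5.0, cut=40.0):
    # Merge superpixels that are close in the image AND at similar depth:
    # depth enters the feature vector with a weight, so the linkage
    # distance penalizes merging regions from different depth layers.
    feats = np.column_stack([centroids_xy, depth_weight * depths])
    return fcluster(linkage(feats, method='average'),
                    t=cut, criterion='distance')

def size_coherent(bbox_h_px, depth_m, focal_px, ped_h_m=1.7, tol=0.5):
    # Pinhole model: a pedestrian of height H at depth Z projects to
    # roughly focal * H / Z pixels.
    expected = focal_px * ped_h_m / depth_m
    return abs(bbox_h_px - expected) <= tol * expected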
Pedestrian candidate generation
Experimental results
Dataset
• CVC Pedestrian dataset.
• 15 sequences taken from a stereo rig rigidly mounted on a car
while driving in an urban scenario (4364 frames).
• 7983 manually annotated pedestrians visible at less than 50
meters.
Performance measures
• Number of pedestrian candidates generated.
• True positive rate: TPR = TP / (TP + FN)
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 56/64
Pedestrian candidate generation
Experimental results
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 57/64
Pedestrian candidate generation
Experimental results
Lost pedestrians
[Bar chart of lost pedestrians vs. distance (m) — 0-10: 4%, 10-25: 18%, >25: 78%]
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 58/64
Pedestrian candidate generation
Conclusions
In this section, we have presented:
• A novel monocular method for generating pedestrian candidates.
• It is based on geometric relationships and depth.
Our results show that:
• Our method outperforms all considered methods, since it
significantly reduces the number of candidates.
• It achieves a high TPR.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 59/64
Outline
1 Objectives
2 Coarse depth map estimation
3 Egomotion estimation
4 Background estimation
5 Pedestrian candidate generation
6 Conclusions and future work
Conclusions and future work
Conclusions
• We have proposed a supervised learning approach to classify the
pixels of outdoor images into just four categories (near,
medium-distance, far, and very-far) based on monocular pictorial
cues.
• Compared against the results of a more complex depth map
estimation method, our method achieves better performance while
using computationally inexpensive techniques.
• We have demonstrated the usefulness of our coarse depth maps in
improving the results of egomotion estimation, background
estimation, and pedestrian candidate generation. In each
application, we have contributed novel methods approaching the
problem from a different perspective based on the use of coarse depth.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 61/64
Conclusions and future work
Future work
• Extend our approach to consider more monocular depth cues, such as
occlusion and relative and familiar size, which could improve our
coarse estimation.
• Explore other possible applications of depth information (tracking,
initializing 3D reconstruction algorithms, learning pedestrian
classifiers according to depth, etc.).
• Integrate our depth estimation method into different ADAS modules.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 62/64
Conclusions and future work
Publications
This thesis is based on the following publications:
Conference papers
• Camera Egomotion Estimation in the ADAS Context, D. Cheda, D. Ponsa and
A. M. López, IEEE Conf. Intell. Transp. Syst., 2010.
• Monocular Egomotion Estimation based on Image Matching, D. Cheda, D.
Ponsa and A. M. López, Int. Conf. Pattern Recognit. Appl. and Methods, 2012.
• Monocular Depth-based Background Estimation, D. Cheda, D. Ponsa and A. M.
López, Int. Conf. Comput. Vision Theory Appl., 2012.
• Pedestrian Candidates Generation using Monocular Cues, D. Cheda, D. Ponsa
and A. M. López, IEEE Intell. Vehicles Symposium, 2012.
Journal papers under review
• Monocular Multilayer Depth Segmentation and Applications, D. Cheda, D.
Ponsa and A. M. López, submitted to IJCV, Springer.
• Monocular Visual Odometry Boosted by Monocular Depth Cues, D. Cheda, D.
Ponsa and A. M. López, submitted to ITS, IEEE.
Diego Cheda — Monocular Depth Cues in Computer Vision Applications 63/64
Thanks!
