1. Depth Images Prediction from a Single RGB Image
Using Deep Learning
Deep Learning
May 2017
Soubhi Hadri
Table of Contents:
• Introduction
• Existing Solutions
• Dataset and Model
• Project Code and Results
Introduction
- In 3D computer graphics, a depth map is an image or image channel
that contains information about the distance of the surfaces of
scene objects from a viewpoint.
- RGB-D image: an RGB image and its corresponding depth image.
- A depth image is an image channel in which each pixel stores the
distance between the image plane and the corresponding object in the
RGB image.
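As a rough illustration of these definitions, an RGB-D image can be held as a four-channel array. The image size and the constant depth value below are assumptions chosen only for the example, not values taken from the project.

```python
import numpy as np

# Hypothetical image size (an assumption for illustration).
height, width = 480, 640

# RGB image: three 8-bit color channels per pixel.
rgb = np.zeros((height, width, 3), dtype=np.uint8)

# Depth image: a single channel holding one distance value per pixel (here 2.5, e.g. meters).
depth = np.full((height, width), 2.5, dtype=np.float32)

# An RGB-D image stacks both, so each pixel carries (R, G, B, depth).
rgbd = np.dstack([rgb.astype(np.float32), depth])

print(rgbd.shape)  # (480, 640, 4)
```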
Ways to estimate the depth of objects:
• Stereo camera: a camera with two or more lenses that simulates human binocular vision.
• Depth sensors such as RealSense or Kinect, which capture RGB-D images directly.
• Deep learning.
Deep Learning for depth estimation :
Many recent works estimate the depth map from a single RGB image, for example:
Learning Fine-Scaled Depth Maps from Single RGB Images (7 Feb 2017).
Dataset : NYU Depth V2
The NYU Depth V2 dataset consists of video sequences from a variety of
indoor scenes, recorded by both the RGB and depth cameras of the
Microsoft Kinect.
The dataset consists of:
• 1449 labeled pairs of aligned RGB and depth images (2.8 GB).
• 407,024 new unlabeled frames: raw RGB and depth (428 GB).
• Toolbox: useful functions for manipulating the data and labels.
Different parts of the dataset can be downloaded individually.
Authors: Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus (2012).
For this project:
• Office 1-2 dataset (part of the whole dataset).
• 15 GB after processing RAW data.
• 3522 RGB-D images.
Split the data (3522 images in total):
• 80% (2817 images) for training and validation:
  • Training: 2414 images.
  • Validation: 403 images.
• 20% (705 images) for testing.
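A split with these counts could be produced along the following lines. This is only a sketch: the counts come from the slide, while the random shuffling, the seed and the function name `split_indices` are assumptions, since the slides do not show how the images were assigned to each set.

```python
import numpy as np

# Counts taken from the slide; the shuffling strategy and seed are assumptions.
N_TOTAL, N_TRAIN, N_VAL, N_TEST = 3522, 2414, 403, 705

def split_indices(n_total, n_train, n_val, n_test, seed=0):
    """Return disjoint index arrays for the training, validation and test sets."""
    assert n_train + n_val + n_test == n_total
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_total)
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(N_TOTAL, N_TRAIN, N_VAL, N_TEST)
print(len(train_idx), len(val_idx), len(test_idx))  # 2414 403 705
```

Because the images are frames from video sequences, a purely random split may place near-identical frames in different sets; splitting by sequence would be an alternative worth considering.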
The Model for Depth Estimation:
The model was proposed by Jan Ivanecký in his master's thesis (2016).
He derived his model from Eigen et al., "Predicting Depth, Surface
Normals and Semantic Labels with a Common Multi-Scale
Convolutional Architecture" (17 Dec 2015).
The model consists of three networks:
• Global context network: estimates a rough depth map of the whole scene from the input RGB image.
• Gradient network: estimates horizontal and vertical gradients of the depth map globally, for the whole RGB image.
• Refining network: improves the rough estimate from the global context network, using the gradients estimated by the gradient network and the input RGB image.
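To make "horizontal and vertical gradients of the depth map" concrete, the sketch below computes them from a known depth map with finite differences. This is an assumption about what the gradient targets look like; the slides do not say how the thesis computes them, and the helper name `depth_gradients` is hypothetical.

```python
import numpy as np

def depth_gradients(depth):
    """Return (horizontal, vertical) finite-difference gradients of a 2-D depth map."""
    # np.gradient returns one array per axis: axis 0 is rows (vertical),
    # axis 1 is columns (horizontal).
    grad_vertical, grad_horizontal = np.gradient(depth.astype(np.float64))
    return grad_horizontal, grad_vertical

# Toy depth map: depth grows by 1 per column and is constant along each column.
depth = np.tile(np.arange(5, dtype=np.float64), (4, 1))
gx, gy = depth_gradients(depth)
print(gx[0, 0], gy[0, 0])  # 1.0 0.0
```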
Global context network
[Figure: architecture of the global context network.]
The architecture is derived from AlexNet.
Loss Function:
Root mean squared error in log space (RMS-log).
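A minimal NumPy sketch of this loss, assuming the usual definition (the square root of the mean squared difference between the logarithms of predicted and ground-truth depth). The `eps` term is an assumption added here to keep the logarithm finite; the project's own `loss` function may differ.

```python
import numpy as np

def rms_log(pred, target, eps=1e-6):
    """Root mean squared error between log(pred) and log(target)."""
    # eps is an assumption: it keeps the logarithm finite for zero-valued depths.
    log_diff = np.log(pred + eps) - np.log(target + eps)
    return np.sqrt(np.mean(log_diff ** 2))

target = np.array([[1.0, 2.0], [4.0, 8.0]])
print(rms_log(target, target))        # identical maps give zero error
print(rms_log(2.0 * target, target))  # a uniform factor of 2 gives roughly log(2)
```

Working in log space means the loss penalizes relative rather than absolute depth errors, so an error of 10 cm counts for more on a nearby object than on a distant wall.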
Training the Network:
1- Scale the label (depth) images to [0, 1].
2- Subtract 127 from the input images to center the data (a simple form of normalization).
3- Initialize the convolution layers from an AlexNet pre-trained CNN (transfer
learning).
4- Train the network in batches (batch size = 32) for 35 epochs.
5- Save the session and model at the end of each epoch.
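The two preprocessing steps (1 and 2) might look as follows. The function names are hypothetical, and min-max scaling is an assumption: the slides only state that the labels end up in [0, 1], not how they are scaled.

```python
import numpy as np

def preprocess_inputs(rgb_batch):
    """Center 8-bit RGB images around zero by subtracting 127."""
    return rgb_batch.astype(np.float32) - 127.0

def scale_labels(depth_batch):
    """Scale depth labels to [0, 1] with min-max scaling over the whole batch."""
    # Assumption: min-max scaling; the slides do not say how the scaling is done.
    depth = depth_batch.astype(np.float32)
    d_min, d_max = depth.min(), depth.max()
    # Guard against a constant depth map, which would otherwise divide by zero.
    return (depth - d_min) / max(d_max - d_min, 1e-8)

rgb = np.array([[[0, 127, 255]]], dtype=np.uint8)
depth = np.array([[0.5, 2.5, 4.5]], dtype=np.float32)
print(preprocess_inputs(rgb))  # values -127, 0, 128
print(scale_labels(depth))     # values 0, 0.5, 1
```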
Project Functions :
1- split_data: splits the data and saves it into training/testing/val .npy files.
2- load_data: loads the data from the .npy files.
3- plot_imgs: plots a pair of images.
4- get_next_batch: returns the next batch from the training data.
5- loss: calculates the loss function.
6- model: creates the model (network structure).
7- train: starts the training.
8- evaluate: evaluates new data after restoring the model.
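The data-handling functions could be sketched roughly as below. This is not the project's code: the `BatchIterator` class, the file name `training.npy` and the wrap-around behaviour are assumptions for illustration, and tiny random arrays stand in for the real RGB-D data.

```python
import os
import tempfile
import numpy as np

class BatchIterator:
    """Hypothetical stand-in for get_next_batch: yields consecutive batches and wraps around."""

    def __init__(self, images, labels, batch_size=32):
        self.images, self.labels = images, labels
        self.batch_size = batch_size
        self.position = 0

    def get_next_batch(self):
        start = self.position
        end = min(start + self.batch_size, len(self.images))
        # Restart from the beginning once the data is exhausted (end of an epoch).
        self.position = 0 if end == len(self.images) else end
        return self.images[start:end], self.labels[start:end]

# Tiny random arrays stand in for the real RGB-D data.
images = np.random.rand(70, 8, 8, 3).astype(np.float32)
labels = np.random.rand(70, 8, 8).astype(np.float32)

# Round trip through a .npy file, similar to what split_data and load_data are described to do.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training.npy")
    np.save(path, images)
    loaded = np.load(path)

batches = BatchIterator(loaded, labels, batch_size=32)
sizes = [len(batches.get_next_batch()[0]) for _ in range(4)]
print(sizes)  # [32, 32, 6, 32]
```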
Project Tools and Libraries:
1- TensorFlow.
2- TF-Slim: a lightweight library for defining, training and evaluating complex
models in TensorFlow.
3- TensorBoard.
4- NumPy.
5- Matplotlib.
Project Results:
Explanation:
• The training data is not sufficient.
In Jan's experiment:
• The full NYU dataset was used, plus 3 datasets generated from the original one.
• The network was trained for 100,000 iterations.
This experiment:
• Training took ~26 hours for 30 epochs.
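A back-of-the-envelope comparison of the two training budgets, using the training set size (2414) and batch size (32) given earlier. It assumes that one "iteration" means one batch update and that the last partial batch of each epoch is counted; the batch size used in the thesis is not stated in the slides, so only the iteration counts are compared.

```python
import math

TRAIN_IMAGES = 2414             # training set size from the data split
BATCH_SIZE = 32                 # batch size from the training steps
EPOCHS = 30                     # epochs reported for this experiment
REFERENCE_ITERATIONS = 100_000  # iterations reported for Jan's experiment

batches_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_iterations = batches_per_epoch * EPOCHS

print(batches_per_epoch, total_iterations)                # 76 2280
print(round(REFERENCE_ITERATIONS / total_iterations, 1))  # 43.9
```

Under these assumptions, this experiment ran roughly 44 times fewer updates than the reference, in addition to using far less data.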
Project :
The project code and data will be available on GitHub:
https://github.com/SubhiH/Depth-Estimation-Deep-Learning
Resources:
- https://arxiv.org/pdf/1607.00730.pdf
- http://janivanecky.com/
- http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html