SlideShare ist ein Scribd-Unternehmen logo
1 von 39
Depth Images Prediction from a Single RGB Image
Using Deep learning
Deep Learning
May 2017
Soubhi Hadri
Depth Images Prediction from a Single RGB Image
Table of Contents :
Introduction.1
Existing Solutions.2
Dataset and Model.3
Project Code and Results.1
Introduction
Depth Images Prediction from a Single RGB Image
Introduction
-In 3D computer graphics a depth map is an image or image channel
that contains information relating to the distance of the surfaces of
scene objects from a viewpoint.
-RGB-D image : a RGB image and its corresponding depth image
-A depth image is an image channel in which each pixel relates to a
distance between the image plane and the corresponding object in the
RGB image.
Depth Images Prediction from a Single RGB Image
Introduction
To approximate the depth of objects :
• Stereo camera : camera with two/more lenses to simulate human vision.
• Realsense or Kinect to get RGB-D images
• Deep Learning..!!
Existing Solutions
Depth Images Prediction from a Single RGB Image
Deep Learning for depth estimation :
Recently, there are many works to estimate the depth map for RGB image.
Depth Images Prediction from a Single RGB Image
Deep Learning for depth estimation :
Learning Fine-Scaled Depth
Maps from Single RGB Images.
7 Feb 2017
Recently, there are many works to estimate the depth map for RGB image.
Dataset & Model
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
The NYU-Depth V2 data set is comprised of video sequences from a variety of
indoor scenes as recorded by both the RGB and Depth cameras from the
Microsoft Kinect.
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
The NYU-Depth V2 data set is comprised of video sequences from a variety of
indoor scenes as recorded by both the RGB and Depth cameras from the
Microsoft Kinect.
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
The dataset consists of :
• 1449 labeled pairs of aligned RGB and depth images (2.8 GB).
• 407,024 new unlabeled frames - raw rgb, depth (428 GB).
• Toolbox: Useful functions for manipulating the data and labels.
Different parts of the dataset can be downloaded individually.
Authors : Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus
2012
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
The dataset consists of :
• 1449 labeled pairs of aligned RGB and depth images (2.8 GB).
• 407,024 new unlabeled frames - raw rgb, depth (428 GB).
• Toolbox: Useful functions for manipulating the data and labels.
Different parts of the dataset can be downloaded individually.
Authors : Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus
2012
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
For this project:
• Office 1-2 dataset (part of the whole dataset).
• 15 GB after processing RAW data.
• 3522 RGB-D images.
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
For this project:
• Office 1-2 dataset (part of the whole dataset).
• 15 GB after processing RAW data.
• 3522 RGB-D images.
Split the data:
3522
20%
80% 2817
705
2414
403
Training
Validation
Test
Depth Images Prediction from a Single RGB Image
Dataset : NYU Depth V2
Samples of the data:
Depth Images Prediction from a Single RGB Image
The Model for Depth Estimation:
Model proposed by JaN IVANECK in his master degree thesis -2016.
Depth Images Prediction from a Single RGB Image
The Model for Depth Estimation:
Model proposed by JaN IVANECK in his master degree thesis -2016.
He derived his model from Eigen et al.
Predicting Depth, Surface
Normals and Semantic Labels
with a Common Multi-Scale
Convolutional Architecture.
17 Dec 2015
Depth Images Prediction from a Single RGB Image
The Model for Depth Estimation:
Global context network
estimates the rough
depth map of the whole
scene from the input
RGB image.
Depth Images Prediction from a Single RGB Image
The Model for Depth Estimation:
Gradient network
estimates horizontal and
vertical gradients of the
depth map globally, for
the whole RGB image.
Depth Images Prediction from a Single RGB Image
The Model for Depth Estimation:
Refining network
improves the rough
estimate from the global
context network, utilizing
gradients estimated by the
gradient network and an
input RGB image.
Depth Images Prediction from a Single RGB Image
The Model for Depth Estimation:
Global context network
Architecture of the global context
network
The model is derived from AlexNet.
Depth Images Prediction from a Single RGB Image
Loss Function:
Root mean squared error log(rms-log)
Depth Images Prediction from a Single RGB Image
Training The Network:
1- Scale the output images to [0 1].
2-Subtraction 127 from input images to center the data (kind of normalization).
3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer
Learning).
4-Training the network using batches (batch size = 32) for 35 Epochs.
5- Save the session and model in the end of each Epoch.
Depth Images Prediction from a Single RGB Image
Training The Network:
1- Scale the label images to [0 1].
2-Subtraction 127 from input images to center the data (kind of normalization).
3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer
Learning).
4-Training the network using batches (batch size = 32) for 35 Epochs.
5- Save the session and model in the end of each Epoch.
Depth Images Prediction from a Single RGB Image
Training The Network:
1- Scale the label images to [0 1].
2-Subtraction 127 from input images to center the data (kind of normalization).
3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer
Learning).
4-Training the network using batches (batch size = 32) for 35 Epochs.
5- Save the session and model in the end of each Epoch.
Depth Images Prediction from a Single RGB Image
Training The Network:
1- Scale the label images to [0 1].
2-Subtraction 127 from input images to center the data (kind of normalization).
3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer
Learning).
4-Training the network using batches (batch size = 32) for 35 Epochs.
5- Save the session and model in the end of each Epoch.
Depth Images Prediction from a Single RGB Image
Training The Network:
1- Scale the label images to [0 1].
2-Subtraction 127 from input images to center the data (kind of normalization).
3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer
Learning).
4-Training the network using batches (batch size = 32) for 35 Epochs.
5- Save the session and model in the end of each Epoch.
Depth Images Prediction from a Single RGB Image
Project Functions :
1- split_data : to split and save the data into training/testing/val.npy files.
2- load_data : load data from .npy files.
3- plot_imgs: to plot pair of images.
4- get_next_batch: to get the next batch from training data.
5- loss : calculate the loss function.
6- model: to create model (network structure).
Depth Images Prediction from a Single RGB Image
Project Functions :
7- train: to start training .
8- evaluate: to evaluate new data after restoring the model..
Depth Images Prediction from a Single RGB Image
Project Tools and Libraries:
1- Tensorflow.
2- Slim : lightweight library for defining, training and evaluating complex
models in TensorFlow.
3- Tensorboard.
4- numpy.
5-matplotlib.
Depth Images Prediction from a Single RGB Image
Project Results: 
Training Loss error:
Depth Images Prediction from a Single RGB Image
Project Results: 
Samples of new data:
Depth Images Prediction from a Single RGB Image
Project Results: 
Explanation :
• Training data is not sufficient.
Depth Images Prediction from a Single RGB Image
Project Results: 
Explanation :
• Training data is not sufficient.
In Jan’s experiment:
• Full NYU dataset and 3 dataset generated from the original one.
• Network was trained for 100,000 iterations.
Depth Images Prediction from a Single RGB Image
Project Results: 
Explanation :
• Training data is not sufficient.
In Jan’s experiment:
• Full NYU dataset and 3 dataset generated from the original one.
• Network was trained for 100,000 iterations.
This experiment:
• It took ~26 hours for 30 Epochs.
Depth Images Prediction from a Single RGB Image
Project :
The project code and data will be available on GitHub:
https://github.com/SubhiH/Depth-Estimation-Deep-Learning
Depth Images Prediction from a Single RGB Image
Resources :
-https://arxiv.org/pdf/1607.00730.pdf
-http://janivanecky.com/
-http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

Deep convolutional neural fields for depth estimation from a single image
Deep convolutional neural fields for depth estimation from a single imageDeep convolutional neural fields for depth estimation from a single image
Deep convolutional neural fields for depth estimation from a single imageWei Yang
 
Histogram equalization
Histogram equalizationHistogram equalization
Histogram equalization11mr11mahesh
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Human Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningHuman Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningWei Yang
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningMohamed Loey
 
Depth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors IIDepth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors IIYu Huang
 
Recent Progress in RNN and NLP
Recent Progress in RNN and NLPRecent Progress in RNN and NLP
Recent Progress in RNN and NLPhytae
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningYu Huang
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012Jinwon Lee
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]Dongmin Choi
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationVikas Jain
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classificationWael Badawy
 

Was ist angesagt? (20)

Deep convolutional neural fields for depth estimation from a single image
Deep convolutional neural fields for depth estimation from a single imageDeep convolutional neural fields for depth estimation from a single image
Deep convolutional neural fields for depth estimation from a single image
 
Histogram equalization
Histogram equalizationHistogram equalization
Histogram equalization
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 
Human Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningHuman Pose Estimation by Deep Learning
Human Pose Estimation by Deep Learning
 
Edge detection
Edge detectionEdge detection
Edge detection
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
Depth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors IIDepth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors II
 
Recent Progress in RNN and NLP
Recent Progress in RNN and NLPRecent Progress in RNN and NLP
Recent Progress in RNN and NLP
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learning
 
Hog
HogHog
Hog
 
Deep Learning for Video: Action Recognition (UPC 2018)
Deep Learning for Video: Action Recognition (UPC 2018)Deep Learning for Video: Action Recognition (UPC 2018)
Deep Learning for Video: Action Recognition (UPC 2018)
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012
 
Mask R-CNN
Mask R-CNNMask R-CNN
Mask R-CNN
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classification
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
 

Ähnlich wie Depth estimation using deep learning

SeRanet introduction
SeRanet introductionSeRanet introduction
SeRanet introductionKosuke Nakago
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsPetteriTeikariPhD
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image ProcessingAnkur Nanda
 
Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界Tech Podcast Night
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMeetupDataScienceRoma
 
[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...
[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...
[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...KIMMINHA3
 
Techniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media imagesTechniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media imagesUniversidade de São Paulo
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Universitat Politècnica de Catalunya
 
Depth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningDepth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningYu Huang
 
Optimized Feedforward Network of CNN with Xnor Final Presentation
Optimized Feedforward Network of CNN with Xnor Final PresentationOptimized Feedforward Network of CNN with Xnor Final Presentation
Optimized Feedforward Network of CNN with Xnor Final PresentationIndiana University Bloomington
 
cvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxcvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxPyariMohanJena
 
ppt 20BET1024.pptx
ppt 20BET1024.pptxppt 20BET1024.pptx
ppt 20BET1024.pptxManeetBali
 
Ijmsr 2016-10
Ijmsr 2016-10Ijmsr 2016-10
Ijmsr 2016-10ijmsr
 
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...Savvas Chatzichristofis
 
Coin recognition using matlab
Coin recognition using matlabCoin recognition using matlab
Coin recognition using matlabslmnsvn
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureSanghamitra Deb
 
Surveys of Image Recoginition.ppt
Surveys of Image Recoginition.pptSurveys of Image Recoginition.ppt
Surveys of Image Recoginition.pptSarang Rakhecha
 

Ähnlich wie Depth estimation using deep learning (20)

SeRanet introduction
SeRanet introductionSeRanet introduction
SeRanet introduction
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
 
Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界Open CV - 電腦怎麼看世界
Open CV - 電腦怎麼看世界
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image Processing
 
[DL輪読会]ClearGrasp
[DL輪読会]ClearGrasp[DL輪読会]ClearGrasp
[DL輪読会]ClearGrasp
 
[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...
[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...
[CVPRW 2020]Real world Super-Resolution via Kernel Estimation and Noise Injec...
 
Techniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media imagesTechniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media images
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
 
Depth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningDepth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep Learning
 
Optimized Feedforward Network of CNN with Xnor Final Presentation
Optimized Feedforward Network of CNN with Xnor Final PresentationOptimized Feedforward Network of CNN with Xnor Final Presentation
Optimized Feedforward Network of CNN with Xnor Final Presentation
 
MAJOR PROJECT
MAJOR PROJECT MAJOR PROJECT
MAJOR PROJECT
 
cvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxcvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptx
 
ppt 20BET1024.pptx
ppt 20BET1024.pptxppt 20BET1024.pptx
ppt 20BET1024.pptx
 
Ijmsr 2016-10
Ijmsr 2016-10Ijmsr 2016-10
Ijmsr 2016-10
 
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
 
Coin recognition using matlab
Coin recognition using matlabCoin recognition using matlab
Coin recognition using matlab
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and Future
 
Surveys of Image Recoginition.ppt
Surveys of Image Recoginition.pptSurveys of Image Recoginition.ppt
Surveys of Image Recoginition.ppt
 

Kürzlich hochgeladen

Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier Fernández Muñoz
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTSneha Padhiar
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfBalamuruganV28
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfShreyas Pandit
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptNoman khan
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfManish Kumar
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Erbil Polytechnic University
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSsandhya757531
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communicationpanditadesh123
 

Kürzlich hochgeladen (20)

Designing pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptxDesigning pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptx
 
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptx
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdf
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdf
 
Forming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).pptForming section troubleshooting checklist for improving wire life (1).ppt
Forming section troubleshooting checklist for improving wire life (1).ppt
 
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptxCurve setting (Basic Mine Surveying)_MI10412MI.pptx
Curve setting (Basic Mine Surveying)_MI10412MI.pptx
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communication
 

Depth estimation using deep learning

  • 1. Depth Images Prediction from a Single RGB Image Using Deep learning Deep Learning May 2017 Soubhi Hadri
  • 2. Depth Images Prediction from a Single RGB Image Table of Contents : Introduction.1 Existing Solutions.2 Dataset and Model.3 Project Code and Results.1
  • 4. Depth Images Prediction from a Single RGB Image Introduction -In 3D computer graphics a depth map is an image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint. -RGB-D image : a RGB image and its corresponding depth image -A depth image is an image channel in which each pixel relates to a distance between the image plane and the corresponding object in the RGB image.
  • 5. Depth Images Prediction from a Single RGB Image Introduction To approximate the depth of objects : • Stereo camera : camera with two/more lenses to simulate human vision. • Realsense or Kinect to get RGB-D images • Deep Learning..!!
  • 7. Depth Images Prediction from a Single RGB Image Deep Learning for depth estimation : Recently, there are many works to estimate the depth map for RGB image.
  • 8. Depth Images Prediction from a Single RGB Image Deep Learning for depth estimation : Learning Fine-Scaled Depth Maps from Single RGB Images. 7 Feb 2017 Recently, there are many works to estimate the depth map for RGB image.
  • 10. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect.
  • 11. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect.
  • 12. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 The dataset consists of : • 1449 labeled pairs of aligned RGB and depth images (2.8 GB). • 407,024 new unlabeled frames - raw rgb, depth (428 GB). • Toolbox: Useful functions for manipulating the data and labels. Different parts of the dataset can be downloaded individually. Authors : Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus 2012
  • 13. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 The dataset consists of : • 1449 labeled pairs of aligned RGB and depth images (2.8 GB). • 407,024 new unlabeled frames - raw rgb, depth (428 GB). • Toolbox: Useful functions for manipulating the data and labels. Different parts of the dataset can be downloaded individually. Authors : Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus 2012
  • 14. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 For this project: • Office 1-2 dataset (part of the whole dataset). • 15 GB after processing RAW data. • 3522 RGB-D images.
  • 15. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 For this project: • Office 1-2 dataset (part of the whole dataset). • 15 GB after processing RAW data. • 3522 RGB-D images. Split the data: 3522 20% 80% 2817 705 2414 403 Training Validation Test
  • 16. Depth Images Prediction from a Single RGB Image Dataset : NYU Depth V2 Samples of the data:
  • 17. Depth Images Prediction from a Single RGB Image The Model for Depth Estimation: Model proposed by JaN IVANECK in his master degree thesis -2016.
  • 18. Depth Images Prediction from a Single RGB Image The Model for Depth Estimation: Model proposed by JaN IVANECK in his master degree thesis -2016. He derived his model from Eigen et al. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture. 17 Dec 2015
  • 19. Depth Images Prediction from a Single RGB Image The Model for Depth Estimation: Global context network estimates the rough depth map of the whole scene from the input RGB image.
  • 20. Depth Images Prediction from a Single RGB Image The Model for Depth Estimation: Gradient network estimates horizontal and vertical gradients of the depth map globally, for the whole RGB image.
  • 21. Depth Images Prediction from a Single RGB Image The Model for Depth Estimation: Refining network improves the rough estimate from the global context network, utilizing gradients estimated by the gradient network and an input RGB image.
  • 22. Depth Images Prediction from a Single RGB Image The Model for Depth Estimation: Global context network Architecture of the global context network The model is derived from AlexNet.
  • 23. Depth Images Prediction from a Single RGB Image Loss Function: Root mean squared error log(rms-log)
  • 24. Depth Images Prediction from a Single RGB Image Training The Network: 1- Scale the output images to [0 1]. 2-Subtraction 127 from input images to center the data (kind of normalization). 3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer Learning). 4-Training the network using batches (batch size = 32) for 35 Epochs. 5- Save the session and model in the end of each Epoch.
  • 25. Depth Images Prediction from a Single RGB Image Training The Network: 1- Scale the label images to [0 1]. 2-Subtraction 127 from input images to center the data (kind of normalization). 3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer Learning). 4-Training the network using batches (batch size = 32) for 35 Epochs. 5- Save the session and model in the end of each Epoch.
  • 26. Depth Images Prediction from a Single RGB Image Training The Network: 1- Scale the label images to [0 1]. 2-Subtraction 127 from input images to center the data (kind of normalization). 3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer Learning). 4-Training the network using batches (batch size = 32) for 35 Epochs. 5- Save the session and model in the end of each Epoch.
  • 27. Depth Images Prediction from a Single RGB Image Training The Network: 1- Scale the label images to [0 1]. 2-Subtraction 127 from input images to center the data (kind of normalization). 3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer Learning). 4-Training the network using batches (batch size = 32) for 35 Epochs. 5- Save the session and model in the end of each Epoch.
  • 28. Depth Images Prediction from a Single RGB Image Training The Network: 1- Scale the label images to [0 1]. 2-Subtraction 127 from input images to center the data (kind of normalization). 3-Initialize the convolution layers using AlexNet pre-trained CNN (Transfer Learning). 4-Training the network using batches (batch size = 32) for 35 Epochs. 5- Save the session and model in the end of each Epoch.
  • 29. Depth Images Prediction from a Single RGB Image Project Functions : 1- split_data : to split and save the data into training/testing/val.npy files. 2- load_data : load data from .npy files. 3- plot_imgs: to plot pair of images. 4- get_next_batch: to get the next batch from training data. 5- loss : calculate the loss function. 6- model: to create model (network structure).
  • 30. Depth Images Prediction from a Single RGB Image Project Functions : 7- train: to start training . 8- evaluate: to evaluate new data after restoring the model..
  • 31. Depth Images Prediction from a Single RGB Image Project Tools and Libraries: 1- Tensorflow. 2- Slim : lightweight library for defining, training and evaluating complex models in TensorFlow. 3- Tensorboard. 4- numpy. 5-matplotlib.
  • 32. Depth Images Prediction from a Single RGB Image Project Results:  Training Loss error:
  • 33. Depth Images Prediction from a Single RGB Image Project Results:  Samples of new data:
  • 34. Depth Images Prediction from a Single RGB Image Project Results:  Explanation : • Training data is not sufficient.
  • 35. Depth Images Prediction from a Single RGB Image Project Results:  Explanation : • Training data is not sufficient. In Jan’s experiment: • Full NYU dataset and 3 dataset generated from the original one. • Network was trained for 100,000 iterations.
  • 36. Depth Images Prediction from a Single RGB Image Project Results:  Explanation : • Training data is not sufficient. In Jan’s experiment: • Full NYU dataset and 3 dataset generated from the original one. • Network was trained for 100,000 iterations. This experiment: • It took ~26 hours for 30 Epochs.
  • 37. Depth Images Prediction from a Single RGB Image Project : The project code and data will be available on GitHub: https://github.com/SubhiH/Depth-Estimation-Deep-Learning
  • 38. Depth Images Prediction from a Single RGB Image Resources : -https://arxiv.org/pdf/1607.00730.pdf -http://janivanecky.com/ -http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html