Differentiable Ray Sampling for Neural 3D Representation

This document summarizes a method for single-view 3D reconstruction using differentiable ray sampling. It discusses prior work using 3D or 2D supervision and their limitations. The proposed method uses a neural 3D representation that maps coordinates to occupancy. It introduces differentiable ray sampling to allow end-to-end training with only 2D images. Results on cars and chairs show the method achieves similar or better accuracy compared to prior work, with constant memory usage at high resolutions.

N. H. Shimada
Differentiable Ray Sampling for Neural 3D Representation
Preferred Networks 2019 Research Internship

Single-view 3D reconstruction
・Grasping ・Autonomous driving
[Yan+ ICRA 2018] [Mapillary blog]

Single-view 3D reconstruction
● 3D supervision
○ A large amount of 3D data is needed.
[Kato+ CVPR 2019]
Input (image) → prediction model → Output (3D geometry)

Single-view 3D reconstruction
● 2D supervision
○ End-to-end training: only 2D images.
○ Differentiable renderer is needed.
[Kato+ CVPR 2019]
Input (image) → prediction model → 3D geometry → Rendering → Output (image)

Single-view 3D reconstruction
● 3D Geometry representation

                             Mesh [1]   Voxel [2]   Neural 3D (SRN [3])   Neural 3D (Ours)
initial shape                ✕          ◯           ◯                     ◯
memory vs resolution         ◯          ✕           ◯                     ◯
the number of train views    ◯          ◯           (✕)                   ◯
Accuracy (IoU)               0.71       0.73        -                     ???

1. [Kato+ CVPR 2017]
2. [Tulsiani+ CVPR 2018]
3. [Sitzmann+ arXiv 2019]

DRC (Tulsiani+ CVPR 2017)
Input (image) → Encoder → Decoder → 32³ voxel grid (occupancy) → Rendered image

DRC (Tulsiani+ CVPR 2017)
● Differentiable rendering
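
As a rough illustration of how rendering a voxel occupancy grid can be made differentiable, the sketch below computes, for the occupancy probabilities along a single ray, the probability that the ray stops at each voxel and the probability that it escapes. It follows the ray-termination formulation usually associated with DRC; the slide itself gives no formula, so the function names and the mask cost used here are assumptions for illustration only.

```python
import numpy as np

def ray_termination_probs(occ):
    """occ: occupancy probabilities in [0, 1] of the voxels a ray crosses,
    ordered front to back. Returns (stop, escape)."""
    occ = np.asarray(occ, dtype=float)
    # Probability that the ray reaches voxel i: every earlier voxel is empty.
    reach = np.concatenate(([1.0], np.cumprod(1.0 - occ)[:-1]))
    stop = occ * reach                     # ray stops exactly at voxel i
    escape = float(np.prod(1.0 - occ))     # ray passes through every voxel
    return stop, escape

def mask_ray_loss(occ, mask_pixel):
    """Expected cost for one ray given a mask pixel value (0 or 1).
    Assumed cost: escaping is penalized on foreground pixels,
    stopping is penalized on background pixels."""
    _, escape = ray_termination_probs(occ)
    return mask_pixel * escape + (1.0 - mask_pixel) * (1.0 - escape)

stop, escape = ray_termination_probs([0.1, 0.5, 0.9])
foreground_loss = mask_ray_loss([0.1, 0.5, 0.9], mask_pixel=1.0)
background_loss = mask_ray_loss([0.1, 0.5, 0.9], mask_pixel=0.0)
```

Every operation is a product or sum of occupancies, so the loss is differentiable with respect to each voxel's occupancy.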

DRC (Tulsiani+ CVPR 2017)
[Figure panels: Input (RGB), Ground truth, Prediction]

Ours
DRC (Tulsiani+ CVPR 2017)
● Voxel grid representation as a function: (x_i, y_i, z_i) → (Occupancy)
○ 32³ discrete input
○ Memory increases cubically with higher resolution

Our idea
● Neural 3D representation: (x, y, z) → (Occupancy)
○ Continuous input
○ Constant memory with high resolution
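
The contrast between the two representations can be sketched as follows. This is a minimal illustration, assuming a small fully connected network (3 → 64 → 64 → 1) with random weights; the layer sizes and all names are hypothetical and not taken from the slides. The point is only that the number of stored values of the network does not depend on the resolution at which it is queried, whereas a voxel grid stores res³ values.

```python
import numpy as np

def init_mlp(rng, hidden=64):
    # Hypothetical layer sizes: 3 -> hidden -> hidden -> 1.
    sizes = [3, hidden, hidden, 1]
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def occupancy(params, xyz):
    """Map continuous coordinates of shape (..., 3) to occupancy in (0, 1)."""
    h = np.asarray(xyz, dtype=float)
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)           # ReLU hidden layers
    W, b = params[-1]
    logits = (h @ W + b)[..., 0]
    return 1.0 / (1.0 + np.exp(-logits))         # sigmoid output

def num_params(params):
    return sum(W.size + b.size for W, b in params)

rng = np.random.default_rng(0)
params = init_mlp(rng)

for res in (32, 64, 128):
    axis = np.linspace(-1.0, 1.0, res)
    # Query one z = 0 slice at this resolution; any resolution works.
    xx, yy = np.meshgrid(axis, axis, indexing="ij")
    slice_xyz = np.stack([xx, yy, np.zeros_like(xx)], axis=-1)
    occ_slice = occupancy(params, slice_xyz)
    print(res, "voxel grid values:", res ** 3,
          "network parameters:", num_params(params),
          "slice shape:", occ_slice.shape)
```

Note that only the stored parameters stay constant; evaluating many query points at once still needs memory for intermediate activations, which can be bounded by querying in chunks.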

Ours
● Differentiable ray sampling
[Figure labels: d; Translation probability; Pixel value in mask images (0 / 1)]
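
One way to picture ray sampling against a continuous occupancy function is sketched below. The slides do not spell out the exact sampling or accumulation rule, so everything here is an assumption: depths d are drawn with stratified jitter between hypothetical near and far bounds, a soft sphere stands in for the learned occupancy network, and the pixel value is taken as the probability that the ray is stopped by at least one sample.

```python
import numpy as np

def soft_sphere_occupancy(points, radius=0.5, sharpness=20.0):
    """Stand-in for a learned occupancy function: close to 1 inside a sphere."""
    dist = np.linalg.norm(points, axis=-1)
    return 1.0 / (1.0 + np.exp(-sharpness * (radius - dist)))

def sample_depths(rng, near, far, n_samples):
    """Stratified sampling: one jittered depth d per equal-width bin."""
    edges = np.linspace(near, far, n_samples + 1)
    return edges[:-1] + rng.uniform(size=n_samples) * (edges[1:] - edges[:-1])

def render_mask_pixel(occupancy_fn, origin, direction, depths):
    """Pixel value in [0, 1]: probability that the ray hits something."""
    direction = direction / np.linalg.norm(direction)
    points = origin + depths[:, None] * direction   # (n_samples, 3)
    occ = occupancy_fn(points)
    return 1.0 - np.prod(1.0 - occ)

rng = np.random.default_rng(0)
origin = np.array([0.0, 0.0, -2.0])
depths = sample_depths(rng, near=1.0, far=3.0, n_samples=32)

# A ray through the sphere center and a ray that misses the sphere.
hit = render_mask_pixel(soft_sphere_occupancy, origin,
                        np.array([0.0, 0.0, 1.0]), depths)
miss = render_mask_pixel(soft_sphere_occupancy, origin,
                         np.array([0.0, 1.0, 1.0]), depths)
```

Because the pixel value is a smooth function of the occupancies at the sampled points, comparing it with the 0/1 value of a mask image gives a loss whose gradient reaches the occupancy function.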

Ours
Input (image) → Encoder → Decoder → parameters of the 3D Networks
(x, y, z) → 3D Networks → Rendered image
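
The idea of a decoder that outputs the parameters of the 3D network can be sketched as below. This is only an illustration under stated assumptions: the latent size, the layer sizes, the linear stand-in for the decoder, and all names are hypothetical, since the slides do not give the actual architecture.

```python
import numpy as np

LAYER_SIZES = [3, 32, 32, 1]   # hypothetical 3D network: (x, y, z) -> occupancy

def param_count(sizes):
    return sum(m * n + n for m, n in zip(sizes[:-1], sizes[1:]))

def unpack(flat, sizes):
    """Split a flat parameter vector into per-layer (weight, bias) pairs."""
    params, i = [], 0
    for m, n in zip(sizes[:-1], sizes[1:]):
        W = flat[i:i + m * n].reshape(m, n)
        i += m * n
        b = flat[i:i + n]
        i += n
        params.append((W, b))
    return params

def occupancy_net(params, xyz):
    h = xyz
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)            # ReLU hidden layers
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + b)[..., 0]))

rng = np.random.default_rng(0)
latent = rng.normal(size=16)                      # stand-in for the encoder output
decoder_W = rng.normal(scale=0.1,                 # stand-in for the decoder
                       size=(16, param_count(LAYER_SIZES)))
flat = latent @ decoder_W                         # predicted network parameters
params = unpack(flat, LAYER_SIZES)

query = rng.uniform(-1.0, 1.0, size=(5, 3))       # continuous (x, y, z) points
occ = occupancy_net(params, query)
```

With this layout each input image yields its own set of 3D network parameters, and the shape can then be queried at any coordinate.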

Results
● 1 instance
[Figure panels: Ground truth, Prediction, Diff; Voxelized 3D (sliced image) {prediction, gt, diff}]

         IoU    (DRC)
Car      0.81   (0.73)
Chair    0.53   (0.43)

Results
● Multi-instance (Qualitative)
[Figure panels: Input RGB, Ground truth, Prediction, Diff; categories: Car, Chair]

Results
● Multi-instance (Quantitative)
Accuracy (IoU)   Voxel (DRC [1])   Neural 3D (Ours)
Car              0.73              0.72
Chair            0.43              0.44
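
For reference, IoU between a predicted and a ground-truth voxelized shape can be computed as in the sketch below. The 0.5 threshold and the function name are assumptions; the slides do not state how the occupancy was binarized or at which resolution the shapes were voxelized.

```python
import numpy as np

def voxel_iou(pred_occ, gt_occ, threshold=0.5):
    """Intersection over union of two occupancy grids of the same shape."""
    pred = np.asarray(pred_occ) > threshold
    gt = np.asarray(gt_occ) > threshold
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0          # both grids empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

# Toy 4x4x4 example: two slabs that overlap in one layer.
gt = np.zeros((4, 4, 4))
gt[:2] = 1.0
pred = np.zeros((4, 4, 4))
pred[1:3] = 0.9
iou = voxel_iou(pred, gt)
```

For a continuous representation, the predicted grid would be obtained by querying the occupancy function at the voxel centers before thresholding.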

Results
● Multi-instance (Loss plots)
Car Chair

SRN (Sitzmann+ NIPS 2019)
Input (image) → Encoder → Decoder → parameters of the 3D Networks
(x, y, z) → 3D Networks → SDF (?) → pixel generator → Rendered image
[Figure labels: d0, d1, d2, ..., di]
The rendering part is also a network.
→ 50 images per object are needed for training
