PointNet

Research Fellow
Entrepreneur | Deep Learning | Healthcare | Visual Neuroscience | Ophthalmology
Implementation: initial ‘deep learning’ idea
The .XYZ point cloud is better suited than the reconstructed .obj file for automatic segmentation, due to its higher resolution.
Input point cloud
3D CAD model (www.outsource3dcadmodeling.com): no need to have planar surfaces; sampled too densely
2D CAD model (cadcrowd.com): straightforward from 3D to 2D
Reconstruct 3D
“Deep Learning”: 3D semantic segmentation from point cloud / reconstructed mesh
youtube.com/watch?v=cGuoyNY54kU
arxiv.org/abs/1608.04236: primitive-based deep learning segmentation
The order between semantic segmentation and reconstruction could be swapped.
NIPS 2016: 3D Workshop
Point cloud pipelines are still very early compared to “ordered images”.
Deep learning has proven to be a powerful tool for building models for language (one-dimensional) and image (two-dimensional) understanding. Tremendous efforts have been devoted to these areas; however, applying deep learning to 3D data is still at an early stage, despite its great research value and broad real-world applications. In particular, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications such as augmented reality, autonomous driving, graphics, robotics, medical imaging, neuroscience, and scientific simulations. These problems have drawn the attention of researchers in different fields such as neuroscience, computer vision, and graphics.
The goal of this workshop is to foster interdisciplinary communication among researchers working on 3D data (computer vision and computer graphics) so that more attention from the broader community can be drawn to 3D deep learning problems. Through these studies, new ideas and discoveries are expected to emerge, which can inspire advances in related fields. The workshop is composed of invited talks, oral presentations of outstanding submissions, and a poster session to showcase state-of-the-art results on the topic. In particular, a panel discussion among leading researchers in the field is planned, to provide a common playground for inspiring discussions and stimulating debates.
The workshop will be held on Dec 9 at NIPS 2016 in Barcelona, Spain. http://3ddl.cs.princeton.edu/2016/
ORGANIZERS
● Fisher Yu - Princeton University
● Joseph Lim - Stanford University
● Matthew Fisher - Stanford University
● Qixing Huang - University of Texas at Austin
● Jianxiong Xiao - AutoX Inc.
http://cvpr2017.thecvf.com/ In Honolulu, Hawaii
“I am co-organizing the 2nd Workshop on Visual Understanding for Interaction in conjunction with CVPR 2017. Stay tuned for the details!”
“Our workshop on Large-Scale Scene Understanding Challenge is accepted by CVPR 2017.”
http://3ddl.cs.princeton.edu/2016/slides/su.pdf
PointNet: deep learning for point cloud classification and segmentation
https://github.com/charlesq34/pointnet
https://arxiv.org/abs/1612.00593
Applications of PointNet. We propose a novel deep net architecture that consumes a raw unordered point cloud (a set of points) without voxelization or rendering. It is a unified architecture that learns both global and local point features, providing a simple, efficient and effective approach for a number of 3D recognition tasks.
PointNet Architecture
Our network has three key modules:
1) the max pooling layer as a symmetric function to aggregate information from all the points,
2) a local and global information combination structure,
3) and two joint alignment networks that align both input points and point features.
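To make these three modules concrete, here is a minimal sketch of the basic classification network (a paraphrase in PyTorch, not the authors' TensorFlow code from the linked repository): a shared per-point MLP, max pooling as the symmetric function, and fully connected layers on the pooled global feature. Layer widths follow the paper; the two alignment networks from point 3 are omitted for brevity.

```python
# Minimal PointNet-style classifier sketch (assumptions: PyTorch, no T-Nets,
# layer widths as in the paper's basic classification network).
import torch
import torch.nn as nn

class BasicPointNetCls(nn.Module):
    def __init__(self, num_classes: int = 40):
        super().__init__()
        # Module 1: shared per-point MLP, written as 1x1 convolutions.
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        # Classifier head on the global shape descriptor.
        self.head = nn.Sequential(
            nn.Linear(1024, 512), nn.BatchNorm1d(512), nn.ReLU(),
            nn.Linear(512, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3)
        x = points.transpose(1, 2)                  # (batch, 3, num_points)
        per_point = self.point_mlp(x)               # (batch, 1024, num_points)
        global_feat = per_point.max(dim=2).values   # symmetric max pooling
        return self.head(global_feat)               # (batch, num_classes)

model = BasicPointNetCls()
logits = model(torch.randn(8, 1024, 3))             # 8 clouds of 1024 points each
print(logits.shape)                                  # torch.Size([8, 40])
```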
PointNet symmetry function #1: Multi-layer Perceptron
http://iamaaditya.github.io/2016/03/one-by-one-convolution/
https://github.com/charlesq34/pointnet/blob/master/models/pointnet_cls_basic.py
MLP implemented as a 1x1 2D convolution
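The point that the shared MLP is "implemented as a 1x1 2D convolution" can be checked numerically. The snippet below (illustrative only, not code from the repository) ties the weights of an nn.Linear applied independently to every point to a Conv2d with a 1x1 kernel and confirms both give the same output.

```python
# Per-point Linear layer vs. 1x1 convolution with tied weights (illustration).
import torch
import torch.nn as nn

points = torch.randn(2, 500, 3)                  # (batch, num_points, channels)

linear = nn.Linear(3, 64)
out_linear = linear(points)                      # same weights applied to every point

conv = nn.Conv2d(3, 64, kernel_size=1)
with torch.no_grad():
    conv.weight.copy_(linear.weight.view(64, 3, 1, 1))
    conv.bias.copy_(linear.bias)

# Reshape to (batch, channels, num_points, 1), mirroring the TensorFlow layout.
out_conv = conv(points.transpose(1, 2).unsqueeze(-1)).squeeze(-1).transpose(1, 2)

print(torch.allclose(out_linear, out_conv, atol=1e-5))   # True
```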
PointNet symmetry function #2: Max Pooling
https://www.quora.com/How-is-a-convolutional-neural-network-able-to-learn-invariant-features
Jean Da Rolt, PhD, Computer Engineer, Professor: “After some thought, I do not believe that pooling
operations are responsible for the translation invariant property in CNNs. I believe that invariance (at least to
translation) is due to the convolution filters (not specifically the pooling) and due to the fully-connected layer. In
conclusion, what makes a CNN invariant to object translation is the architecture of the neural network: the
convolution filters and the fully-connected layer.”
Artem Rozantsev, PhD, Computer Vision & Machine Learning: “In addition to the previous answers, standard ConvNets are invariant only to transformations that are present in the training data. However, there are works which made a step towards training networks that are inherently invariant to transformations such as rotation and translation, for example:”
https://arxiv.org/abs/1703.00356,
https://arxiv.org/abs/1612.04642
https://arxiv.org/abs/1512.07108
University College London
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Key to our approach is the use of a single symmetric function, max pooling. Effectively the network learns a set of optimization functions/criteria that select interesting or informative points of the point cloud and encode the reason for their selection. The final fully connected layers of the network aggregate these learnt optimal values into the global descriptor for the entire shape as mentioned above (shape classification) or are used to predict per point labels (shape segmentation).
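As a quick sanity check of this claim (shapes here are arbitrary, not taken from the paper), max pooling over the point dimension yields the same global feature no matter how the points are ordered:

```python
# Permutation invariance of the max-pooled global feature (illustration).
import torch

per_point_features = torch.randn(1, 1024, 2048)      # (batch, K, num_points)

perm = torch.randperm(2048)
shuffled = per_point_features[:, :, perm]             # same points, different order

global_a = per_point_features.max(dim=2).values
global_b = shuffled.max(dim=2).values
print(torch.equal(global_a, global_b))                # True: ordering does not matter
```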
PointNet Combination Structure
(pg. 3)
" Therefore, the model needs to be able to capture local structures from nearby points,
and the combinatorial interactions among local structures"
(pg. 4)
" After computing the global point cloud feature vector, we feed it back to per point
features by concatenating the global feature with each of the point features. Then we
extract new per point features based on the combined point features - this time the per
point feature is aware of both the local and global information"
(pg. 8)
"As discussed in Sec 4.2 (pg. 4), our network computes K (we take K = 1024 in this
experiment) dimension point features for each point and aggregates all the *per-point
local features* via a max pooling layer into a single K-dim vector, which forms the global
shape descriptor."
(pg. 13)
"Normal Estimation In segmentation version of PointNet, local point features and global
feature are concatenated in order to provide context to local points. However, it’s unclear
whether the context is learnt through this concatenation. In this experiment, we
validate our design by showing that our segmentation network can be trained to predict
point normals, a local geometric property that is determined by a point’s neighborhood"
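A sketch of the combination structure described in these quotes: the global descriptor is tiled across all points and concatenated with the per-point features before a small per-point segmentation head. The shapes (64-d local features, 1024-d global feature, 50 part labels) echo the paper's segmentation network but should be treated as assumptions here.

```python
# Global + local feature concatenation for per-point labeling (sketch).
import torch
import torch.nn as nn

batch, num_points = 4, 1024
local_feat = torch.randn(batch, 64, num_points)       # per-point features
global_feat = torch.randn(batch, 1024)                 # max-pooled shape descriptor

# Tile the global feature over the points and concatenate channel-wise.
global_tiled = global_feat.unsqueeze(-1).expand(-1, -1, num_points)
combined = torch.cat([local_feat, global_tiled], dim=1)    # (batch, 1088, num_points)

seg_head = nn.Sequential(
    nn.Conv1d(1088, 512, 1), nn.ReLU(),
    nn.Conv1d(512, 256, 1), nn.ReLU(),
    nn.Conv1d(256, 50, 1),                 # e.g. 50 part labels, predicted per point
)
per_point_logits = seg_head(combined)       # (batch, 50, num_points)
print(per_point_logits.shape)
```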
PointNet Alignment Network
PointNet: (pg. 1)
"Thus we can add a data-dependent
spatial transformer network that
attempts to canonicalize the data before
the PointNet processes them, so as to
further improve the results."
PointNet: (pg. 4)
However, the transformation matrix in the feature space has much higher dimension than the spatial transform matrix (e.g. from 3 × 3 to 64 × 64), which greatly increases the difficulty of optimization. We therefore add a regularization term to our softmax training loss. We constrain the feature transformation matrix to be close to an orthogonal matrix.
We find that by adding the regularization
term, the optimization becomes more
stable and our model achieves better
performance.
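The regularizer in this quote can be written as L_reg = ||I - A Aᵀ||²_F over the predicted 64 × 64 feature transform A. Below is a sketch; the 0.001 weighting mirrors the value reported with the paper, but treat the exact weight and helper names as assumptions.

```python
# Orthogonality regularizer for the feature alignment matrix (sketch).
import torch

def feature_transform_regularizer(A: torch.Tensor) -> torch.Tensor:
    # A: (batch, k, k) predicted feature-alignment matrices, here k = 64.
    k = A.size(1)
    identity = torch.eye(k, device=A.device).unsqueeze(0)
    diff = identity - torch.bmm(A, A.transpose(1, 2))
    return (diff ** 2).sum(dim=(1, 2)).mean()        # squared Frobenius norm, batch mean

A = torch.randn(8, 64, 64)                 # e.g. output of the feature T-Net
reg = feature_transform_regularizer(A)
# total_loss = softmax_loss + 0.001 * reg  # added to the classification objective
```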
In Fig 15 we see that performance grows as we increase the number of points; however, it saturates at around 1K points.
The max layer size plays an important role, increasing the layer
size from 64 to 1024 results in a 2−4% performance gain. It
indicates that we need enough point feature functions to cover
the 3D space in order to discriminate different shapes.
PointNet Modifications, Input Data: increase dimensionality?
PointNet: (pg. 1)
"In the basic setting each point is represented by
just its three coordinates (x, y, z). Additional
dimensions may be added by computing normals
and other local or global features."
Data columns: x, y, z, red, green, blue, no normals
Point clouds can be huge
https://www.we-get-around.com/wegetaround-atlanta-our-blog/2015/10/cubicasa-creates-2d-and-3d-floor-plans-for-matterport-photographers-from-3d-showcase-tours
6-dimensional input data
With the x, y, z coordinates one also obtains R, G, B values (or the CIE LAB colorspace), which are very useful for segmenting objects.
7-dimensional input data
Normals could be obtained too if the camera position were known.
Eurographics Symposium on Geometry Processing 2016, Volume 35
(2016), Number 5 http://dx.doi.org/10.1111/cgf.12983
PointNet: (pg. 13)
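As a sketch of building the 6-dimensional x, y, z, R, G, B input discussed on this slide (the file layout and normalization choices are assumptions, not part of PointNet itself), one could load and scale the columns as below; the first shared-MLP layer would then take 6 input channels instead of 3.

```python
# Building a 6-D per-point input: normalized xyz plus rgb in [0, 1] (sketch).
import numpy as np

def load_xyzrgb(path: str) -> np.ndarray:
    # Assumes a whitespace-separated file with columns: x y z r g b.
    data = np.loadtxt(path)
    xyz, rgb = data[:, :3], data[:, 3:6]

    # Center the cloud and scale it into the unit sphere.
    xyz = xyz - xyz.mean(axis=0)
    xyz = xyz / np.linalg.norm(xyz, axis=1).max()

    rgb = rgb / 255.0                                  # assume 8-bit color values
    return np.hstack([xyz, rgb]).astype(np.float32)    # (num_points, 6)
```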
PointNet Modifications Architecture #1: Uncertainty estimation?
https://arxiv.org/pdf/1703.04977.pdf
http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html
[In the classification pipeline only, not in the segmentation part]
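One hedged way to add the uncertainty estimation asked about here is Monte Carlo dropout in the spirit of the linked Kendall & Gal and Gal references: keep the dropout layers stochastic at test time and aggregate several forward passes. This assumes the classifier contains dropout layers (the classification head does; the segmentation part does not), and the helper names below are hypothetical.

```python
# Monte Carlo dropout over a point cloud classifier (sketch).
import torch
import torch.nn.functional as F

def enable_dropout(model: torch.nn.Module) -> None:
    # Put only the dropout layers in training mode; batch norm statistics stay frozen.
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()

def mc_dropout_predict(model: torch.nn.Module, points: torch.Tensor, passes: int = 20):
    model.eval()
    enable_dropout(model)
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(points), dim=1) for _ in range(passes)])
    mean = probs.mean(dim=0)                                       # predictive mean
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=1)     # uncertainty proxy
    return mean, entropy
```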
PointNet Modifications Architecture #2: component variations?
Nonlinearity Pooling Layer Normalization
In order to make a model invariant to input
permutation, the authors use max pooling
as the simple symmetric function to
aggregate the information from each point.
[In classification] All layers, except the last one, include ReLU and batch normalization.
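To experiment with the component variations this slide asks about, the per-point layer can be factored so that the nonlinearity and normalization are swappable; the helper below is a hypothetical sketch, not something in the repository.

```python
# Configurable per-point building block: 1x1 conv + optional norm + nonlinearity.
import torch.nn as nn

def point_layer(in_ch: int, out_ch: int, act=nn.ReLU, norm: str = "batch") -> nn.Sequential:
    layers = [nn.Conv1d(in_ch, out_ch, kernel_size=1)]
    if norm == "batch":
        layers.append(nn.BatchNorm1d(out_ch))      # default, as in the classification net
    layers.append(act())
    return nn.Sequential(*layers)

# e.g. point_layer(64, 128, act=nn.ELU) to try a different nonlinearity,
# or norm="none" to drop batch normalization entirely.
```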
http://arxiv.org/abs/1604.04112
“One possible future line of work is to embed the network in its
entirety in the frequency domain. In models that employ Fourier
transforms to compute convolutions, at every convolutional layer
the input is FFT-ed and the elementwise multiplication output is
then inverse FFT-ed. These back-and-forth transformations are very
computationally intensive, and as such it would be desirable to
strictly remain in the frequency domain. However, the reason for
these repeated transformations is the application of nonlinearities
in the forward domain: if one were to propose a sensible
nonlinearity in the frequency domain, this would spare us from
the incessant domain switching.”
Our reparameterization is inspired by batch
normalization but does not introduce any
dependencies between the examples in a
minibatch. This means that our method can also
be applied successfully to recurrent models such
as LSTMs and to noise-sensitive applications
such as deep reinforcement learning or
generative models, for which batch
normalization is less well suited.
https://arxiv.org/abs/1602.07868
https://arxiv.org/abs/1605.09332
http://arxiv.org/abs/1512.07108
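As a concrete example of the batch-norm alternative quoted above (Salimans & Kingma, arXiv:1602.07868), a per-point layer could use weight normalization instead, which removes the dependence between minibatch examples; a sketch using torch.nn.utils.weight_norm:

```python
# Weight-normalized 1x1 convolution as a drop-in replacement for conv + batch norm.
import torch.nn as nn
from torch.nn.utils import weight_norm

wn_layer = nn.Sequential(
    weight_norm(nn.Conv1d(64, 128, kernel_size=1)),   # reparameterized weights
    nn.ReLU(),
)
# Unlike batch norm, this introduces no dependence between examples in a minibatch,
# which the quote argues suits recurrent and noise-sensitive models better.
```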
PointNet Modifications Architecture #3: Unsupervised/Semi-supervised extensions?