Structure and Motion - 3D Reconstruction of Cameras and Structure
1. Structure and Motion: 3D Reconstruction of Cameras and Structure
Vision and Perception 2009/2010
Prof.ssa Maria Fiora Pirri
- Filippo Bianchi - Murru Giovanni
- Mariani Marco - Ligios Antonello
2. Assignment
The goal is to estimate the structure of a
3D object (e.g. a 3D point cloud) and the
motion of the cameras acquiring it.
Reconstruction is required for two data sets:
- the public collection of images available
at http://cs.gmu.edu/%7Ekosecka/oldhouse2.zip
- a data set acquired by you (e.g. using a
simple camera)
A sparse reconstruction is sufficient; a metric
reconstruction is required.
3. Problems to be solved
• Derive the structure of a scene from a sequence
of images
• Estimation of Motion of the camera acquiring the
Structure
• Reconstruction for two Data Sets
– a public collection of images
– a personal dataset
4. Troubleshooting (1/2)
How to solve the problem for two images:
Objective *
Given two uncalibrated images compute a metric reconstruction ($P_M$, $P'_M$, $\{X_{Mi}\}$) of the
cameras and scene structure, i.e. a reconstruction that is within a similarity
transformation of the true cameras and scene structure.
Algorithm
(i) Compute a projective reconstruction (P, P', {Xi}):
(a) Compute the fundamental matrix from point correspondences xi <-> x’i
between the images.
(b) Camera retrieval: compute the camera matrices P, P' from the fundamental
matrix.
(c) Triangulation: for each point correspondence xi <-> x’i, compute the point Xi
in space that projects to these two image points.
(ii) Rectify the projective reconstruction to metric:
• either Direct method: Compute the homography H such that $X_{Ei} = H X_i$ from five or
more ground control points $X_{Ei}$ with known Euclidean positions. Then the metric
reconstruction is
$P_M = P H^{-1}$, $P'_M = P' H^{-1}$, $X_{Mi} = H X_i$
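As a quick check on the direct method (a worked identity added for illustration, not taken from the slides), the rectified cameras and points reproject to the same image points as the projective ones, because H cancels:

```latex
\[
P_M X_{Mi} = (P H^{-1})(H X_i) = P X_i = x_i,
\qquad
P'_M X_{Mi} = (P' H^{-1})(H X_i) = P' X_i = x'_i .
\]
```

This is why any invertible H leaves the image measurements unchanged, and why extra information (here, ground control points) is needed to pick the H that gives a metric result.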
5. Troubleshooting (2/2)
• or Stratified method:
(a) Affine reconstruction: Compute the plane at infinity, $\pi_\infty$, then upgrade the
projective reconstruction to an affine reconstruction with the homography
$$H = \begin{pmatrix} I \mid 0 \\ \pi_\infty^T \end{pmatrix}$$
(b) Metric reconstruction: Compute the image of the absolute conic, $\omega$,
and then upgrade the affine reconstruction to a metric reconstruction with
the homography
$$H = \begin{pmatrix} A^{-1} & 0 \\ 0^T & 1 \end{pmatrix}$$
where A is obtained by Cholesky factorization from the equation
$A A^T = (M^T \omega M)^{-1}$, and M is the first 3x3 submatrix of the camera in the affine
reconstruction for which $\omega$ is computed.
*Multiple View Geometry, Hartley & Zisserman
7. Example of Euclidean reconstruction using the Rubik set
• This is an example of the Euclidean reconstruction performed using the Rubik set.
• The points are taken manually in order to verify the correctness of the output.
• The Euclidean output represents the points after the projective, affine and
metric reconstructions are performed.
8. What is different in our case? (1/2)
• We don't have two images, but a set of images.
• We have to use the stratified method:
– transition from the projective reconstruction to the metric
reconstruction through the affine one
9. What is different in our case? (2/2)
• The approach to reconstruction is to begin with a
projective reconstruction and then to refine it
progressively to an affine and finally a metric
reconstruction, if possible.
• Of course, affine and metric reconstruction are not
possible without further information about the scene.
10. Projective Reconstruction
Projective reconstruction can be divided into three
phases:
• Compute the fundamental Matrix from point
correspondences
• Compute the camera matrices P and P1 from the
fundamental matrix
• Triangulation
12. Image acquisition
• This method reads the images from a given folder in the
order in which they are stored. The images are read directly
with the function imread and saved in a cell array.
• Order:
In the Old House data set the images are already ordered, but this is
not always the case. To handle unordered sets we use a dedicated function:
getCorrOrd (instead of getCorr)
13. getCorrOrd
• sorts the images according to the correspondences found by
the match function, which matches the keypoints
extracted by SIFT
• Result: the pictures are ordered by match
• images similar to each other are adjacent
• side effects:
• computational time increases
• small details may be lost in exchange for greater
precision on the most significant elements of the image
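The slides do not show how getCorrOrd works internally. The following Python sketch shows one plausible strategy, assuming a symmetric table of match counts between all image pairs is already available. The name `order_by_matches` and the greedy chaining rule are assumptions made for illustration; this is not the authors' Matlab code.

```python
def order_by_matches(match_counts, start=0):
    """Greedy chain: repeatedly append the unused image that shares
    the most matches with the last image placed in the sequence."""
    n = len(match_counts)
    order = [start]
    unused = set(range(n)) - {start}
    while unused:
        last = order[-1]
        # Most similar remaining image; ties are broken by lowest index.
        best = max(sorted(unused), key=lambda j: match_counts[last][j])
        order.append(best)
        unused.remove(best)
    return order


# Hypothetical match counts for four images (symmetric, zero diagonal).
counts = [
    [0, 12, 85, 30],
    [12, 0, 20, 90],
    [85, 20, 0, 60],
    [30, 90, 60, 0],
]
print(order_by_matches(counts))  # [0, 2, 3, 1]
```

Filling such a table requires matching every pair of images, which is consistent with the increase in computational time noted above.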
14. Correspondence Match
• The SIFT algorithm is used to find the keypoints, which are
defined by:
– Invariant descriptor of each keypoint
– Localization (vector containing the row, column, orientation and scale of the
keypoint)
• SIFT extracts the features from all images and they are stored in two
vectors.
• For each new image the features are compared with the features of the
preceding image (in the established order),
previously stored in the vectors.
• The comparison is performed using dot products between unit
vectors rather than Euclidean distances.
• A correspondence between the images is accepted only
when the ratio of vector angles (inverse cosine of the dot product
of the descriptors) of the nearest to the second nearest
neighbor is less than 0.6
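The acceptance rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming the descriptors are already unit-length lists of floats; the function names and the toy 3-element descriptors are hypothetical (real SIFT descriptors have 128 elements).

```python
import math


def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def match_descriptors(desc_a, desc_b, ratio=0.6):
    """Return (i, j) pairs where descriptor i of image A matches j of image B."""
    matches = []
    for i, da in enumerate(desc_a):
        angles = []
        for j, db in enumerate(desc_b):
            dot = sum(x * y for x, y in zip(da, db))
            dot = max(-1.0, min(1.0, dot))  # guard acos against rounding
            angles.append((math.acos(dot), j))
        angles.sort()
        # Accept only if the nearest angle is clearly smaller than the second.
        if len(angles) >= 2 and angles[0][0] < ratio * angles[1][0]:
            matches.append((i, angles[0][1]))
    return matches


a = [unit([1.0, 0.0, 0.0]), unit([1.0, 1.0, 0.0])]
b = [unit([0.99, 0.05, 0.0]), unit([0.0, 0.0, 1.0]), unit([0.0, 1.0, 0.0])]
print(match_descriptors(a, b))  # [(0, 0)]
```

The second descriptor of `a` is almost equally close to two candidates in `b`, so the ratio test rejects it as ambiguous.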
15. SIFT vs HARRIS
We need to find points that can be matched across different images.
• SIFT: greater robustness to rotation, translation, scaling
and changes in brightness. Using Lowe's implementation
in C (with its Matlab interface) reduces the computation
time.
• HARRIS: computationally cheap to find, but less robust, with a
high number of false matches
17. Correspondence Filtering
• The correspondences are filtered using the RANSAC algorithm
(RANdom SAmple Consensus).
The keypoint correspondences
determined by the matching algorithm
can be divided into outliers and inliers. The
outliers are discarded because of their
numeric distance from the rest of the
data.
RANSAC uses a threshold parameter. This is chosen empirically: a small value
implies greater precision, because more outliers will be discarded.
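The effect of the threshold can be seen in a small Python sketch. To keep it short, it applies RANSAC to 2D line fitting instead of fundamental-matrix estimation; the function name `ransac_line`, the data and the two thresholds are all assumptions made for illustration.

```python
import random


def ransac_line(points, threshold, iterations=200, seed=0):
    """Fit a 2D line with RANSAC and return the inliers of the best hypothesis.

    Each hypothesis is the line a*x + b*y + c = 0 through two random points;
    a point is an inlier if its distance to that line is <= threshold.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # degenerate sample (coincident points)
        c = -(a * x1 + b * y1)
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


line_pts = [(x, 2 * x + 1) for x in range(10)]   # points on y = 2x + 1
outliers = [(1, 9), (4, -3), (7, 30), (8, 2)]    # gross mismatches
data = line_pts + outliers

strict = ransac_line(data, threshold=0.5)
loose = ransac_line(data, threshold=3.0)
print(len(strict), len(loose))
```

With the small threshold only the ten points on the line survive, while the larger threshold also lets at least one mismatch through.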
18. Fundamental Matrix F
• It is the algebraic representation of epipolar geometry, since Fx
describes the line on which the corresponding point x' in the other
image must lie.
• It relates each pair of corresponding points in the two images.
• The algorithm implemented in Peter Kovesi's function is used to
estimate the RANSAC-filtered correspondences and the F matrix
related to them.
$x'^T F x = 0$
19. The image shows that, given a pair of images, for each point x in one image
there exists a corresponding epipolar line l' in the other image. Any point x' in
the second image matching the point x must lie on the epipolar line l'.
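The epipolar constraint can be checked numerically with a short Python sketch. It assumes a toy configuration: a pure camera translation with identical intrinsics, for which the fundamental matrix reduces to the skew-symmetric matrix of the epipole. The epipole and the points are invented values.

```python
def skew(e):
    """Skew-symmetric matrix [e]_x, so that skew(e) times v equals e cross v."""
    return [[0, -e[2], e[1]],
            [e[2], 0, -e[0]],
            [-e[1], e[0], 0]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


e = [3, 1, 1]          # assumed epipole (homogeneous image point)
F = skew(e)            # fundamental matrix for a pure translation
x = [1, 2, 1]          # point in the first image
line = matvec(F, x)    # epipolar line l' = F x in the second image

x_match = [4, 3, 2]    # a point on the line joining e and x
x_other = [0, 0, 1]    # an unrelated point

print(line)                 # [-1, -2, 5]
print(dot(x_match, line))   # 0: satisfies x'^T F x = 0
print(dot(x_other, line))   # 5: violates the constraint
```

Note that the epipole itself also lies on the line (`dot(e, line)` is 0), since every epipolar line passes through the epipole.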
20. The GetF Function
• After computing the correspondences, this function is used for estimating the
fundamental matrix and the epipoles between each pair of corresponding images.
• It constructs a cell array of size N-1, where N is the number of images.
• The matrix F is directly used only for estimating the second camera:
P2 = [skewsimm(e1)*F | e1]
• In any case, because RANSAC was used for estimating the matrix F, through this
procedure it is possible to obtain a set of inliers and outliers for each pair of
corresponding images, which are saved in corrispondencesR.
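A minimal Python sketch of the camera formula above, with the first camera taken as [I | 0]. Following Hartley & Zisserman, the epipole used is the one in the second image, i.e. the left null vector of F. The numeric values of `e`, `M` and `X` are invented, and the helper names are hypothetical.

```python
def skew(e):
    """Skew-symmetric matrix [e]_x."""
    return [[0, -e[2], e[1]],
            [e[2], 0, -e[0]],
            [-e[1], e[0], 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def second_camera(F, e):
    """Return the 3x4 camera [ [e]_x F | e ]."""
    S = matmul(skew(e), F)
    return [S[i] + [e[i]] for i in range(3)]


# Build a valid F with a known left null vector e: F = [e]_x M.
e = [3, 1, 1]
M = [[2, 0, 1], [1, 1, 0], [0, 1, 3]]
F = matmul(skew(e), M)

P1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
P2 = second_camera(F, e)

X = [1, 2, 3, 1]                      # an arbitrary 3D point
x1, x2 = matvec(P1, X), matvec(P2, X)
residual = sum(a * b for a, b in zip(x2, matvec(F, x1)))
print(residual)  # 0: the two projections satisfy x'^T F x = 0
```

The check confirms that the camera pair is compatible with F: any 3D point projects to a pair of image points that satisfies the epipolar constraint.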
21. 3D points: estimation and updating
• The first estimation of the 3D points is derived from the first 2 images
• For every remaining image:
- Add new 3D points
- Update the older ones
• We introduce a new row in the array of 3D points, which holds a new parameter
that counts the number of images that update every single point.
Accuracy and precision will increase
• We introduce a new threshold parameter used to remove noisy
associations (with the elimina_associazioni function) to reduce the
error
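The slides do not give the update rule, so the following Python sketch is only one possible reading. It assumes that each point is stored as [x, y, z, count], that a new estimate is merged as a running mean, and that the threshold is a maximum distance between the stored point and the new estimate. The names `update_point` and `max_dist` are hypothetical and do not reproduce the elimina_associazioni function.

```python
import math


def update_point(point, estimate, max_dist):
    """Merge a new 3D estimate into point = [x, y, z, count].

    Estimates farther than max_dist from the stored position are treated
    as noisy associations and ignored.
    """
    x, y, z, count = point
    if math.dist((x, y, z), estimate) > max_dist:
        return point
    total = count + 1
    merged = [(c * count + e) / total for c, e in zip((x, y, z), estimate)]
    return merged + [total]


p = [1.0, 2.0, 3.0, 1]                                 # seen in one image pair
p = update_point(p, (1.2, 2.0, 3.0), max_dist=0.5)     # accepted, count -> 2
p = update_point(p, (9.0, 9.0, 9.0), max_dist=0.5)     # rejected as noisy
print(p)
```

Keeping the count next to the coordinates lets every new view be weighted correctly without storing all previous estimates.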
22. Affine Reconstruction (1/4)
• The essence of affine reconstruction is to
locate the plane at infinity by some means
• We need to identify the homography
matrix H that allows us to
transform the 3D points (coming from the
projective computation) and the camera
matrices P and P1.
23. Affine Reconstruction (2/4)
We identify the plane at infinity by identifying 3 vanishing points in space.
To compute it we use the parallel lines method:
1) Take 2 lines from the first image that are parallel in reality. This pair of
lines identifies a vanishing point in 2D, named v
2) Take a line l' from the second image that corresponds to one of the two
parallel lines
24. Affine Reconstruction (3/4)
• The corresponding vanishing point in the
second image, v', can be computed as the
intersection of l' and the epipolar line Fv
derived from the vanishing point v
computed in the first image.
• Finding the null space of the matrix
$$A = \begin{pmatrix} [v]_\times P \\ l'^T P' \end{pmatrix}$$
is equivalent to finding the 3D
vanishing point that
corresponds to v
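The 2D part of this procedure uses only cross products of homogeneous vectors: the line through two points is their cross product, and so is the intersection of two lines. A minimal Python sketch follows, with invented image points and an assumed pure-translation F; it does not cover the final null-space step.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# First image: two lines that are parallel in the scene, each given by 2 points.
l1 = cross([0, 0, 1], [2, 1, 1])
l2 = cross([0, 2, 1], [2, 2, 1])
v = cross(l1, l2)                      # vanishing point in image 1
print(v)                               # [-8, -4, -2], i.e. the point (4, 2)

# Assumed fundamental matrix (pure translation with epipole (3, 1, 1)).
F = [[0, -1, 1], [1, 0, -3], [-1, 3, 0]]
l_prime = [0, 1, -3]                   # corresponding line in image 2 (y = 3)

v_prime = cross(l_prime, matvec(F, v))  # intersection of l' with the line Fv
print(v_prime)
```

The result is defined only up to scale; dividing by the last coordinate gives pixel coordinates when that coordinate is non-zero.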
25. Affine Reconstruction (4/4)
• Hence, repeating this procedure 3 times, we can find 3 vanishing points in
space that uniquely identify the plane at infinity.
• Now, using the plane at infinity, we can calculate the affine homography:
$$H = \begin{pmatrix} I \mid 0 \\ \pi_\infty^T \end{pmatrix}$$
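A minimal Python sketch of this step, assuming the three 3D vanishing points are already known in the projective frame (the values below are invented). The plane is obtained from the signed 3x3 minors of the 3x4 matrix of points, which is a standard-library substitute for a null-space computation by SVD.

```python
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def plane_through(X1, X2, X3):
    """Plane pi (4-vector) with pi . Xi = 0 for three homogeneous 3D points."""
    rows = [X1, X2, X3]
    pi = []
    for c in range(4):
        minor = [[r[j] for j in range(4) if j != c] for r in rows]
        pi.append((-1) ** c * det3(minor))
    return pi


def affine_homography(pi):
    """H = [I | 0; pi^T], with pi scaled so that its last entry is 1."""
    p = [x / pi[3] for x in pi]
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], p]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# Three assumed vanishing points, all lying on the plane (1, 2, 3, 1).
points = [[1, 0, 0, -1], [0, 1, 0, -2], [0, 0, 1, -3]]
H = affine_homography(plane_through(*points))
print(H[3])                               # [1.0, 2.0, 3.0, 1.0]
print([matvec(H, X)[3] for X in points])  # last coordinates are all 0
```

After applying H, the three vanishing points have last coordinate zero, so they lie on the plane at infinity in its canonical position, which is the defining property of an affine reconstruction. The scaling assumes the last entry of the plane is non-zero.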
27. Under a pure translational camera motion, 3D points appear to slide along parallel
rails. The images of these parallel lines intersect in a vanishing point
corresponding to the translation direction. The epipole e is the vanishing point.
Figure 1.2
28. Metric Reconstruction (1/5)
• The key to metric reconstruction is the identification of the absolute conic.
• Compute the IAC (image of the absolute conic), ω, and then upgrade the affine
reconstruction to a metric one with the homography
$$H = \begin{pmatrix} A^{-1} & 0 \\ 0^T & 1 \end{pmatrix}$$
• The projection of the absolute conic into an image depends only on the
calibration matrix of the camera and not on its position and orientation.
• Usually the images are taken with the same camera, which implies that both
cameras P and P1 have the same calibration matrix. Hence ω = ω1, i.e. the IAC
is the same in both images.
29. Metric Reconstruction (2/5)
• The constraints arising from scene
orthogonality are used
• 5 constraints are needed
• 3 can be computed from
orthogonal pairs of parallel lines
• 2 can be retrieved supposing
zero skew and unit aspect ratio
• w2=0
• w1=w3.
• The algorithm used to compute ω
is the one explained in the paper
"Creating Architectural Models from Images", by David Liebowitz,
Antonio Criminisi and Andrew Zisserman, cited on the right.
30. Metric Reconstruction (3/5)
The 2 constraints related to zero skew and unit aspect ratio are
equivalent to those expressed by square pixels:
ω₁₂ = ω₂₁ = 0
ω₁₁ = ω₂₂
31. These 3 pairs of parallel lines are mutually
orthogonal to each other and we can say
that 3 constraints arise from this scene:
v₁T ω v₂ = 0
v₁T ω v₃ = 0
v₂T ω v₃ = 0
First pair of parallel lines
Second pair of parallel lines
Third pair of parallel lines
32. Metric Reconstruction (4/5)
• The two conditions arising from square pixels can be written as rows in a
matrix Ā:
a₄ = [0 1 0 0 0 0]
a₅ = [1 0 -1 0 0 0]
• a₁, a₂, a₃ are the rows corresponding to the constraints arising from the
vanishing points generated by orthogonal pairs of parallel lines, expressed in
the way illustrated in the paper, and in the generic algorithm on the right.
Ā = [a₁; a₂; a₃; a₄; a₅]
• To find the null space of Ā:
– apply svd(Ā) and take the last column of V
– this is equivalent to solving Ā w = 0
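The linear system can be illustrated with a small Python sketch. It assumes the ordering w = (w1, ..., w6) with ω = [[w1, w2, w4], [w2, w3, w5], [w4, w5, w6]], which is consistent with the rows a₄ and a₅ above. To stay within the standard library it replaces the SVD with exact Gaussian elimination on fractions, and the three vanishing points are synthetic values generated from an invented calibration (focal length 2, principal point (1, 1)).

```python
from fractions import Fraction


def orthogonality_row(v, u):
    """Row a such that a . w equals v^T omega u (w ordered as described above)."""
    return [v[0] * u[0], v[0] * u[1] + v[1] * u[0], v[1] * u[1],
            v[0] * u[2] + v[2] * u[0], v[1] * u[2] + v[2] * u[1], v[2] * u[2]]


def null_vector(rows):
    """One non-zero w with rows . w = 0, assuming a 1-dimensional null space."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols, pivots, r = len(m[0]), [], 0
    for c in range(n_cols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = next(c for c in range(n_cols) if c not in pivots)
    w = [Fraction(0)] * n_cols
    w[free] = Fraction(1)
    for row, c in zip(m, pivots):
        w[c] = -row[free]
    return w


# Synthetic vanishing points of three mutually orthogonal directions.
v1, v2, v3 = [4, 6, 2], [2, 0, -2], [5, -3, 1]
A_bar = [orthogonality_row(v1, v2),
         orthogonality_row(v1, v3),
         orthogonality_row(v2, v3),
         [0, 1, 0, 0, 0, 0],     # a4: zero skew
         [1, 0, -1, 0, 0, 0]]    # a5: unit aspect ratio
w = null_vector(A_bar)
w = [x / w[0] for x in w]        # fix the free scale so that w1 = 1
omega = [[w[0], w[1], w[3]], [w[1], w[2], w[4]], [w[3], w[4], w[5]]]
print([int(x) for x in w])       # [1, 0, 1, -1, -1, 6]
```

With noisy measurements the system has no exact solution, which is why the SVD (a least-squares null vector) is the practical choice.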
33. Metric Reconstruction (5/5)
• The knowledge of w implies the knowledge of ω
• Hence we can compute A (which is not the matrix Ā used in the
SVD step), such that:
– $A A^T = (M^T \omega M)^{-1} = Z$
– where M = P(1:3, 1:3)
• To compute A we use the Cholesky decomposition:
– A = chol(Z, 'lower')
• In fact this returns the lower-triangular Cholesky factor A that satisfies $A A^T = Z$
• Using A we can build the metric homography that
upgrades the affine reconstruction to the metric
reconstruction:
$$H = \begin{pmatrix} A^{-1} & 0 \\ 0^T & 1 \end{pmatrix}$$
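The last step can be sketched in Python as follows. This is a minimal illustration with hand-written 3x3 helpers standing in for Matlab's `chol` and matrix inverse; the values of `omega` and `M` are invented, and the function names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(r) for r in zip(*m)]


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def cholesky_lower(z):
    """Lower-triangular L with L L^T = z (z symmetric positive definite)."""
    n = len(z)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(z[i][i] - s)
            else:
                L[i][j] = (z[i][j] - s) / L[j][j]
    return L


def metric_homography(omega, M):
    """H = [A^-1, 0; 0^T, 1] with A A^T = (M^T omega M)^-1."""
    Z = inv3(matmul(matmul(transpose(M), omega), M))
    A = cholesky_lower(Z)
    A_inv = inv3(A)
    H = [row + [0.0] for row in A_inv] + [[0.0, 0.0, 0.0, 1.0]]
    return H, A, Z


omega = [[1, 0, -1], [0, 1, -1], [-1, -1, 6]]   # assumed IAC
M = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]           # assumed left 3x3 block of P
H, A, Z = metric_homography(omega, M)
for row in H:
    print([round(x, 4) for x in row])
```

The factorization only works if Z is positive definite; with noisy vanishing points the estimated ω may fail this condition, in which case the constraints need to be re-measured.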