This document discusses various mathematical tools used for monocular model-based 3D tracking of rigid objects, including camera representation using the pinhole camera model and calibration matrix, methods for estimating the external camera parameters matrix like DLT and PnP, and techniques for pose estimation, robust estimation, and Bayesian tracking. It covers topics like camera parameterization using Euler angles, quaternions and exponential maps, and algorithms for minimizing reprojection error like linear least squares and Newton-based methods. It also describes M-estimators and RANSAC for robust estimation in the presence of outliers.
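5. 2.1.1 The Perspective Projection Model: a 3D point M, expressed in world coordinates, is mapped to a 2D point m in image coordinates (in the image coordinate system) by the 3x4 projection matrix P, up to a scale factor s: s m = P M, with m and M written in homogeneous coordinates.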
6. 2.1.2 The Camera Calibration Matrix: the calibration matrix K collects the internal parameters: the focal length, the principal point, the skew parameter, and the number of pixels per unit distance in the u and v directions.
7. 2.1.2 The Camera Calibration Matrix: projection onto the image plane, which lies at a distance equal to the focal length from the camera center.
8. 2.1.2 The Camera Calibration Matrix: projection to image (pixel) coordinates, using the principal point (the center of the image plane) and the number of pixels per unit distance in the u and v directions.
9. 2.1.2 The Camera Calibration Matrix: skew and field of view. The remaining internal parameter is referred to as the skew. Usually the image plane size and the field of view are assumed to be fixed, but not the focal length.
10. 2.1.3 The External Parameters Matrix: the 3x4 external parameters matrix [R | t] maps world coordinates to camera coordinates, where R is the rotation matrix and t the translation vector. A point M_w in the world coordinate system becomes M_c = R M_w + t in the camera coordinate system.
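The internal and external parameters described so far combine into a single projection P = K [R | t]. The following is a minimal NumPy sketch of that composition, not code from the slides: the function names, the symbols ku and kv for the pixel densities, and all numeric values are illustrative assumptions.

```python
import numpy as np

def calibration_matrix(f, ku, kv, u0, v0, skew=0.0):
    """Build K from focal length f, pixel densities ku and kv (pixels per
    unit distance along u and v), principal point (u0, v0), and skew."""
    return np.array([[ku * f, skew, u0],
                     [0.0, kv * f, v0],
                     [0.0, 0.0, 1.0]])

def project(K, R, t, M_world):
    """Project N world points (N x 3) to pixel coordinates (N x 2)."""
    M_world = np.atleast_2d(np.asarray(M_world, dtype=float))
    P = K @ np.hstack([R, np.reshape(t, (3, 1))])             # 3x4 projection matrix
    M_h = np.hstack([M_world, np.ones((len(M_world), 1))])    # homogeneous 3D points
    m_h = (P @ M_h.T).T                                       # homogeneous image points
    return m_h[:, :2] / m_h[:, 2:3]                           # divide by the scale factor

# Hypothetical camera: focal length 0.01 with 80000 pixels per unit distance,
# placed 5 units in front of the world origin with no rotation.
K = calibration_matrix(f=0.01, ku=80000.0, kv=80000.0, u0=320.0, v0=240.0)
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
pixels = project(K, R, t, [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0]])
```

With these values the world origin lands on the principal point, since it lies on the optical axis.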
12. 2.1.4 Estimating the Camera Calibration Matrix: the internal parameters are assumed to be fixed. Calibration makes use of a calibration pattern of known size placed inside the field of view, which provides correspondences between the 3D points and the 2D image points.
19. 2.3 Estimating the External Parameters Matrix: estimated camera positions (when the internal parameters are known)
20. 2.3.1 How many Correspondences are necessary? n = 3 known correspondences produce up to 4 possible solutions (the P3P problem); n = 4 or 5 known correspondences in general position still produce 2 possible solutions; n >= 4 known correspondences with coplanar points produce a unique solution; n >= 6 known correspondences produce a unique solution.
21. 2.3.2 The Direct Linear Transformation (DLT): used to estimate the whole matrix P by solving a linear system, even when the internal parameters are not known. Each correspondence gives rise to two linearly independent equations.
22. 2.3.2 The Direct Linear Transformation (DLT): stacking all the equations into a matrix B yields the linear system B p = 0, where p is the vector made of the coefficients of P.
23. 2.3.2 The Direct Linear Transformation (DLT): the solution p (the vector of coefficients of P) is the eigenvector of B^T B corresponding to its smallest eigenvalue. At least 6 correspondences must be known. For 3D tracking, a calibrated camera is used and only its orientation and position are estimated.
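The DLT steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the slides' implementation: the function name, the synthetic points, and the matrix P_true are made up, and the data are noise-free and not normalized. The eigenvector of B^T B with the smallest eigenvalue is obtained as the last right singular vector of B, which is the same vector and is usually computed more stably.

```python
import numpy as np

def dlt_projection_matrix(M, m):
    """Estimate the 3x4 matrix P (up to scale) from n >= 6 correspondences.

    M: (n, 3) array of 3D points, m: (n, 2) array of image points.
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(np.asarray(M, float), np.asarray(m, float)):
        Mh = [X, Y, Z, 1.0]
        # Each correspondence contributes two linearly independent equations.
        rows.append(Mh + [0.0] * 4 + [-u * c for c in Mh])
        rows.append([0.0] * 4 + Mh + [-v * c for c in Mh])
    B = np.array(rows)                      # shape (2n, 12)
    # Last right singular vector of B = eigenvector of B^T B with the
    # smallest eigenvalue.
    _, _, Vt = np.linalg.svd(B)
    return Vt[-1].reshape(3, 4)

# Synthetic check with a hypothetical projection matrix and random points.
rng = np.random.default_rng(0)
M = rng.uniform(-1.0, 1.0, size=(8, 3))
P_true = np.array([[800.0, 0.0, 320.0, 1600.0],
                   [0.0, 800.0, 240.0, 1200.0],
                   [0.0, 0.0, 1.0, 5.0]])
m_h = (P_true @ np.hstack([M, np.ones((8, 1))]).T).T
m = m_h[:, :2] / m_h[:, 2:3]

P_est = dlt_projection_matrix(M, m)
P_est = P_est * (P_true[2, 3] / P_est[2, 3])    # remove the arbitrary scale
```

Because P is only defined up to scale, the estimate is rescaled before being compared with P_true.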
25. 2.3.4 Pose estimation from a 3D Plane: the relation between a 3D plane and its image projection can be represented by a homogeneous 3x3 matrix H (the homography matrix). Let us consider the plane Z = 0.
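26. 2.3.4 Pose estimation from a 3D Plane: the matrix H can be estimated from four correspondences using a DLT algorithm. With the internal parameters known, H yields the first two columns of R and the translation vector t, up to scale. The last column of R is given by the cross-product of the first two, since the columns of R must be orthonormal.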
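The recovery of R and t from a plane homography can be sketched as follows. This assumes H has already been estimated and satisfies H ~ K [r1 r2 t] for the plane Z = 0, that K is known, and that the plane origin lies in front of the camera (t_z > 0), which fixes the sign. The function name and the test values are hypothetical. The final SVD step is an extra not mentioned in the slides: it projects the result onto the nearest rotation matrix, which helps with noisy data.

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover (R, t) from a homography H ~ K [r1 r2 t] of the plane Z = 0."""
    A = np.linalg.inv(K) @ H
    # H is only defined up to scale: normalize so r1 and r2 have unit norm on average.
    A = A * (2.0 / (np.linalg.norm(A[:, 0]) + np.linalg.norm(A[:, 1])))
    if A[2, 2] < 0:                  # assumption: plane origin in front of the camera
        A = -A
    r1, r2, t = A[:, 0], A[:, 1], A[:, 2]
    r3 = np.cross(r1, r2)            # columns of R must be orthonormal
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)      # nearest rotation matrix (for noisy H)
    return U @ Vt, t

# Synthetic check: rotation of 0.3 rad about the x axis, arbitrary scale on H.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
t_true = np.array([0.1, -0.2, 4.0])
H = -2.5 * K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])
R_est, t_est = pose_from_homography(H, K)
```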
28. 2.4 Least-Squares Minimization Techniques: finding the pose that minimizes a sum of residual errors
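29. 2.4.1 Linear Least-Squares: when the function f is linear, the camera pose parameters p are the unknowns of a set of linear equations that can be written in matrix form as A p = b. The vector p can be estimated as p = A+ b, where A+ = (A^T A)^-1 A^T is the pseudo-inverse of A.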
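A small numerical illustration of the pseudo-inverse solution is given below. The matrix A and vector b are made-up values for an overdetermined system with 4 equations and 2 unknowns. In practice `np.linalg.lstsq` is usually preferred over forming the pseudo-inverse explicitly. Both are shown here only to confirm that they agree.

```python
import numpy as np

def solve_linear_least_squares(A, b):
    """Return p minimizing ||A p - b||^2 via the pseudo-inverse of A."""
    return np.linalg.pinv(A) @ b

# Hypothetical overdetermined system.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 3.1, 4.9, 7.0])

p_pinv = solve_linear_least_squares(A, b)
p_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# At the minimum the residual is orthogonal to the columns of A.
gradient = A.T @ (A @ p_pinv - b)
```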
30. 2.4.2 Newton-Based Minimization Algorithms: when the function f is not linear, the algorithms start from an initial estimate p_0 of the minimum and update it iteratively: p_(i+1) = p_i + Delta_i. The step Delta_i is chosen to minimize the residual at iteration i+1 and is estimated by approximating f to the first order around p_i.
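31. 2.4.2 Newton-Based Minimization Algorithms: J is the Jacobian matrix of f, made of the partial derivatives of all its component functions. An additional damping term, as in the Levenberg-Marquardt algorithm, stabilizes the behavior.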
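The iterative update can be sketched with a damped Gauss-Newton (Levenberg-Marquardt style) loop. This is a minimal sketch under stated assumptions, not the slides' algorithm: the Jacobian is approximated by finite differences, the damping schedule (divide or multiply by 10) is one common choice, and the toy problem only estimates a camera translation with hypothetical intrinsics (focal length 800 pixels, principal point (320, 240)) and no rotation.

```python
import numpy as np

def numerical_jacobian(residual, p, eps=1e-6):
    """Forward-difference approximation of the Jacobian of residual at p."""
    r0 = residual(p)
    J = np.zeros((r0.size, p.size))
    for k in range(p.size):
        dp = np.zeros_like(p)
        dp[k] = eps
        J[:, k] = (residual(p + dp) - r0) / eps
    return J

def levenberg_marquardt(residual, p0, n_iter=50, lam=1e-3):
    p = np.asarray(p0, dtype=float)
    cost = np.sum(residual(p) ** 2)
    for _ in range(n_iter):
        r = residual(p)
        J = numerical_jacobian(residual, p)
        # Damped normal equations: (J^T J + lam I) delta = -J^T r
        delta = np.linalg.solve(J.T @ J + lam * np.eye(p.size), -J.T @ r)
        new_cost = np.sum(residual(p + delta) ** 2)
        if new_cost < cost:          # accept the step and reduce the damping
            p, cost, lam = p + delta, new_cost, lam / 10.0
        else:                        # reject the step and increase the damping
            lam *= 10.0
        if np.linalg.norm(delta) < 1e-12:
            break
    return p

# Toy reprojection problem: recover the translation of a camera.
M = np.array([[-1.0, -1.0, 0.0], [1.0, -1.0, 0.5], [1.0, 1.0, -0.5],
              [-1.0, 1.0, 1.0], [0.0, 0.0, 0.0]])

def project(t):
    Mc = M + t                       # camera coordinates (rotation is identity)
    return np.concatenate([800.0 * Mc[:, 0] / Mc[:, 2] + 320.0,
                           800.0 * Mc[:, 1] / Mc[:, 2] + 240.0])

t_true = np.array([0.2, -0.1, 5.0])
m_obs = project(t_true)
t_est = levenberg_marquardt(lambda t: project(t) - m_obs, p0=[0.0, 0.0, 4.0])
```

The initial estimate matters: the loop refines p0 locally and is not guaranteed to find the global minimum from an arbitrary starting point.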
32. 2.5 Robust Estimation: inliers are data whose distribution can be explained by some set of model parameters; outliers are data that do not fit the model. The data can also be subject to noise. M-estimators are good at finding accurate solutions but require an initial estimate to converge correctly. RANSAC does not require such an initial estimate, but it does not take into account all the available data and lacks precision.
33. 2.5.1 M-Estimators: least-squares estimation relies on the assumption that the observations are independent and have a Gaussian distribution. Instead of minimizing sum_i r_i^2, where the r_i are residual errors, one minimizes sum_i rho(r_i), where rho is an M-estimator that reduces the influence of outliers.
36. 2.5.1 M-Estimators: the Tukey estimator is flat for large residuals, so that large residual errors have no influence at all.
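The flat tail of the Tukey estimator can be seen directly in code. The sketch below uses the standard Tukey biweight formula. The tuning constant c = 4.685 is a conventional default for residuals already normalized by the noise standard deviation and is an assumption, not a value from the slides. The small reweighting loop at the end, a robust mean on made-up data, is only meant to show the outlier being ignored.

```python
import numpy as np

def tukey_rho(r, c=4.685):
    """Tukey biweight: grows like r^2 / 2 near zero, constant for |r| > c."""
    r = np.asarray(r, dtype=float)
    inside = (c ** 2 / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    return np.where(np.abs(r) <= c, inside, c ** 2 / 6.0)

def tukey_weight(r, c=4.685):
    """Weight used in iteratively reweighted least squares; zero for |r| > c."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= c, (1.0 - (r / c) ** 2) ** 2, 0.0)

# Robust mean of made-up data with one gross outlier. As with any
# M-estimator, an initial estimate is needed: the median is used here.
data = np.array([1.0, 1.1, 0.9, 1.05, 50.0])
mu = np.median(data)
for _ in range(10):
    w = tukey_weight(data - mu)
    mu = np.sum(w * data) / np.sum(w)
```

The outlier at 50 receives zero weight, so the estimate stays close to the mean of the four inliers instead of being pulled toward the plain mean of about 10.8.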
37. 2.5.2 RANSAC: samples of data points are randomly selected, each containing the minimum number of measurements required to estimate the model parameters. For each sample, the model parameters are estimated and the subset of points consistent with the estimate is found. The largest subset is retained and refined by least-squares minimization.
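The RANSAC loop described above can be sketched on a deliberately simple model, a 2D line, where the minimal sample is two points. For pose estimation the minimal sample and the model would differ, but the structure is the same. The function names, the inlier threshold, the iteration count, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = a x + b."""
    A = np.column_stack([x, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

def ransac_line(x, y, n_iter=200, threshold=0.1, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(x), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(x), size=2, replace=False)    # minimal sample
        if x[i] == x[j]:
            continue                                         # degenerate sample
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        inliers = np.abs(y - (a * x + b)) < threshold        # consistent points
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers                           # keep the largest subset
    # Refine on the largest consistent subset by least squares.
    a, b = fit_line(x[best_inliers], y[best_inliers])
    return a, b, best_inliers

# Synthetic data: the line y = 2x + 1 with small noise and 6 gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.01, size=x.size)
y[::5] += 20.0
a_est, b_est, inlier_mask = ransac_line(x, y)
```

No initial estimate is needed, which matches the contrast with M-estimators drawn earlier. The final least-squares refinement is what recovers precision from the retained subset.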