Uncertainty analysis of 3D reconstruction from uncalibrated views

doi:10.1016/S0262-8856(99)00072-4

Image and Vision Computing

Volume 18, Issue 9, June 2000, Pages 685-696

https://doi.org/10.1016/S0262-8856(99)00072-4

Abstract

We consider reconstruction algorithms that use points tracked over a sequence of (at least three) images to estimate the positions of the cameras (motion parameters), the 3D coordinates of the points (structure parameters), and the calibration matrix of the cameras (calibration parameters). Many algorithms have been reported in the literature, and there is a need to know how well they may perform. We show how the choice of assumptions on the camera intrinsic parameters (either fixed, or with a probabilistic prior) influences the precision of the estimator. We associate a Maximum Likelihood estimator with each type of assumption and derive their covariance matrices analytically, independently of any specific implementation. We verify that the obtained covariance matrices are realistic, and compare the relative performance of each type of estimator.

Introduction

The problem of 3D reconstruction from images has drawn considerable attention. We focus on the problem of reconstruction from matched points (corners). The parameters of interest are the structure parameters, i.e. the 3D coordinates of the points; the motion parameters, which describe the positions of the cameras; and the calibration parameters, which describe the intrinsic characteristics of the sensors used. The case of known intrinsic parameters has been thoroughly studied in photogrammetry [1]. Work on uncalibrated reconstruction has progressed dramatically in recent years with the works of Hartley [2], Faugeras [3], Maybank [4] and Pollefeys et al. [5], who showed how to obtain projective, affine and, finally, euclidean reconstructions from uncalibrated views. We are interested in euclidean reconstruction. Many algorithms have been proposed, differing, e.g., in their assumptions concerning the calibration parameters and/or the motion [6]. Studies of the precision of the estimation of the “fundamental matrix” [7] and the “trifocal tensor” [8], which represent multilinear constraints that tracked 2D features must satisfy, can be found in Refs. [9], [10], [11]. A study of critical (pathological) cases for self-calibration can be found in Ref. [12], and the achievable precision in the calibrated case is addressed in Ref. [13].

In this paper, we study the precision with which 3D points, camera orientation, position and calibration are estimated. In some studies [14], [15], some intrinsic parameters are fixed to nominal values. We compare, in terms of precision, the effect of these assumptions with the precision achieved in the calibrated case; this comparison of calibrated and uncalibrated reconstruction is one contribution of the paper. Although calibrated reconstruction always performs better, our experiments show that uncalibrated reconstruction performs respectably when more than ten images are available.

Errors in the localization of image features introduce errors in the reconstruction. Some algorithms are numerically unstable, either intrinsically or in conjunction with particular configurations of points and/or cameras. However, an in-depth study of the precision of these algorithms has not been presented. The issue of the accuracy of uncalibrated reconstruction has been raised and studied repeatedly, but always in association with a particular algorithm. Our aim is to give a more general treatment of the question, while remaining as independent as possible of any particular implementation.

Most algorithms combine an “algebraic” part and an optimization part that solves for a Maximum Likelihood [2] (or related [16]) estimate. Maximum Likelihood (ML) and related estimators are often reported [16] to converge to the solution only if started close to it. It is the purpose of the “algebraic” algorithm to provide the starting position. In this paper, we study the precision of the ML-like estimator, not that of the algebraic algorithm. The true parameters are considered as random variables with a distribution that is defined from the observations. The estimator is defined by the observation model, independently of any specific algorithm; we derive its covariance matrix analytically in various cases of interest. We distinguish the cases in which only the observations are available (ML estimation) from those where some knowledge of the estimated quantities is available a priori. Among the latter, we further distinguish the case of probabilistic knowledge (maximum a posteriori estimation) from that of “exact” knowledge, where some parameters are fixed.

When estimating all parameters from only the observations, the estimation is often numerically ill posed. For example, in Ref. [14] some intrinsic parameters are highly correlated with some of the motion parameters, and the focal length is correlated to the depth (cinema uses the fact that zooming is almost indistinguishable from forward motion).
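The focal length and depth coupling mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical 1D pinhole model u = f·X/(Z − tz), which is not the paper's full camera model, and evaluates the correlation between the estimates of f and of a forward translation tz from the 2×2 information matrix JᵀJ. The function name and the point sets are illustrative only.

```python
import math

def focal_depth_correlation(points, f=1.0):
    """Correlation between the estimates of focal length f and forward
    translation tz for the toy model u = f * X / (Z - tz), evaluated at tz = 0.
    `points` is a list of (X, Z) pairs; this model is an assumption of the sketch."""
    a = b = c = 0.0
    for X, Z in points:
        du_df = X / Z             # derivative of u with respect to f
        du_dtz = f * X / Z ** 2   # derivative of u with respect to tz at tz = 0
        a += du_df * du_df
        b += du_df * du_dtz
        c += du_dtz * du_dtz
    # The covariance is proportional to the inverse of [[a, b], [b, c]];
    # its normalized off-diagonal term is -b / sqrt(a * c).
    return -b / math.sqrt(a * c)

# Shallow scene: all points at nearly the same depth.
shallow = [(-2.0, 10.0), (-1.0, 10.1), (1.0, 10.2), (2.0, 10.3)]
# Deep scene: depths spread over an order of magnitude.
deep = [(2.0, 2.0), (-5.0, 5.0), (10.0, 10.0), (-20.0, 20.0)]

print(focal_depth_correlation(shallow))
print(focal_depth_correlation(deep))
```

In this toy setting the correlation is close to −1 for the shallow scene and noticeably weaker when the depths vary, which is one way to see why zooming and forward motion are hard to separate.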

If some calibration parameters are fixed, they may be removed from the estimated vector. This simplifies the study and implementation of the estimator and, presumably, improves its numerical stability. Typical assumptions are that pixels are rectangular or square, or that the principal point coincides with the image center [5], [15]. We verify in Section 5.1 the effect on precision of fixing the intrinsic parameters, either to values obtained from a pre-calibration step or to nominal values (corresponding to square pixels and a centered principal point).
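As a hedged illustration of what removing a fixed parameter does to the predicted variance, the sketch below uses a hypothetical 2×2 information matrix (the inverse of the covariance of the full estimate). Fixing the second parameter amounts to inverting only the remaining sub-block. The matrix values and function names are assumptions made for the example, and the comparison only holds when the fixed value is exact; a wrong nominal value adds a bias that the covariance does not capture.

```python
def variance_full(info):
    """Variance of parameter 0 when both parameters are estimated:
    the (0, 0) entry of the inverse of the 2x2 information matrix."""
    (a, b), (_, c) = info
    return c / (a * c - b * b)

def variance_restricted(info):
    """Variance of parameter 0 when parameter 1 is fixed to its true value:
    the inverse of the remaining 1x1 sub-block."""
    return 1.0 / info[0][0]

# Hypothetical information matrix with strongly coupled parameters.
info = [[4.0, 3.0], [3.0, 4.0]]

print(variance_full(info))
print(variance_restricted(info))
```

The restricted variance can never exceed the full one, and the gap grows with the coupling term between the two parameters.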

Finally, the likelihood function may be modified to take into account a priori knowledge expressed probabilistically, e.g. assuming that structure or calibration follow a known distribution. One then performs maximum a posteriori (MAP) estimation. A prior on structure serves most often to retrieve precisely the intrinsic parameters, and is then called calibration from a known object.

A prior on the calibration parameters may come either from a previous calibration step, or from assuming that the camera parameters follow a “nominal” distribution, e.g. that the expected value of the principal point is the center of the image and that its standard deviation is approximately 10% of the image size.¹ This is the probabilistic counterpart of fixing the principal point to the image center, in Section 2.2. In terms of theoretical precision, priors are preferable to fixed parameters.
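A minimal scalar sketch of how such a prior enters the estimate, assuming Gaussian observation noise and a Gaussian prior on a single coordinate of the principal point. The image width (640 pixels), the observation values and the noise level are hypothetical and serve only to make the arithmetic concrete; the prior standard deviation is taken as 10% of the image width, as in the text.

```python
def map_gaussian(observations, sigma, prior_mean, prior_std):
    """MAP estimate and variance of a scalar with Gaussian noise (std sigma)
    and a Gaussian prior N(prior_mean, prior_std**2)."""
    data_info = len(observations) / sigma ** 2   # information from the data
    prior_info = 1.0 / prior_std ** 2            # information from the prior
    post_var = 1.0 / (data_info + prior_info)
    post_mean = post_var * (sum(observations) / sigma ** 2
                            + prior_mean * prior_info)
    return post_mean, post_var

# Hypothetical numbers: 640-pixel-wide image, prior centered at 320 pixels
# with a standard deviation of 10% of the width.
observations = [350.0, 330.0, 345.0]
sigma = 30.0
ml_mean = sum(observations) / len(observations)
ml_var = sigma ** 2 / len(observations)
map_mean, map_var = map_gaussian(observations, sigma, 320.0, 64.0)

print(ml_mean, ml_var)
print(map_mean, map_var)
```

The information (inverse variance) of the prior adds to that of the data, so the MAP variance is smaller than the ML variance, and the MAP estimate is pulled from the sample mean toward the prior mean.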

We derive analytically the covariance matrices corresponding to the studied cases in the sections below. The diagonal terms correspond to the variances of the individual estimated parameters. The validity of our analytical expressions is verified by comparing the theoretical and the observed behavior of a reconstruction algorithm, in Section 5.1. One important contribution of this paper lies in showing how large the variances of the considered estimators are in practice.

Section snippets

Notations

We now define the notations used throughout the paper. We consider that a set of P points has been tracked over a sequence of N images. The following notation is adopted:

• p∈{1,2,…,P} and n∈{1,2,…,N} are the indices used for numbering points and images, respectively.
• $x_{p} ∈R^{3}$ is the vector of the coordinates, in the world frame, of the pth point. Its components are indexed by i∈{1,2,3}. The symbol $X$ denotes the 3D coordinates of all the points $x_{1},…, x_{P}$.

The projection of these 3D points in the image depends

Covariance of estimators

We derive the covariance matrices of estimators for three possible cases: the “plain” maximum likelihood (ML) estimator, defined from the observations only; the maximum a posteriori (MAP) estimator, obtained when a probabilistic prior is available for a subset of the estimated parameters; and the “restricted” ML estimator, in which a subset of the estimated parameters is fixed to given values. The obtained expressions, some of which are identical to those in [19], [20], only involve Q

Specialization to the problem of reconstruction

The above formulas hold for any estimator of the considered types (ML, MAP or “restricted” ML). We now specialize them to the case of Gaussian noise, when the log-likelihood is a sum of squared differences between observations and predictions:

$D_{Θ_{i}} Q = ∑_{npk} D_{Θ_{i}} v_{npk} (v_{npk} − u_{npk})/σ^{2}$

$D_{Θ_{i}Θ_{j}}^{2} Q = \frac{1}{σ^{2}} ∑_{npk} \left[ D_{Θ_{i}} v_{npk} D_{Θ_{j}} v_{npk} + D_{Θ_{i}Θ_{j}}^{2} v_{npk} (v_{npk} − u_{npk}) \right]$

$D_{Θ_{i}u_{npk}}^{2} Q = D_{Θ_{i}} v_{npk} /σ^{2}$
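The first- and second-derivative formulas above can be checked numerically on a toy problem. The sketch below assumes Q = ∑ (v − u)²/(2σ²), which is consistent with the first-derivative formula but is not stated explicitly in this excerpt, and uses a hypothetical two-parameter prediction model v_k(Θ) = a·exp(b·t_k) in place of the projection equations. All names and data values are illustrative.

```python
import math

def q(theta, t, u, sigma):
    """Assumed cost: Q = sum_k (v_k - u_k)^2 / (2 sigma^2), with v_k = a * exp(b * t_k)."""
    a, b = theta
    return sum((a * math.exp(b * tk) - uk) ** 2
               for tk, uk in zip(t, u)) / (2.0 * sigma ** 2)

def q_derivatives(theta, t, u, sigma):
    """Gradient and Hessian of Q assembled term by term from the formulas:
    grad_i    = sum_k Dv_i (v - u) / sigma^2
    hess_{ij} = sum_k [Dv_i Dv_j + D2v_{ij} (v - u)] / sigma^2"""
    a, b = theta
    grad = [0.0, 0.0]
    hess = [[0.0, 0.0], [0.0, 0.0]]
    for tk, uk in zip(t, u):
        e = math.exp(b * tk)
        r = a * e - uk                                 # residual v - u
        dv = [e, a * tk * e]                           # first derivatives of v
        d2v = [[0.0, tk * e], [tk * e, a * tk * tk * e]]  # second derivatives of v
        for i in range(2):
            grad[i] += dv[i] * r / sigma ** 2
            for j in range(2):
                hess[i][j] += (dv[i] * dv[j] + d2v[i][j] * r) / sigma ** 2
    return grad, hess

# Hypothetical data and evaluation point.
t = [0.0, 0.5, 1.0, 1.5]
u = [1.1, 1.5, 2.9, 4.2]
theta = (1.0, 1.0)
sigma = 0.1

grad, hess = q_derivatives(theta, t, u, sigma)
print(grad)
print(hess)
```

When the residuals v − u are small, the second term of the Hessian is negligible and the Hessian reduces to the familiar sum of products of first derivatives divided by σ².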

A first practical consideration: notice that in the previous section, one may perform the expansions in Taylor series around $Θ^{∗}$ rather than around $\hat{Θ}$. One would

Experimental results

We performed various experiments (real and simulated) to study the performance of the estimators. The errors on the parameters $X$, $W$, $T$ and $K$ are studied separately. For $X$ and $K$, which are normalized so that $E(‖ x_{p} ‖^{2})=1$ and $E(‖ K ‖^{2})≃1$, the error measures are the standard deviations of $‖ x_{p} − x_{p}^{∗} ‖$ and $‖ K − K^{∗} ‖$. For $T$, the standard deviation of $‖ t − t^{∗} ‖$ is used as well. We saw in Section 2.5 that $T$ is expressed in the same unit as $X$. For $W$, the measure is the standard deviation of the angle formed between

Conclusions

We have analyzed the problem of the precision that is achievable in 3D reconstruction from uncalibrated views. Although a lot of work has been carried out on various forms of reconstruction, the problem of precision evaluation is seldom addressed in a systematic way.

We have formulated the problem in a probabilistic framework. We further considered that various types of prior information may be available and defined the corresponding estimators.

One contribution of this work is the analytical

Acknowledgements

This work has been supported by projects INCO COPERNICUS Proj. 960174-VIRTUOUS and PRAXIS 2/2.1/TPAR/2074/95.

References (21)

P.H.S. Torr et al., Robust detection of degenerate configurations for the fundamental matrix, Computer Vision and Image Understanding (1998)
P.R. Wolf, Elements of photogrammetry, with air photo interpretation and remote sensing (1983)
R.I. Hartley, Euclidean reconstruction from uncalibrated views, in: Proc. 2nd Europe-U.S. Workshop on Invariance (1993)
O.D. Faugeras, What can be seen in three dimensions with an uncalibrated stereo rig?
S.J. Maybank et al., Theory of self-calibration of a moving camera, International Journal of Computer Vision (1992)
M. Pollefeys et al., Self-calibration and metric reconstruction in spite of varying and unknown internal camera parameters, Proceedings of the Sixth ICCV (1998)
A.W. Fitzgibbon et al., Automatic 3D model construction for turn-table sequences
G. Xu et al., Epipolar geometry in Stereo, Motion and Object Recognition, A Unified Approach (1996)
A. Shashua, Algebraic functions for recognition, IEEE Transactions on PAMI (1994)
G. Csurka et al., Characterizing the uncertainty of the fundamental matrix, Computer Vision and Image Understanding (1997)
