Generic self-calibration of central cameras

https://doi.org/10.1016/j.cviu.2009.07.007

Abstract

We consider the self-calibration problem for a generic imaging model that assigns projection rays to pixels without a parametric mapping. We consider the central variant of this model, which encompasses all camera models with a single effective viewpoint. Self-calibration refers to calibrating a camera's projection rays purely from matches between images, i.e. without knowledge of the scene, such as that provided by a calibration grid. To do so we consider specific camera motions, concretely pure translations and pure rotations, although without knowledge of the motion parameters (rotation angles, axes of rotation, translation vectors). Knowledge of the type of motion, together with image matches, gives geometric constraints on the projection rays. We show, for example, that with translational motions alone, self-calibration can already be performed, but only up to an affine transformation of the set of projection rays. We then propose algorithms for full metric self-calibration that use rotational and translational motions, or just rotational motions.

Introduction

Many different types of cameras have been used in computer vision. Existing calibration and self-calibration procedures are often tailor-made for specific camera models, mostly for pinhole cameras (possibly including radial or decentering distortion), fisheyes, specific types of catadioptric cameras, etc.; see examples in [2], [3], [10], [6], [11], [12].

A few works have proposed calibration methods for a highly generic camera model that encompasses the above-mentioned models and others [7], [4], [8], [19], [18]: a camera acquires images consisting of pixels; each pixel captures light that travels along a projection ray in 3D. Projection rays may in principle be positioned arbitrarily, i.e. no functional relationship between projection rays and pixels, governed by a few intrinsic parameters, is assumed. Calibration is thus described by:

  • the coordinates of these rays (given in some local coordinate frame).

  • the mapping between rays and pixels; this is basically a simple indexing (a minimal sketch of this representation follows the list).
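To make this concrete, here is a minimal sketch of the generic central model as a data structure (the class and names are our illustration, not from the paper): calibration is exactly a table of ray directions plus an index from pixels to rays.

```python
import numpy as np

class GenericCentralCamera:
    """Illustrative container for a generic central calibration:
    a table of unit ray directions plus pixel -> ray indexing."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        # rays[i] is the unit 3-vector D_i of the projection ray of
        # pixel i, given in some local coordinate frame.
        self.rays = np.zeros((width * height, 3))

    def ray(self, u, v):
        # The pixel -> ray mapping is a simple (row-major) indexing.
        return self.rays[v * self.width + u]
```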

One motivation of the cited works is to provide flexible calibration methods that should work for many different camera types. The proposed methods rely on the use of a calibration grid, and some of them on equipment for carrying out precisely known motions.

The work presented in this paper aims at further flexibility by addressing the problem of self-calibration for the above generic camera model. The fundamental questions are: can one calibrate the generic imaging model without any information other than image correspondences, and how? This work takes a step in this direction, presenting principles and methods for self-calibration using specific camera motions. Concretely, we consider how pure rotations and pure translations may enable self-calibration.

Further, we consider the central variant of the imaging model, i.e. we assume the existence of an optical center through which all projection rays pass. Besides this assumption, projection rays are unconstrained, although we do need some continuity (neighboring pixels should have “neighboring” projection rays) in order to match images.

The self-calibration problem has been addressed for a slightly more restricted model in [20], [21], [15]. Tardif et al. [20], [21] introduced a generic radially symmetric model in which images are modeled using a unique distortion center and concentric distortion circles centered on this point. Every distortion circle around the distortion center is mapped to a cone of rays. In [15] the self-calibration problem is transformed into a factorization requiring only a singular value decomposition of a matrix composed of dense image matches. Thirthala and Pollefeys [22] proposed a linear solution for recovering radial distortion which can also handle non-central cameras. Here, pixels on any line passing through the distortion center are mapped to coplanar rays.

This paper is an extended version of [16]. In addition to the methods proposed in [16], we study the self-calibration problem for two new scenarios. The first is to obtain a metric self-calibration from two pure rotations. The second is the possibility of obtaining self-calibration up to an unknown focal length using one rotation and one translation. The same self-calibration problem has been studied independently in [14], [9], [5], where an algebraic approach is used for a differentiable imaging model and infinitesimal camera motions. In contrast to these works, we use a discrete imaging model and consider finite motions.

In this work we focus on restricted motions, namely pure translations and pure rotations. We compute dense matches over space and time, i.e. we assume that for any pixel p, we have determined all pixels that match p at some stage during the rotational or translational motion. We call such a complete set of matching pixels a flowcurve. Flowcurves provide geometric constraints on the projection rays. For example, a flowcurve in the case of a pure translation corresponds to a set of pixels whose projection rays are coplanar. In the case of a pure rotation, the corresponding projection rays lie on a cone. These coplanarity and “co-cone” constraints are the basis of the self-calibration algorithms proposed in this paper.
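The two constraints can be illustrated numerically. The following sketch uses synthetic data (all values are illustrative): it stacks the ray directions of one hypothetical translation flowcurve and confirms they span a plane (rank 2), then checks that the rays of a rotation flowcurve make a constant angle with the rotation axis.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Pure translation: rays of one flowcurve are coplanar. ---
# The plane contains the optical center and the translation direction.
t_dir = np.array([1.0, 0.0, 0.0])            # translation direction (illustrative)
n = np.cross(t_dir, [0.0, 0.3, 1.0])          # normal of one such plane
n /= np.linalg.norm(n)
basis = np.stack([t_dir, np.cross(n, t_dir)])
rays_t = rng.normal(size=(20, 2)) @ basis     # 20 rays spanning the plane
rays_t /= np.linalg.norm(rays_t, axis=1, keepdims=True)
# A stack of coplanar directions has rank 2 (third singular value ~ 0).
print(np.linalg.svd(rays_t, compute_uv=False))

# --- Pure rotation: rays of one flowcurve lie on a cone. ---
axis = np.array([0.0, 0.0, 1.0])              # rotation axis (illustrative)
half_angle = np.deg2rad(25.0)
phis = rng.uniform(0, 2 * np.pi, 20)
rays_r = np.stack([np.sin(half_angle) * np.cos(phis),
                   np.sin(half_angle) * np.sin(phis),
                   np.full_like(phis, np.cos(half_angle))], axis=1)
# "Co-cone" constraint: every ray makes the same angle with the axis.
print(np.rad2deg(np.arccos(rays_r @ axis)))   # all ~25 degrees
```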

We formulate the generic self-calibration problem for central cameras in Section 2. In Section 3 we describe the geometric constraints that can be obtained from pure translations and pure rotations. In Section 4 we show that with translational motions alone, self-calibration can already be performed, but only up to an affine transformation of the set of projection rays. Our main contribution is given in Section 5, where we show different self-calibration approaches using combinations of pure rotations and pure translations. Finally, in Section 6 we show results for fisheye images using a self-calibration method that uses two rotations and one translation.

Section snippets

Problem formulation

We want to calibrate a central camera with n pixels. To do so, we have to recover the directions of the associated projection rays in some common coordinate frame. Rays need only be recovered up to a Euclidean transformation, i.e. ray directions need only be computed up to a rotation. Let us denote by D_i the 3-vector describing the direction of the ray associated with the i-th pixel.

The inputs for computing ray directions are pixel correspondences between images and the knowledge that the motion

Constraints from specific camera motions

In this section, we explain constraints on the self-calibration of projection ray directions that are obtained from flowcurves due to specific camera motions: one translational or one rotational motion.

Multiple translational motions

In this section, we explain that multiple translational motions allow one to recover camera calibration up to an affine transformation. First, it is easy to see that no more than an affine “reconstruction” of projection rays is possible here. Let us consider one valid solution for all ray directions D_i, i.e. ray directions that satisfy all collinearity constraints associated with t-curves (cf. Section 3.1). Let us transform all ray directions by an affine transformation of 3-space
$$\begin{pmatrix} A & b \\ 0^T & 1 \end{pmatrix}$$
i.e. we
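The affine ambiguity can be checked directly: applying the linear part A of such a transformation to coplanar ray directions leaves them coplanar (the translation part b does not act on directions through the center), so all constraints from translation flowcurves remain satisfied. A minimal numerical sketch with synthetic, illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Coplanar ray directions (one translation flowcurve), as before.
basis = rng.normal(size=(2, 3))
rays = rng.normal(size=(30, 2)) @ basis       # rank-2 stack of directions

# Transform all directions by the linear part A of an affine map of
# 3-space; the translation part b leaves directions unchanged.
A = rng.normal(size=(3, 3))                   # any invertible 3x3 matrix
rays_affine = rays @ A.T

# Coplanarity is preserved: the transformed stack is still rank 2, so
# image matches from translations cannot distinguish the two solutions;
# self-calibration from translations alone is only possible up to A.
print(np.linalg.svd(rays, compute_uv=False))
print(np.linalg.svd(rays_affine, compute_uv=False))
```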

Self-calibration algorithms

We put together the constraints derived in Section 3 in order to propose self-calibration algorithms for different scenarios requiring rotational and translational motions. First, we show that with one rotational and one translational motion, full self-calibration is possible up to a single degree of freedom. This degree of freedom is equivalent to an unknown focal length in the case of a perspective camera. Second, it is shown how to remove this ambiguity using an additional rotational motion.
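To illustrate why rotational flowcurves are so constraining, consider a pixel lying on one flowcurve of each of two rotations whose axes and cone half-angles have been recovered: its ray must lie on two known cones through the center, which determines it up to a two-fold choice. The sketch below is our own illustration of this geometric step, not the paper's algorithm; all names and values are hypothetical.

```python
import numpy as np

def ray_from_two_cones(a1, c1, a2, c2):
    """Intersect two cones through the origin: a_i . d = c_i, |d| = 1.
    a1, a2 are unit axis vectors; c_i = cos(half-angle of cone i).
    Returns the (up to two) unit solutions."""
    # Solve the two linear equations on the plane spanned by a1, a2,
    # then fix the out-of-plane component via the unit-norm constraint.
    G = np.array([[a1 @ a1, a1 @ a2], [a1 @ a2, a2 @ a2]])
    lam = np.linalg.solve(G, [c1, c2])
    d0 = lam[0] * a1 + lam[1] * a2            # in-plane part
    w = np.cross(a1, a2)                       # out-of-plane direction
    s2 = (1.0 - d0 @ d0) / (w @ w)
    if s2 < 0:
        return []                              # cones do not intersect
    s = np.sqrt(s2)
    return [d0 + s * w, d0 - s * w]

# Tiny check with a known ray:
d_true = np.array([0.2, 0.3, 0.933]); d_true /= np.linalg.norm(d_true)
a1 = np.array([0.0, 0.0, 1.0])
a2 = np.array([0.0, 1.0, 0.0])
print(ray_from_two_cones(a1, a1 @ d_true, a2, a2 @ d_true))
```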

Experiments

We tested the algorithm of Section 5.2 using simulated and real cameras. For real cameras, ground truth is difficult to obtain, so we visualize the self-calibration result by performing perspective distortion correction.
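For reference, perspective distortion correction from a generic calibration can be sketched as follows: synthesize a virtual pinhole image by finding, for each pinhole ray, the source pixel whose calibrated ray is closest in direction. This is a minimal sketch assuming numpy and scipy and a simple nearest-neighbour resampling; the paper does not specify its implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def undistort_to_perspective(img, rays, f=400.0, out_size=(480, 640)):
    """Resample a source image (h x w x 3) onto a virtual pinhole
    camera, given the self-calibrated unit ray direction rays[i] of
    each source pixel i (rays has shape (h*w) x 3)."""
    H, W = out_size
    # Unit rays of the virtual pinhole camera, one per output pixel.
    u, v = np.meshgrid(np.arange(W) - W / 2, np.arange(H) - H / 2)
    target = np.stack([u, v, np.full_like(u, f)], axis=-1)
    target /= np.linalg.norm(target, axis=-1, keepdims=True)
    # For each target ray, copy the colour of the source pixel whose
    # calibrated ray is the nearest neighbour in direction space.
    _, idx = cKDTree(rays).query(target.reshape(-1, 3))
    return img.reshape(-1, img.shape[-1])[idx].reshape(H, W, -1)
```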

Discussion

We have studied the generic self-calibration problem for central cameras using different combinations of pure translations and pure rotations. Our experimental results are promising and show that self-calibration may indeed be feasible in practice.

We may summarize minimal required conditions for self-calibration. First, two rotational motions have been shown to be sufficient for full self-calibration. Note that even for self-calibration of the pinhole model from pure rotations, two rotations

References (22)

  • J. Salvi et al., Pattern codification strategies in structured light systems, Pattern Recognition, 2004.
  • OpenCV (Open Source Computer Vision Library), Intel...
  • J.P. Barreto et al., Paracatadioptric camera calibration using lines, International Conference on Computer Vision (ICCV), 2003.
  • D.C. Brown, Close-range camera calibration, Photogrammetric Engineering, 1971.
  • G. Champleboux et al., Accurate calibration of cameras and range imaging sensors: the NPBS method, ICRA, 1992.
  • F. Espuny, A closed-form solution for the generic self-calibration of central cameras from two rotational flows, VISAPP, 2007.
  • C. Geyer et al., Paracatadioptric camera calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2002.
  • K.D. Gremban et al., Geometric camera calibration using systems of linear equations, ICRA, 1988.
  • M.D. Grossberg et al., A general imaging model and a method for finding its parameters, International Conference on Computer Vision (ICCV), 2001.
  • E. Grossmann et al., Are two rotational flows sufficient to calibrate a smooth non-parametric sensor?, CVPR, 2006.
  • R.I. Hartley et al., Multiple View Geometry in Computer Vision, 2000.