Spherical approximation for multiple cameras in motion estimation: Its applicability and advantages

https://doi.org/10.1016/j.cviu.2010.07.005

Abstract

Estimating the motion of a multi-camera system whose cameras may not have overlapping fields of view is generally complex and computationally expensive because of the non-zero offsets between the camera centers. It is conceivable that if we can assume that the multiple cameras share a single optical center, and thus can be modeled as a spherical imaging system, motion estimation and calibration of this system would become simpler and more efficient.

In this paper, we derive, both analytically and empirically, the conditions under which a multi-camera system can be modeled as a single spherical camera. Analyses and experiments using simulated and real images show that the spherical approximation is applicable to a much larger extent than is commonly expected. Moreover, we show that, when applicable, this approximation even improves the accuracy and stability of the estimated motion over the exact algorithm.

Introduction

Assume that vehicles such as cars and small unmanned aerial vehicles (UAVs) flying in urban environments carry multiple cameras whose fields of view (FOV) may not overlap. Two major approaches currently exist for estimating the ego-motion of such a vehicle from a multi-camera system. One approach uses the linear 17-point algorithm based on the generalized camera model [1], [2]. With this algorithm, all six degrees of freedom, including the translation scale, are recovered linearly. The other approach assumes that the cameras share a single optical center, which makes it possible to approximate the imaging system as a spherical camera; motion estimation algorithms for a spherical camera recover the camera's motion only up to scale. It is generally expected that the spherical approximation, while yielding a simpler and more efficient algorithm, introduces systematic errors into the estimated motion compared with the generalized 17-point algorithm.
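To make the contrast concrete, the 17-point algorithm rests on the generalized epipolar constraint, which couples the essential part E = [t]×R and the rotation R through the Plücker coordinates of the observed rays [1], [2]. The following numpy sketch shows only the linear core of such a solver; it is our illustration of the formulation in [2], not the authors' implementation, and a practical version would add normalization, projection of R onto SO(3), and robust estimation.

```python
import numpy as np

def pluecker(q, c):
    """Plücker coordinates (direction, moment) of the ray through center c
    with unit direction q."""
    q = q / np.linalg.norm(q)
    return q, np.cross(c, q)

def solve_17pt(rays1, rays2):
    """Linear solve of the generalized epipolar constraint
    q2^T E q1 + q2^T R m1 + m2^T R q1 = 0 for the 18 unknowns
    vec(E), vec(R), up to a common scale (>= 17 correspondences)."""
    A = []
    for (q1, m1), (q2, m2) in zip(rays1, rays2):
        coeff_E = np.outer(q2, q1).ravel()                      # vec(E) terms
        coeff_R = (np.outer(q2, m1) + np.outer(m2, q1)).ravel() # vec(R) terms
        A.append(np.concatenate([coeff_E, coeff_R]))
    _, _, Vt = np.linalg.svd(np.asarray(A))
    x = Vt[-1]                      # null vector of the stacked constraints
    E, R = x[:9].reshape(3, 3), x[9:].reshape(3, 3)
    return E, R                     # R still needs projection onto SO(3)
```

Note that the moments m = c × q vanish when all camera centers coincide, so the constraint degenerates to the ordinary epipolar constraint; this is exactly the regime the spherical approximation exploits.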

To test this understanding, we applied both methods to real image sequences taken by a set of cameras, as shown in Fig. 1. The three cameras face forward, left, and right, and the distance between the camera centers is about 100 mm. Fig. 2 shows examples of 320 × 240 synchronized images captured by the camera set. We manually chose correspondences in the sequences to remove outliers.

The generalized 17-point algorithm gave the ego-motion estimation results shown in Fig. 3a. This exact algorithm was unable to obtain accurate results for either rotation or translation. On the other hand, the spherical approximation method, which assumes all the cameras share a single optical center and can thus be viewed as an aggressive approximation, provided the more accurate results shown in Fig. 3b. This method cannot determine the absolute scale of the estimated motion, but it obtains more accurate and stable estimates of both rotation and translation direction.

We wondered why the exact algorithm was outperformed by the aggressive approximation. Because the poorer performance might be attributed to inaccurate camera calibration or a poor selection of features, we conducted another experiment using simulated data, which is free of these potential inaccuracies. To make the experiment as similar to the real situation as possible, we replicated the path and scene points obtained from the real experiment. With noiseless feature data, the generalized 17-point algorithm produced the results shown in Fig. 4a, which reproduce the given trajectory exactly. However, as Fig. 4b shows, when 1-pixel Gaussian tracking errors were added to the features, the estimated trajectory became inaccurate.

We also tested the spherical approximation method using the same feature data. With the noiseless data, the aggressive approximation method provided the results shown in Fig. 5a, which suffer from systematic errors caused by the approximation. With the addition of 1-pixel Gaussian feature tracking error, we obtained the results shown in Fig. 5b, which are more accurate than those obtained by the generalized 17-point algorithm in Fig. 4b. This suggests that feature tracking error affects the approximation algorithm much less than it affects the exact algorithm in this simulation. If the scene is very distant compared with the distance between the camera centers, the exact algorithm can be less robust than the approximation algorithm, even though the latter suffers from approximation-induced errors.

These results can be explained by the principles involved in depth estimation with a stereo system. In stereo, it is not possible to obtain accurate depth estimates of very distant points: if the system's baseline is too short compared with the distances from the system to the scene points, the cameras appear to share an identical projection center for those points. This principle carries over to a multi-camera system whose centers do not coincide, and it explains why the motions estimated by the generalized 17-point algorithm, especially the traveled distances, are so different from the real ones. If the scene points are too far away, the distance between the camera centers becomes negligible; in this situation, trying to estimate the scale, which cannot be done accurately, makes the entire motion estimation inaccurate and unstable.
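A back-of-the-envelope check makes this concrete. Assuming a focal length of roughly 300 pixels for the 320 × 240 images (an assumed value, not one reported above) and the 100 mm baseline of our rig, the disparity induced by the inter-camera offset is f·B/Z:

```python
# Disparity induced by the 100 mm inter-camera offset at various scene depths.
# f_px = 300 is an assumed focal length for 320x240 images, not a paper value.
f_px, baseline_m = 300.0, 0.1

for depth_m in (1, 3, 10, 30, 100):
    disparity_px = f_px * baseline_m / depth_m
    print(f"depth {depth_m:4d} m -> offset-induced disparity {disparity_px:5.2f} px")
```

At 30 m the offset projects to about 1 px, the same order as the feature tracking noise, so the information that carries the absolute scale is effectively lost.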

These phenomena also illustrate that a simpler and more restricted model (the spherical camera model) can be more stable than a more general model (the generalized camera model). Still, to the best of our knowledge, no systematic analysis has been done to date to understand which algorithm best fits the problem of ego-motion estimation with multiple cameras.

In this paper, we show, both theoretically and empirically, the conditions under which the simpler spherical assumption is applicable. We analyze performance with respect to various factors, such as feature tracking error, distance to scene points, calibration error, camera configuration, and image resolution. Surprisingly, the range in which the spherical approximation outperforms the exact 17-point algorithm is larger than commonly expected. In addition, we show that the spherical approximation makes camera calibration simpler than previous methods do.

Section snippets

Related works

One of the fundamental difficulties in single-camera structure from motion (SFM) is translation–rotation ambiguity. The apparent motions, or optical flows, between two frames of, for example, a small sideways translation and a small panning rotation are hard to distinguish. Similarly, it is hard to accurately estimate these two motion components from observed optical flows. Baker et al. [3] showed that this difficulty in decoupling the two components is mainly a result of the limited FOV.
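A small numerical illustration of the ambiguity (the values are our assumptions, not from [3]): for a narrow-FOV camera viewing a roughly fronto-parallel scene at depth Z, the horizontal flow of a sideways translation t_x is u ≈ −f·t_x/Z, while that of a pan ω_y is u ≈ −ω_y(f + x²/f) ≈ −f·ω_y, so suitably matched motions are nearly indistinguishable:

```python
import numpy as np

# Flow of a small sideways translation vs. a small pan for a narrow-FOV camera.
# All values below are illustrative assumptions.
f = 800.0                     # focal length [px]
Z = 20.0                      # assumed constant scene depth [m]
t_x, w_y = 0.02, 0.001        # translation [m/frame] and pan [rad/frame]

x = np.linspace(-100.0, 100.0, 21)        # narrow FOV: |x| << f
u_trans = -f * t_x / Z * np.ones_like(x)  # translational flow: -0.8 px everywhere
u_rot = -w_y * (f + x**2 / f)             # rotational flow: -0.80 to -0.81 px
print(f"max |u_trans - u_rot| = {np.abs(u_trans - u_rot).max():.4f} px")
```

With these numbers both flow fields are about 0.8 px and differ by at most roughly 0.013 px across the image, well below typical tracking noise.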

Multi-camera system as a spherical camera

Fig. 6 shows one of our multi-camera systems, consisting of three lightweight cameras placed in three orthogonal directions with non-overlapping fields of view. The distance between the camera centers is approximately 100 mm.

We will treat this set of cameras as a single spherical camera, or equivalently, we will assume that the three cameras share a common projection center. Applying the spherical approximation to non-spherical cameras induces positional errors when image points are mapped onto the spherical image surface.
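The size of this induced error can be estimated directly: pretending a camera whose center is offset by c sits at the rig origin perturbs the bearing of a point at depth Z by roughly |c|/Z radians. A minimal numpy sketch with the 100 mm offset of our rig (the depths and the viewing direction are assumed values):

```python
import numpy as np

def bearing(point, center):
    """Unit bearing vector from a projection center to a 3-D point."""
    v = point - center
    return v / np.linalg.norm(v)

c = np.array([0.1, 0.0, 0.0])              # camera center offset [m] (100 mm)
direction = np.array([0.3, 0.2, 1.0])
direction /= np.linalg.norm(direction)     # assumed viewing direction

for depth in (1.0, 5.0, 20.0, 100.0):
    p = depth * direction                  # scene point at the given depth
    cos_err = np.clip(bearing(p, c) @ bearing(p, np.zeros(3)), -1.0, 1.0)
    print(f"depth {depth:6.1f} m -> bearing error "
          f"{np.degrees(np.arccos(cos_err)):.3f} deg")
```

The error decays as roughly |c|/Z: it is already below about 0.3° at 20 m, which corresponds to only a pixel or so for a typical narrow-FOV camera.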

Motion estimation and calibration method under spherical camera approximation

By assuming that all the cameras share a single projection center, any motion estimation or SFM algorithm for a single-viewpoint camera can be used. We follow a conventional SFM pipeline: motion estimation between two frames, integration of the motions, and optional bundle adjustment [12].
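For the two-frame step, once all rays are referred to the common center, standard epipolar machinery applies directly to the unit bearing vectors, and the translation is recovered only up to scale. The sketch below uses a linear eight-point solve as one concrete instance (our illustration; the solver actually used in the pipeline may differ):

```python
import numpy as np

def essential_from_bearings(b1, b2):
    """Linear estimate of the essential matrix from unit bearing vectors
    (N x 3 arrays, N >= 8) satisfying b2_i^T E b1_i = 0."""
    A = np.stack([np.outer(q2, q1).ravel() for q1, q2 in zip(b1, b2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    U, _, Vt2 = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt2  # project onto essential manifold

def decompose_essential(E):
    """Four (R, t) candidates; a cheirality test selects the valid one.
    t is a unit vector: the absolute scale is not observable here."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0: U = -U
    if np.linalg.det(Vt) < 0: Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    t = U[:, 2]
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]
```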

Experiments

We conducted a series of experiments to analyze the performance of the spherical approximation from various aspects, covering both the motion estimation and the online calibration method.

Conclusion

One of the fundamental difficulties of single-camera SFM is translation–rotation ambiguity due to the limited FOV. To resolve this problem, it is beneficial to use multiple cameras to obtain a larger FOV as a whole. However, the discrepancy between the camera centers makes motion estimation nonlinear and challenging. Though a linear algorithm using 17 point correspondences has been proposed, it is not sufficiently accurate or stable in outdoor applications. It is conceivable that if we can assume that the cameras share a single optical center, and thus can be modeled as a spherical imaging system, motion estimation and calibration become simpler and more efficient.

References (22)

  • Y. Chen et al., Three-dimensional ego-motion estimation from motion fields observed with multiple cameras, Pattern Recognition (2001).
  • R. Pless, Using many cameras as one, in: IEEE Computer Vision and Pattern Recognition (CVPR), vol. II, 2003, pp. …
  • H. Li, R. Hartley, J. Kim, A linear approach to motion estimation using generalized camera models, in: IEEE Computer …
  • P. Baker, C. Fermuller, Y. Aloimonos, R. Pless, A spherical eye from multiple cameras (makes better models of the …
  • J.-P. Tardif, Y. Pavlidis, K. Daniilidis, Monocular visual odometry in urban environments using an omnidirectional …
  • M.D. Grossberg, S.K. Nayar, A general imaging model and a method for finding its parameters, in: International …
  • W. Chang, C. Chen, Pose estimation for multiple camera systems, in: International Conference on Pattern Recognition, …
  • C. Chen et al., On pose recovery for generalized visual sensors, IEEE Transactions on Pattern Analysis and Machine Intelligence (2004).
  • J. Frahm, K. Koser, R. Koch, Pose estimation for multi-camera systems, in: 26th Symposium of the German Association for …
  • M. Kaess, F. Dellaert, Visual SLAM with a multi-camera rig, Georgia Institute of Technology, Tech. Rep. GIT-GVU-06-06, …
  • J.-H. Kim et al., Motion estimation for non-overlapping multi-camera rigs: linear algebraic and L∞ geometric solutions, IEEE Transactions on Pattern Analysis and Machine Intelligence (2009).