Editor's choice article
Multiple consumer-grade depth camera registration using everyday objects

https://doi.org/10.1016/j.imavis.2017.03.005

Highlights

  • Utilizing pre-scanned shapes for joint calibration of multiple static depth cameras

  • Joint extrinsic calibration and depth distortion correction

  • A dataset for comparing different methods for multi-depth camera registration

Abstract

The registration of multiple consumer-grade depth sensors is a challenging task due to noisy and systematic distortions in depth measurements. Most existing works rely heavily on a large number of checkerboard observations for calibration and registration of multiple depth cameras, which is tedious and inflexible. In this paper, we propose a more practical method for conducting and maintaining registration of multiple depth sensors, by replacing checkerboards with everyday objects found in the scene, such as regular furniture. In particular, high quality pre-scanned 3D shapes of standard furniture are used as calibration targets. We propose a unified framework that jointly computes the optimal extrinsic calibration and depth correction parameters. Experimental results show that our proposed method significantly outperforms state-of-the-art depth camera registration methods.

Introduction

Consumer-grade depth cameras have become widely used as a low-cost means for real-time 3D sensing. However, such depth cameras have two major limitations: they have a restricted field of view, and they recover only a depth field rather than true 3D data. One approach to overcoming these problems is to move the camera around [1] under the assumption of a static scene; but if real-time wide-coverage 3D data is needed, as in 3D telepresence [2] and real-time novel view synthesis [3], then multiple depth cameras are required. The challenge then becomes that of registering and calibrating multiple depth cameras, given the known problems that such cameras exhibit systematic distortions in depth measurements and are in general very noisy.

Extrinsic calibration may be conducted on a multi-camera rig, in order to determine the relative poses of the depth cameras with respect to a common reference frame. One approach is to fall back on classical image-based (i.e. optical) calibration techniques utilizing a checkerboard and color or raw IR imagery. Doing so would effectively calibrate the cameras through passive stereo triangulation, while completely ignoring the depth sensing mechanisms. However, 3D data subsequently obtained from the cameras will register poorly as the systematic depth distortions present are not accounted for. An alternative approach is to compute the rigid transform that will geometrically align 3D data separately sensed in the multiple cameras. Although doing so does not correct for the depth distortions, this optimization procedure will typically distribute 3D registration errors more evenly.
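The second approach above, computing the rigid transform that geometrically aligns two point sets, has a well-known closed-form least-squares solution (Kabsch alignment). A minimal numpy sketch, with function name and test data purely illustrative rather than taken from the paper:

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form rigid transform (R, t) minimising ||(P @ R.T + t) - Q||.

    P, Q: (N, 3) arrays of corresponding 3D points from two cameras.
    """
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

# Illustrative check: recover a known rotation about z plus a translation.
rng = np.random.default_rng(1)
P = rng.uniform(-1, 1, size=(50, 3))
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
Q = P @ R_true.T + t_true
R_est, t_est = kabsch(P, Q)
```

Note that, as the text says, such a rigid fit distributes residual error evenly but cannot remove non-rigid depth distortion.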

While depth distortion correction methods have been proposed for a single depth camera [2], [4], [5], [6], [7], combining such methods with extrinsic calibration in a sequential or piecemeal manner remains unsatisfactory and still leads to significant misalignment, as the depth correction for individual devices tends to be sensitive to errors. Instead, we propose a unified framework in which extrinsic calibration and depth distortion correction are determined together.
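To illustrate the idea of joint estimation on a toy problem (this is not the paper's actual formulation), the sketch below recovers a camera pose and a simple linear depth model z' = a·z + b together in one least-squares problem; all names and the distortion model are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)

# Hypothetical reference: points sampled from a pre-scanned shape (world frame).
ref = rng.uniform(-1.0, 1.0, size=(200, 3)) + np.array([0.0, 0.0, 3.0])

# Ground truth to recover: camera pose (world -> camera) and a linear
# depth distortion z' = a*z + b (a toy stand-in for real distortion models).
R_true = Rotation.from_rotvec([0.05, -0.10, 0.02])
t_true = np.array([0.2, -0.1, 0.5])
a_true, b_true = 1.03, -0.04

p_cam = R_true.apply(ref) + t_true                   # true camera-frame points
z = p_cam[:, 2]
obs = p_cam * ((a_true * z + b_true) / z)[:, None]   # x, y scale with depth

def residuals(theta):
    """Joint residual over pose (rotvec, t) and depth model (a, b)."""
    rvec, t, a, b = theta[:3], theta[3:6], theta[6], theta[7]
    z_obs = obs[:, 2]
    z_corr = (z_obs - b) / a                  # undo the depth model
    p_hat = obs * (z_corr / z_obs)[:, None]   # corrected camera-frame points
    pred = Rotation.from_rotvec(rvec).apply(ref) + t
    return (pred - p_hat).ravel()

theta0 = np.concatenate([np.zeros(6), [1.0, 0.0]])   # identity pose, no distortion
sol = least_squares(residuals, theta0)
a_est, b_est = sol.x[6], sol.x[7]
```

Optimizing the pose alone would force the rigid transform to absorb part of the depth error; solving both sets of parameters in one objective avoids that coupling.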

For short experiments, an initial one-off calibration may be satisfactory. However, if accurate registration must be retained for an extended period of time, then in practice recalibration has to be conducted periodically, due to mechanical displacement of the cameras (whether intentional, accidental, or caused by environmental temperature variations) or simply sensor drift. This can pose a major practical problem if the calibration process is tedious, for instance requiring tens or even hundreds of checkerboard observations for each recalibration. We thus seek a more practical method of maintaining registration, by replacing checkerboards with everyday objects found in the scene, such as regular furniture.

In this work, we propose a new approach for registering depth maps obtained from multiple fixed depth cameras using a non-rigid alignment strategy. High quality pre-scanned 3D shapes of standard furniture are used as calibration targets, rather than checkerboards; these scans may be obtained offline using a moving depth camera together with a structure from motion algorithm such as KinectFusion [1]. The key idea of the proposed approach is to utilize the high quality geometry of the pre-scanned shape as both a reference and a conduit for registering the dynamic 3D data obtained from the multiple static depth cameras.

The contributions of this paper are as follows:

  • To the best of our knowledge, this is the first work to primarily utilize pre-scanned shapes for joint calibration of multiple static depth cameras;

  • A unified framework is presented that jointly computes the optimal extrinsic calibration and depth correction parameters; and

  • A dataset is constructed for comparing different methods for multi-depth camera registration.

Section snippets

Related work

Calibration of conventional cameras is a well-established task in computer vision [8], [9], where Zhang's technique [10] is probably the most widely used, having been implemented in Bouguet's MATLAB toolbox [11] and the OpenCV library [12]. Zhang's approach [10] makes use of a planar checkerboard captured in multiple poses to establish the correspondences needed to estimate intrinsic and extrinsic parameters.
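Zhang's method begins by estimating a plane-to-image homography for each checkerboard view from the corner correspondences; the intrinsics are then solved from constraints on those homographies. A minimal direct-linear-transform (DLT) sketch of the homography step (illustrative only, not the toolbox implementation):

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: H such that dst ~ H @ [x, y, 1] (up to scale).

    src, dst: (N, 2) arrays of corresponding plane/image points, N >= 4,
    not all collinear.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)               # null vector = stacked H

# Illustrative check against a known homography.
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.3], [0.2, 0.8]], float)
H_true = np.array([[1.2, 0.1, 3.0], [0.05, 0.9, -1.0], [0.001, 0.002, 1.0]])
proj = np.c_[src, np.ones(len(src))] @ H_true.T
dst = proj[:, :2] / proj[:, 2:]
H_est = homography_dlt(src, dst)
H_est = H_est / H_est[2, 2]                   # fix the scale ambiguity
```

In practice OpenCV's `cv2.findChessboardCorners` and `cv2.calibrateCamera` wrap this pipeline, including the nonlinear refinement that the DLT sketch omits.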

Proposed method

In this section, we describe the details of the proposed method. Fig. 1 shows the steps in our proposed framework. In particular, during the pre-processing stage, we generate a pre-scanned shape database and obtain backgrounds for each depth camera. When performing the registration, we capture scenes with the pre-scanned shapes at different locations. Pre-scanned shape detection and localization are performed in order to find the poses of the detected shapes. Correspondences are found using the …
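Although the snippet is truncated, a typical way to generate such correspondences is nearest-neighbour matching between the observed depth points and the localized pre-scanned model, with a distance gate to reject outliers. A hypothetical sketch (names and threshold are assumptions) using a k-d tree:

```python
import numpy as np
from scipy.spatial import cKDTree

def correspondences(observed, prescan, max_dist=0.05):
    """Nearest-neighbour matches from observed depth points to the
    pre-scanned model, rejecting matches farther than max_dist (metres)."""
    tree = cKDTree(prescan)
    dist, model_idx = tree.query(observed)
    keep = dist < max_dist
    return np.nonzero(keep)[0], model_idx[keep]

# Illustrative check: two close points match, the far outlier is rejected.
prescan = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=float)
observed = np.array([[0.01, 0, 0], [5, 5, 5], [0.99, 0, 0]], dtype=float)
obs_idx, mdl_idx = correspondences(observed, prescan)
```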

Experiments

In this section, we conduct various experiments to evaluate the effectiveness of the proposed calibration framework.

Conclusion

In this paper, we have developed a multi-depth sensor registration method using easy-to-obtain everyday objects. In particular, we use a common chair as an example and pre-scan it. We then apply a state-of-the-art method to detect and localize the chair, generating correspondences between the pre-scan and the observation. We have proposed an optimization framework that preserves the geometry of the pre-scanned shape while minimizing the distance between the overlapping regions seen by two depth cameras.

Acknowledgement

This research is partially supported by Singapore MoE AcRF Tier-1 Grant RG138/14 and the BeingTogether Centre, a collaboration between Nanyang Technological University (NTU) Singapore and the University of North Carolina (UNC) at Chapel Hill. The BeingTogether Centre is supported by the National Research Foundation, Prime Minister's Office, Singapore under its International Research Centres in Singapore Funding Initiative.

References (23)

  • D.H. Ballard, Generalizing the Hough transform to detect arbitrary shapes, Pattern Recogn. (1981)

  • S. Izadi et al., KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera

  • A. Maimone et al., Enhanced personal autostereoscopic telepresence system using commodity depth cameras, Comput. Graph. (2012)

  • C. Kuster et al., FreeCam: a hybrid camera system for interactive free-viewpoint video

  • D. Herrera et al., Joint depth and color camera calibration with distortion correction, IEEE TPAMI (2012)

  • J. Smisek et al., 3D with Kinect

  • R. Avetisyan et al., Calibration of depth camera arrays (2014)

  • B. Jin et al., Accurate intrinsic calibration of depth camera with cuboids

  • G.G. Mateos, A camera calibration technique using targets of circular features (2000)

  • T. Svoboda et al., A convenient multicamera self-calibration for virtual environments, PRESENCE Teleop. Virt. (2005)

  • Z. Zhang, A flexible new technique for camera calibration, IEEE TPAMI (2000)

Editor's Choice Articles are invited and handled by a select rotating 12 member Editorial Board committee. This paper has been recommended for acceptance by Robert Walecki.
