Multiple consumer-grade depth camera registration using everyday objects☆
Introduction
Consumer-grade depth cameras have become widely used as a low-cost means for real-time 3D sensing. However, existing depth cameras have two major limitations: they have a restricted field of view, and they recover only a depth field rather than true 3D data. One approach to overcoming these problems is to move the camera around [1] under the assumption of a static scene; but if real-time wide-coverage 3D data is needed, as in 3D telepresence [2] and real-time novel view synthesis [3], then multiple depth cameras are required. The challenge then becomes one of registering and calibrating multiple depth cameras, given the known problems that such cameras exhibit systematic distortions in their depth measurements and are in general very noisy.
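For context, the "depth field" a single camera reports is turned into camera-frame 3D points by standard pinhole back-projection. A minimal sketch of that step (the function name and intrinsic values below are illustrative, not from the paper):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (metres) into camera-frame 3D points
    using the pinhole model; zero-depth pixels are treated as invalid."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid pixels

# Tiny example: a 2x2 depth map, everything at 1 m, made-up intrinsics.
pts = depth_to_points(np.ones((2, 2)), fx=500.0, fy=500.0, cx=0.5, cy=0.5)
```

Each camera only yields such a 2.5D point set from its own viewpoint, which is why wide coverage requires multiple registered cameras.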
Extrinsic calibration may be conducted on a multi-camera rig, in order to determine the relative poses of the depth cameras with respect to a common reference frame. One approach is to fall back on classical image-based (i.e. optical) calibration techniques utilizing a checkerboard and color or raw IR imagery. Doing so would effectively calibrate the cameras through passive stereo triangulation, while completely ignoring the depth sensing mechanisms. However, 3D data subsequently obtained from the cameras will register poorly as the systematic depth distortions present are not accounted for. An alternative approach is to compute the rigid transform that will geometrically align 3D data separately sensed in the multiple cameras. Although doing so does not correct for the depth distortions, this optimization procedure will typically distribute 3D registration errors more evenly.
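The rigid transform aligning two cameras' 3D data has a closed-form least-squares solution once point correspondences are available (the Kabsch/Procrustes solution). A minimal sketch of that alignment step, not the paper's implementation:

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) mapping paired 3D points
    src -> dst, via SVD of the cross-covariance (Kabsch algorithm)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the SVD solution.
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Recover a known rotation/translation from noiseless correspondences.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 0.3])
src = np.random.default_rng(0).normal(size=(50, 3))
dst = src @ R_true.T + t_true
R, t = rigid_align(src, dst)
```

With real depth data the correspondences are noisy and distorted, which is why such a purely rigid fit distributes, but does not remove, registration error.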
While depth distortion correction methods have been proposed for a single depth camera [2], [4], [5], [6], [7], combining such methods with extrinsic calibration in a sequential or piecemeal manner remains unsatisfactory and will still lead to significant misalignment, as the depth correction for individual devices tends to be sensitive to errors. Instead, we propose a unified framework in which extrinsic calibration and depth distortion correction are determined together.
For short experiments, an initial one-off calibration may be satisfactory. However, if accurate registration has to be retained for an extended period of time, then in practice recalibration must be conducted periodically, due to mechanical displacement of the cameras (whether intentional, accidental, or caused by environmental temperature variations) or simply sensor drift. This poses a major practical problem if the calibration process is tedious, such as requiring tens or even hundreds of checkerboard observations for each recalibration. We thus seek a more practical way of maintaining registration, for instance by replacing checkerboards with everyday objects already found in the scene, such as regular furniture.
In this work, we propose a new approach for registering depth maps obtained from multiple fixed depth cameras using a non-rigid alignment strategy. High-quality pre-scanned 3D shapes of standard furniture are used as calibration targets rather than checkerboards; these scans may be obtained offline by moving a depth camera around the object and fusing the frames with a dense reconstruction algorithm such as KinectFusion [1]. The key idea of the proposed approach is to utilize the high-quality geometry of the pre-scanned shape as both a reference and a conduit for registering the dynamic 3D data obtained from the multiple static depth cameras.
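The paper's joint optimization is more involved, but the basic geometric role of the shared shape can be sketched: if each static camera estimates the pose of the same pre-scanned shape in its own frame, the relative extrinsic between the two cameras follows by composing those poses. A minimal illustration (the function name and transforms are hypothetical, for exposition only):

```python
import numpy as np

def relative_extrinsic(T_shape_cam1, T_shape_cam2):
    """Given the pose of the same pre-scanned shape expressed in each
    camera's frame (4x4 homogeneous transforms shape->camera), return
    the transform taking points from camera 1's frame to camera 2's."""
    return T_shape_cam2 @ np.linalg.inv(T_shape_cam1)

# Shape sits 2 m in front of camera 1, and is seen rotated 90 degrees
# about y and offset by camera 2 (made-up poses).
T1 = np.eye(4)
T1[2, 3] = 2.0
T2 = np.array([[0.0, 0.0, 1.0, -1.0],
               [0.0, 1.0, 0.0,  0.0],
               [-1.0, 0.0, 0.0, 3.0],
               [0.0, 0.0, 0.0,  1.0]])

p_shape = np.array([0.5, 0.2, 1.0, 1.0])  # a point on the shape
p_cam1 = T1 @ p_shape
p_cam2 = T2 @ p_shape
T_12 = relative_extrinsic(T1, T2)
```

In the proposed framework this composition is not used in isolation; the shape geometry also anchors the depth-correction terms in the joint optimization.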
The contributions of this paper are as follows:
- To the best of our knowledge, this is the first work to primarily utilize pre-scanned shapes for joint calibration of multiple static depth cameras;
- A unified framework is presented that jointly computes the optimal extrinsic calibration and depth correction parameters; and
- A dataset is constructed for comparing different methods for multi-depth camera registration.
Related work
Calibration of conventional cameras is a well-established task in computer vision [8], [9]. Zhang's technique [10] is probably the most widely used, and has been implemented in Bouguet's MATLAB toolbox [11] and the OpenCV library [12]. Zhang's approach [10] uses a planar checkerboard captured in multiple poses to establish the correspondences needed to estimate the intrinsic and extrinsic parameters.
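As background on Zhang's method: each checkerboard view induces a plane-to-image homography, and constraints stacked from several such homographies determine the intrinsics (in practice one would simply call OpenCV's `calibrateCamera`). The per-view homography estimation can be sketched with a direct linear transform; this is an illustrative sketch, not the paper's or OpenCV's code:

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: homography H with dst ~ H @ src for
    >= 4 planar point correspondences (the per-view building block
    of Zhang's calibration)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        A.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # The homography is the null vector of A (smallest singular vector).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Recover a known homography from exact correspondences.
H_true = np.array([[1.2, 0.1, 3.0],
                   [0.0, 0.9, -2.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]])
hom = np.c_[src, np.ones(len(src))] @ H_true.T
dst = hom[:, :2] / hom[:, 2:3]
H = homography_dlt(src, dst)
```

Note that such image-based calibration says nothing about the depth sensor's systematic distortions, which motivates the joint treatment proposed in this paper.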
Proposed method
In this section, we describe the details of the proposed method. Fig. 1 shows the steps in our proposed framework. In particular, during the pre-processing stage, we generate a pre-scanned shape database and obtain backgrounds for each depth camera. When performing the registration, we capture scenes with the pre-scanned shapes at different locations. Pre-scanned shape detection and localization are performed in order to find poses of the detected shapes. Correspondences are found using the
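Although the snippet is truncated, the pre-processing step of capturing a background for each camera suggests that the placed calibration object is segmented by differencing each live depth map against the stored empty-scene depth. A plausible sketch of that segmentation (function name and tolerance are assumptions, not from the paper):

```python
import numpy as np

def foreground_mask(depth, background, tol=0.02):
    """Flag pixels whose depth differs from the pre-captured empty-scene
    background by more than tol metres; pixels invalid (zero) in either
    map are excluded."""
    valid = (depth > 0) & (background > 0)
    return valid & (np.abs(depth - background) > tol)

background = np.array([[2.0, 2.0],
                       [0.0, 2.0]])   # 0 marks an invalid pixel
depth      = np.array([[2.0, 1.5],
                       [1.0, 2.01]])  # an object now sits at 1.5 m
mask = foreground_mask(depth, background)
```

Only the masked pixels would then be matched against the pre-scanned shape for detection and localization.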
Experiments
In this section, we conduct various experiments to evaluate the effectiveness of the proposed calibration framework.
Conclusion
In this paper, we have developed a multi-depth-camera registration method using easy-to-obtain everyday objects. In particular, we use a common chair as an example and pre-scan it. We then apply a state-of-the-art method to detect and localize the chair, generating correspondences between the pre-scan and the observation. We have proposed an optimization framework that preserves the geometry of the pre-scanned shape while minimizing the distance between the overlapping regions seen by two depth cameras.
Acknowledgement
This research is partially supported by Singapore MoE AcRF Tier-1 Grant RG138/14 and the BeingTogether Centre, a collaboration between Nanyang Technological University (NTU) Singapore and the University of North Carolina (UNC) at Chapel Hill. The BeingTogether Centre is supported by the National Research Foundation, Prime Minister's Office, Singapore under its International Research Centres in Singapore Funding Initiative.
References (23)
- Generalizing the Hough transform to detect arbitrary shapes, Pattern Recogn. (1981)
- et al., KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera
- et al., Enhanced personal autostereoscopic telepresence system using commodity depth cameras, Comput. Graph. (2012)
- et al., FreeCam: a hybrid camera system for interactive free-viewpoint video
- et al., Joint depth and color camera calibration with distortion correction, IEEE TPAMI (2012)
- et al., 3D with Kinect
- et al., Calibration of depth camera arrays (2014)
- et al., Accurate intrinsic calibration of depth camera with cuboids
- A camera calibration technique using targets of circular features (2000)
- et al., A convenient multicamera self-calibration for virtual environments, PRESENCE Teleop. Virt. (2005)
- A flexible new technique for camera calibration, IEEE TPAMI
☆ Editor's Choice Articles are invited and handled by a select rotating 12-member Editorial Board committee. This paper has been recommended for acceptance by Robert Walecki.