
Pattern Recognition

Volume 34, Issue 5, May 2001, Pages 1105-1117

Calibrating a camera network using a domino grid

https://doi.org/10.1016/S0031-3203(00)00047-9

Abstract

This work considers the problem of calibrating a distributed multi-camera system. As opposed to traditional multi-camera systems, such as stereo heads, in a distributed network the fields-of-view do not all overlap. Our novel method uses a grid of domino calibration targets. We also describe a novel grid-finding algorithm, to expedite the location of image-to-world correspondences. Experiments conducted in a hallway and two connecting rooms, using 12 cameras, demonstrate accuracies of 4.3 mm in world coordinates and 0.28 pixels in image coordinates.

Introduction

This paper describes a method to calibrate a network of cameras to a common world coordinate system. The working environment is assumed to resemble common indoor floorplans, with rooms and corridors connected via doorways. The camera network is assumed to resemble a security video network, where each camera views a moderate amount of floorspace from an elevated position.

Traditional installations for multi-camera systems include stereo heads, object modeling scanners, and factory workcells. In these configurations the fields-of-view of the cameras all overlap. In this case a single target can be used to calibrate all the cameras to a common world coordinate system [1]. Fig. 1 illustrates a common configuration.

In this work we consider a distributed multi-camera system. Fig. 2 illustrates a possible configuration. In this configuration the fields-of-view of the cameras do not overlap, so that multiple targets must be used for calibration. In this case methods are needed to establish the positions of the multiple calibration targets in a common world coordinate system.

Camera calibration also requires the establishment of correspondences between image points and world coordinates. A traditional multi-camera system might include two to five cameras. In this case, manual methods are tolerable. One approach is to carefully measure image coordinates using a graphical user interface. However, as the number of cameras increases, such methods become increasingly tedious and error prone.

In this work, multiple, identical tiles are used for calibration targets. The tiles are posterboard-sized. Each tile exhibits a 2×2 dot pattern. The dots are positioned so that when two tiles are placed side-by-side, or end-to-end, the spacing between consecutive dots remains constant. We refer to these calibration targets as dominoes.

Dominoes are placed on the floor throughout the floorspace covered by the combined field-of-view of the cameras. All the dominoes must adjoin side-by-side or end-to-end. In this manner, the dots deploy a common coordinate system. The dominoes require neither permanent marking of the environment nor any particular floorspace configuration.
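
As an illustration of the bookkeeping this layout implies, the following sketch assigns world coordinates to every dot deployed by a set of adjoining dominoes, with the floor taken as the Z = 0 plane. The dot pitch and tile placements used here are hypothetical placeholders, not the dimensions used in our experiments.

```python
import numpy as np

# Hypothetical parameters; the actual tile dimensions are not assumed here.
DOT_PITCH = 0.30        # metres between consecutive dots, constant across tiles
DOTS_PER_TILE = (2, 2)  # each domino carries a 2 x 2 dot pattern

def dot_world_coordinates(tile_origins):
    """Return the world coordinates (on the Z = 0 floor plane) of every dot.

    tile_origins: list of (row, col) indices of each tile's first dot,
    expressed in the common dot grid that the adjoining tiles deploy.
    """
    points = []
    for (r0, c0) in tile_origins:
        for dr in range(DOTS_PER_TILE[0]):
            for dc in range(DOTS_PER_TILE[1]):
                points.append(((c0 + dc) * DOT_PITCH,
                               (r0 + dr) * DOT_PITCH,
                               0.0))
    return np.array(points)

# Two tiles placed end-to-end share the same dot pitch, so their dots
# fall on one uninterrupted grid:
print(dot_world_coordinates([(0, 0), (0, 2)]))
```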

A novel grid-finding algorithm is applied to each image. The grid-finder automatically locates and segments a rectilinear configuration of dots. The user need only specify the size (in dots) of the visible domino configuration, the world coordinates of one of the visible dots, and the world axes orientations and scales relative to the dot axes. The grid-finder produces as output a list of correspondences between 2D image coordinates and 3D world coordinates, one per dot. To solve for the camera model, we use the coplanar solution introduced by Tsai [2], [3].
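
To make this output concrete: each camera yields one list of (image point, world point) pairs for the dots it sees, and a coplanar solver recovers that camera's model from it. The sketch below is an assumption-laden illustration rather than the solver used in this work; it substitutes OpenCV's planar calibration routine for Tsai's coplanar solution, with flags chosen to mimic Tsai's assumptions of a fixed image centre and a single radial distortion coefficient.

```python
import cv2
import numpy as np

def calibrate_from_grid(world_pts, image_pts, image_size):
    """Single-view calibration from coplanar (Z = 0) grid correspondences.

    world_pts:  (N, 3) dot positions on the floor plane, Z = 0.
    image_pts:  (N, 2) dot centroids reported by the grid finder.
    image_size: (width, height) of the calibration image in pixels.
    """
    # Initial guess: principal point at the image centre, rough focal length.
    f0 = float(max(image_size))
    K0 = np.array([[f0, 0.0, image_size[0] / 2.0],
                   [0.0, f0, image_size[1] / 2.0],
                   [0.0, 0.0, 1.0]])
    # Mimic the coplanar assumptions: image centre fixed, one radial
    # distortion coefficient, no tangential distortion.
    flags = (cv2.CALIB_USE_INTRINSIC_GUESS | cv2.CALIB_FIX_PRINCIPAL_POINT |
             cv2.CALIB_FIX_ASPECT_RATIO | cv2.CALIB_ZERO_TANGENT_DIST |
             cv2.CALIB_FIX_K2 | cv2.CALIB_FIX_K3)
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        [world_pts.astype(np.float32).reshape(-1, 1, 3)],
        [image_pts.astype(np.float32).reshape(-1, 1, 2)],
        image_size, K0, None, flags=flags)
    # rms is the reprojection error in pixels; rvecs[0]/tvecs[0] give the
    # camera's pose relative to the common world coordinate system.
    return rms, K, dist, rvecs[0], tvecs[0]
```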

A single image is captured from each camera for calibration. The images do not need to be captured simultaneously, so a limited set of dominoes may be used in ‘leapfrog’ fashion (see Fig. 3). The dominoes are deployed to fill the first camera's field-of-view, and its calibration image is acquired. Dominoes are then moved from one side of the deployment to the other, and this motion repeats until the second camera's field-of-view is filled, at which point its calibration image is acquired. The procedure continues until a calibration image has been acquired for every camera.

Blank dominoes may be substituted for dotted dominoes to cover floorspace on the boundary of a field-of-view. Blank and dotted dominoes can also be substituted between image captures for different cameras. In this manner, rectilinear dot grids can be presented to each camera, regardless of the amount of field-of-view overlap.


Related work

Camera calibration is the process of determining a camera's internal (focal length, lens distortion, etc.) and external (position and orientation) parameters. These parameters model the camera in a reference system in the space being imaged, often called world coordinates. Once calibrated, a camera's 2D image coordinates map to 3D rays in world coordinates.
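
As a concrete reminder of what the resulting model provides, the minimal sketch below back-projects an (undistorted) pixel to a world-coordinate ray. It assumes the common pinhole convention x_cam = R X_world + t with intrinsic matrix K; lens distortion is ignored for brevity.

```python
import numpy as np

def pixel_to_world_ray(u, v, K, R, t):
    """Map an undistorted pixel (u, v) to a 3D ray in world coordinates.

    Under x_cam = R @ X_world + t, the camera centre in world coordinates
    is C = -R.T @ t, and the ray direction is R.T @ K^-1 @ [u, v, 1].
    """
    origin = -R.T @ t                                  # camera centre (world)
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # direction (camera)
    d_world = R.T @ d_cam                              # direction (world)
    return origin, d_world / np.linalg.norm(d_world)
```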

The standard camera calibration process has two steps. First, a list of 2D image coordinates and their corresponding 3D world coordinates is

Domino calibration target

The purpose of the domino calibration target is to facilitate deployment of calibration points throughout multiple connecting rooms and corridors, while maintaining a common coordinate system. This is accomplished by constructing multiple identical dominoes, so that when positioned end-to-end and side-by-side, the distances between dots remain constant. Fig. 4 presents an overhead view of a floorplan, consisting of three areas (two rooms and a hallway), observed by 12 cameras, along with a

Establishing correspondences

The following describes an algorithm for automatically finding rectilinear grids, imaged as described in Section 1. Grid-like structures are often employed as calibration targets (see for instance [8], [16], [17], [18], [19]). A grid makes a good calibration target because it (a) uses features with maximum contrast (dots on a background), (b) spans the imaged area evenly (so the resulting camera model is accurate for the entire image), and (c) presents a recognizable macro-configuration (the
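
For comparison only, the sketch below shows a naive grid finder: threshold, extract connected components, and sort the centroids into rows and columns. It is not the algorithm described in this paper, and it relies on assumptions (dark dots on a light background, mild perspective so rows separate by image y coordinate) that the full grid finder does not require.

```python
import cv2
import numpy as np

def find_dot_grid(gray, n_rows, n_cols, min_area=20):
    """Naive dot-grid finder (illustrative only; not the paper's algorithm).

    Returns an (n_rows * n_cols, 2) array of dot centroids in row-major
    order, or None if the expected number of dots is not found.
    """
    # Dark dots become foreground after inverse Otsu thresholding.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    dots = np.array([centroids[i] for i in range(1, n)        # skip background
                     if stats[i, cv2.CC_STAT_AREA] >= min_area])
    if len(dots) != n_rows * n_cols:
        return None
    # Sort by y, split into rows of equal size, then sort each row by x.
    dots = dots[np.argsort(dots[:, 1])]
    rows = np.split(dots, n_rows)
    return np.vstack([row[np.argsort(row[:, 0])] for row in rows])
```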

Experiments

To make the calibration process easier, a graphical user interface has been constructed. From the interface it is possible to control the calibration of multiple cameras. The interface offers the option to select the input camera, as in Fig. 10(a), and to adjust the algorithm parameters. The image on screen is a live video signal, which is especially useful while positioning dominoes. When the user chooses to perform a calibration, the program prompts for the size of the grid to look for, as in Fig.

Conclusions and discussion

In this paper, novel methods were described to calibrate multiple cameras that observe multiple connecting rooms and corridors to a common coordinate system. A domino calibration target was described, along with a grid finder algorithm, to expedite the calibration process. Experiments were shown to demonstrate the efficacy and accuracy of the approach.

We imagine accuracies of 4.3 mm in world coordinates and 0.28 pixels in image coordinates should prove acceptable for a variety of applications,

Summary

This work considers the problem of calibrating a distributed multi-camera system. As opposed to traditional multi-camera systems, such as stereo heads, in a distributed network the fields-of-view do not all overlap. Methods are needed to align multiple distributed calibration targets in a common coordinate system. Methods are also needed in which the required effort scales reasonably as the number of cameras and installations increases. Our novel method uses a grid of domino calibration

References (22)

  • R. Horaud et al., An analytic solution for the perspective 4-point problem, Comput. Vision Graphics Image Process. (1989)
  • L. Robert, Camera calibration without feature extraction, Comput. Vision Image Understanding (1996)
  • F. Pedersini et al., Multi-camera systems, IEEE Signal Process. (1999)
  • R.K. Lenz et al., Techniques for calibration of the scale factor and image center for high accuracy 3D machine vision metrology, IEEE Trans. Pattern Anal. Mach. Intell. (1988)
  • R.Y. Tsai, A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses, IEEE Trans. Robotics Automat. (1987)
  • R.Y. Tsai, Synopsis of recent progress on camera calibration for 3D machine vision
  • R. Jain, R. Kasturi, B.G. Schunck, Calibration, in: Machine Vision, McGraw-Hill Inc., New York, 1995, pp. 309–362 (Chapter ...
  • R. Talluri et al., Position estimation techniques for an autonomous mobile robot – a review
  • D. Gibbins, G.N. Newsam, M.J. Brooks, Detecting suspicious background changes in video surveillance of busy scenes, ...
  • A. Hoover, B. Olsen, A real-time occupancy map from multiple video streams, Proceedings of IEEE International ...
  • B.S.Y. Rao et al., A fully decentralized multi-sensor system for tracking and surveillance, Int. J. Robotics Res. (1993)

    About the Author—BENT OLSEN received his M.S. degree (1998) in Electrical Engineering from Aalborg University, Denmark. During that time his research focused on medical image processing and sensor network control of mobile robots. From 1997 to 1998 Mr. Olsen was a visiting scholar at the University of California, San Diego. In 1998 Mr. Olsen held a position as a Research Assistant at Aalborg University, where his research focused on virtual reality. Since 1999 Mr. Olsen has been working at Praja, Inc. on multimedia content management systems.

    About the Author—ADAM HOOVER received a B.S. (1992) and M.S. (1993) in Computer Engineering, and a Ph.D. (1996) in Computer Science and Engineering, all from the University of South Florida. During this time his research focused on range image processing and 3D model construction for object recognition and mobile robot navigation. From 1996 to 1998 Dr. Hoover held a post-doctoral position at the University of California, San Diego, in the Electrical and Computer Engineering Department. During this time his research focused on medical (retinal) image processing, data fusion for multiple video streams, and sensor network control of mobile robots. In January of 1999 Dr. Hoover joined the Electrical and Computer Engineering Department of Clemson University as an Assistant Professor. His research continues on the aforementioned projects, and also blends with computer and robot architecture issues as they relate to machine vision.
