Elsevier

Pattern Recognition

Volume 36, Issue 9, September 2003, Pages 2143-2159

3-D shape reconstruction in an active stereo vision system using genetic algorithms

https://doi.org/10.1016/S0031-3203(03)00049-9

Abstract

The recovery of 3-D shape information (depth) using stereo vision analysis is one of the major areas in computer vision and has given rise to a great deal of literature in the recent past. The most widely known stereo vision methods are the passive approaches that use two cameras. Obtaining 3-D information involves identifying the corresponding 2-D points between the left and right images. Most existing methods tackle this matching task from singular points, i.e. by finding points in both image planes with more or less the same neighborhood characteristics. One key problem is that we cannot know a priori whether a point in the first image has a correspondence at all, either because of surface occlusion or simply because it has been projected outside the field of view of the second camera. This makes the matching process very difficult and requires an a posteriori stage to remove false matches.

In this paper, we are concerned with active stereo vision systems, which offer an alternative to passive stereo vision systems. In our system, one of the two cameras is replaced by a light projector that illuminates the objects to be analyzed with a pyramid-shaped laser beam. The projections of the laser rays on the objects are detected as spots in the image. In this particular case, only one image needs to be processed, and the stereo matching problem boils down to associating the laser rays with their corresponding real spots in the 2-D image. We express this problem as the minimization of a global function, which we propose to perform using Genetic Algorithms (GAs). We have implemented two different algorithms: in the first, GAs are applied after a deterministic search; in the second, the data is partitioned into clusters and GAs are applied independently in each cluster. As a second contribution, we describe an efficient system calibration method. Experimental results are presented to illustrate the feasibility of our approach. The proposed method yields high-accuracy 3-D reconstruction even for complex objects. We conclude that GAs can effectively be applied to this matching problem.

Introduction

One of the major areas in computer vision is the recovery of 3-D shape information (depth) using stereo vision analysis. The most widely known stereo vision methods are the passive stereo vision approaches. They attempt to imitate the depth extraction ability of the human visual system with the use of two cameras and a computer. In this case, obtaining 3-D information involves the identification of the corresponding 2-D points between the left and right images that are projections of the same physical point in the 3-D scene. This is called the stereo matching problem. If both the geometrical relationship between the two cameras and their intrinsic parameters are known (from the system calibration process), the 3-D coordinates of a point can be deduced from the 2-D coordinates using epipolar geometry [1], [2]. Although this method is accurate and often used, it requires computationally expensive image processing tasks, such as extracting the points to be reconstructed from one of the two images and searching along the epipolar line for their corresponding points in the second image. The stereo matching problem remains one of the most difficult problems in computer vision and has stimulated a great deal of literature [3], [4], [5], [6], [7]. Most stereo matching techniques are either area- or feature-based. No matter which matching technique is used, the correspondence process assumes that each point in one image has a unique matching point in the other image. This underlying assumption appears to be valid for relatively textured areas and for image pairs with little difference between them. However, it may be wrong at occlusion boundaries and within featureless regions. Moreover, particularly in area-based methods, the reliability of the correlation used to determine corresponding points depends on the size of the window in which the search is performed. Most conventional correlation-based stereo matching methods use a window of a fixed size which is selected experimentally for each application. Kanade has proposed a stereo matching method using an adaptive window [8]. Saito et al. have introduced a method for determining the optimal window size [9].
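To make the fixed-window, area-based matching discussed above concrete, the following is a minimal sketch in Python. It is not taken from any of the cited methods: it assumes rectified images (so that the epipolar line is simply the same image row), uses a sum of squared differences (SSD) as the window cost, and the names `ssd` and `match_along_row` are hypothetical.

```python
def ssd(left, right, row, col_l, col_r, half):
    """Sum of squared differences between two (2*half+1) x (2*half+1) windows.

    `left` and `right` are grey-level images stored as lists of rows.
    The caller must keep both windows inside the images.
    """
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = left[row + dr][col_l + dc] - right[row + dr][col_r + dc]
            total += diff * diff
    return total


def match_along_row(left, right, row, col_l, half=1, max_disp=4):
    """Search the same row of the right image for the best match.

    Assumption: the images are rectified, so the epipolar line of the
    left pixel (row, col_l) is the row `row` of the right image, and the
    candidate matches are the pixels (row, col_l - d), d = 0..max_disp.
    Returns (best disparity, its SSD cost).
    """
    best_d, best_cost = None, None
    for d in range(max_disp + 1):
        col_r = col_l - d
        if col_r - half < 0:
            break  # the window would leave the right image
        cost = ssd(left, right, row, col_l, col_r, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost
```

The window half-size `half` is the fixed parameter that, as noted above, is usually chosen experimentally: a small window is sensitive to noise, while a large one blurs depth discontinuities.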

Active stereo vision systems, also called structured-light systems, offer an alternative approach to the use of two cameras. An artificial source of energy, such as an ultrasonic or laser device, which projects a known pattern on the studied scene, replaces the second stereo camera. Analyzing the deformation of the pattern in the image acquired by the camera with respect to the projected one provides 3-D information [10], [11], [12]. Some systems utilize a coded structured-light pattern, which allows unique codification of each token of the projected light. Thus, the correspondence task that determines where each token comes from is directly solved. However, this kind of system has a drawback: it imposes constraints on the reflectance of the objects and on the illumination of the measuring scene. A survey of the most commonly used coded structured-light techniques is given in Ref. [13].

In this study, we are concerned with an active stereo vision system which does not use any coded structured-light technique. Our system is composed of a CCD camera connected to a PC, which allows the acquisition of images, and a laser diode coupled with a diffraction grid that generates a pyramid-shaped laser beam composed of 361 (19×19) rays. The rays are oriented so that the angle between two consecutive rays, both horizontally and vertically, is fixed and equal to 0.77°. An image thus obtained contains many spots created by the laser rays. Unlike passive stereo vision systems, which can generate a dense depth map, our system can only produce a sparse depth map (i.e. depth is available only at the pixels corresponding to the centers of the spots in the image). Calculating the 3-D coordinates of a point requires identifying the laser ray from which its corresponding 2-D spot in the image originates. In this case, the stereo matching problem boils down to matching the laser rays with their corresponding spots in the 2-D image acquired by the camera. We propose to perform this matching task using Genetic Algorithms (GAs).
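As an illustration of the beam geometry described above, the sketch below enumerates the 19×19 ray directions in Python. Only the ray count and the 0.77° step come from the text. The projector frame (central ray along the z axis), the indexing from -9 to 9, and the tangent-based model of the horizontal and vertical angles are assumptions made for the example, and the function names are hypothetical. Under this model the 0.77° spacing is exact along the two central axes and only approximate elsewhere.

```python
import math

STEP_DEG = 0.77  # angle between two consecutive rays (from the text)
N = 19           # rays per side, 19 x 19 = 361 rays (from the text)


def ray_direction(i, j, step_deg=STEP_DEG):
    """Unit direction of ray (i, j) in an assumed projector frame.

    Assumption: ray (0, 0) is the central ray along +z, and ray (i, j)
    is tilted by i*step horizontally and j*step vertically.
    """
    x = math.tan(math.radians(i * step_deg))
    y = math.tan(math.radians(j * step_deg))
    z = 1.0
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)


def all_rays():
    """Directions of the whole beam, keyed by (i, j) with i, j in -9..9."""
    half = N // 2
    return {
        (i, j): ray_direction(i, j)
        for i in range(-half, half + 1)
        for j in range(-half, half + 1)
    }


def angle_between_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding
    return math.degrees(math.acos(dot))
```

With this indexing, the outermost ray along each axis is 9 × 0.77° = 6.93° away from the central ray, which gives an idea of how narrow the pyramid-shaped beam is.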

GAs are a type of stochastic search method whose functioning is inspired by natural selection and the principles of evolution [14]. They are not gradient based and do not require initial guesses. Furthermore, genetic searches begin from a set of points in the search space rather than a single point. Even more significant is the fact that the search mechanisms possess an implicit parallelism that enables a rapid sampling of the search space and thus an improved recognition of the whereabouts of the global optima. All these features tend to render GAs robust and global without the pitfall of entrapment at local optima. GAs have received a great deal of attention recently and are widely used in diverse areas of image processing and pattern recognition [15], [16], [17], [18], [19].

This paper proposes a global 3-D shape reconstruction approach. In Section 2, we describe the system calibration process which provides the parameters necessary to model the relationship between the laser projector and the camera. Section 3 is devoted to the matching task that associates each image spot with the laser ray from which it originates. Two different algorithms are proposed. In the first one, referred to as the Hybrid GA Matching Method (HGAMM), some spots and rays are matched using a deterministic method. Then, the remaining spots are coupled with their corresponding rays using a GA. In the second algorithm, referred to as the Partitioning GA Matching Method (PGAMM), the ray set is first partitioned into regions, then a GA matching process is performed independently in each region. In Section 4, we describe how to compute the 3-D coordinates of physical points represented by spots on the image. Experimental results are discussed in Section 5, followed by our conclusions in Section 6.

System calibration

Obtaining depth information using a stereo vision system requires the calibration of the system. Classical calibration methods aim to determine the matrices which characterize the intrinsic and extrinsic parameters of the system [1]. In this study we propose a simpler calibration method which directly provides the parameters needed (1) to carry out the stereo matching and (2) to reconstruct the 3-D shape. Note that the method requires a plane spanning the workspace (area within which the

Matching laser rays and image spots

The main information to extract from an image acquired by our active stereo vision system is the set of spots created by the rays on the object to be analyzed. Reconstructing the 3-D shape of the object from this image requires matching each spot with its corresponding laser ray. This is a hard combinatorial search problem which is difficult to solve with a conventional optimization method. To perform this matching task, we propose two different methods, both based on GAs.
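To give an idea of how a GA can be applied to such an assignment problem, here is a minimal, generic sketch in Python. It is not the HGAMM or PGAMM algorithm of this paper. The cost matrix `cost[s][r]` (how poorly spot s fits ray r), the permutation encoding, tournament selection, order crossover, swap mutation, and all parameter values are assumptions chosen for illustration, and every function name is hypothetical.

```python
import random


def total_cost(perm, cost):
    """Global cost of a matching: perm[s] is the ray assigned to spot s."""
    return sum(cost[s][r] for s, r in enumerate(perm))


def order_crossover(a, b, rng):
    """Copy a random slice of parent a, fill the rest in parent b's order."""
    n = len(a)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = a[i:j + 1]
    kept = set(child[i:j + 1])
    fill = [g for g in b if g not in kept]
    k = 0
    for idx in range(n):
        if child[idx] is None:
            child[idx] = fill[k]
            k += 1
    return child


def swap_mutation(perm, rng, rate):
    """With probability `rate`, exchange the rays of two spots."""
    perm = perm[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def tournament(pop, cost, rng):
    """Pick two individuals at random and keep the cheaper one."""
    a, b = rng.sample(pop, 2)
    return a if total_cost(a, cost) <= total_cost(b, cost) else b


def ga_match(cost, pop_size=40, generations=60, mutation_rate=0.3, seed=0):
    """Minimise total_cost over permutations (needs at least 2 spots).

    Returns the best permutation found and the best cost per generation.
    """
    rng = random.Random(seed)
    n = len(cost)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(pop, key=lambda p: total_cost(p, cost))
        history.append(total_cost(best, cost))
        nxt = [best[:]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            child = order_crossover(
                tournament(pop, cost, rng), tournament(pop, cost, rng), rng
            )
            nxt.append(swap_mutation(child, rng, mutation_rate))
        pop = nxt
    best = min(pop, key=lambda p: total_cost(p, cost))
    history.append(total_cost(best, cost))
    return best, history
```

Because the best individual is copied unchanged into each new generation, the recorded best cost never increases. The search remains stochastic, however, so reaching the global minimum is not guaranteed for a given seed or parameter setting.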

GAs are adaptive

Shape reconstruction

In order to reconstruct the 3-D surface of an object, one only needs to take one image of the object illuminated by the laser beam. If the calibration parameters are computed and the matching between the image spots and the laser rays is done using either one of the two matching methods previously described or any other matching algorithm, the 3-D coordinates of the object points corresponding to the spots on the image can be calculated as follows:

Let pl be the center of a given spot on the
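The reconstruction formulas of this section are not reproduced here. As a stand-in, the sketch below shows a standard way to triangulate a 3-D point once a spot has been matched to its laser ray: it takes the midpoint of the shortest segment between the camera viewing line through the spot and the matched laser ray. This is a generic technique, not necessarily the computation used in this paper, and the frame, the argument names, and the function name are assumptions.

```python
import math


def triangulate_midpoint(p1, d1, p2, d2, eps=1e-12):
    """Closest-approach triangulation of two 3-D lines.

    Line 1: p1 + t*d1 (e.g. camera centre and viewing direction of a spot).
    Line 2: p2 + s*d2 (e.g. laser source and direction of the matched ray).
    Returns (midpoint of the shortest segment, length of that segment).
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("lines are (nearly) parallel")
    t = (b * e - c * d) / denom  # parameter of the closest point on line 1
    s = (a * e - b * d) / denom  # parameter of the closest point on line 2
    q1 = tuple(p + t * k for p, k in zip(p1, d1))
    q2 = tuple(p + s * k for p, k in zip(p2, d2))
    midpoint = tuple((u + v) / 2 for u, v in zip(q1, q2))
    gap = math.sqrt(sum((u - v) ** 2 for u, v in zip(q1, q2)))
    return midpoint, gap
```

With noisy spot centers the two lines rarely intersect exactly. The returned gap then gives a rough indication of how consistent a spot-ray pairing is, since a wrong pairing tends to produce a large gap.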

Experimental results

The following experiments were conducted to assess the feasibility and the validity of our approach. The first experiments were concerned with the 3-D reconstruction of two objects: a fairly simple object, a biplane, which allows an easy qualitative verification of the results, and a computer mouse, whose shape is more complex since it has some concavities and variable relief. During image acquisition, a planar surface was placed behind the objects in order to recover all the laser spots. However,

Conclusion

In this paper, we have presented a new 3-D shape reconstruction method. It is based on an active stereo vision system which illuminates objects to be analyzed by a structured laser system. The projections of laser rays on the objects are detected as spots in the image, and depth information of each spot is computed. Calculating 3-D coordinates of a spot requires:

  • (1) calibrating the system: an easy and efficient calibration method is proposed;

  • (2) finding the correct correspondence between the spot

Acknowledgements

The authors thank the anonymous reviewers for their careful reading of the manuscript and their sound comments which have greatly helped in improving the clarity and the presentation of this work.

About the Author: ALBERT DIPANDA received his Ph.D. degree in Computer Vision in 1990 from the University of Burgundy, France, where he is currently an associate professor and a member of the Image Processing group of the Laboratory LE2I (Laboratoire d'Electronique, Informatique et Image). His research interests include motion estimation, MRF modeling in Image Processing and 3-D reconstruction.

References (23)

  • R. Deriche, O. Faugeras, 2D-curves matching using high curvatures points: applications to stereovision, Proceedings of...

About the Author: SANGHYUK WOO received the BBC degree in Mechanical Engineering from the A-Jou University of South Korea in 1994. He is currently a student at the University of Burgundy, France, working toward the Ph.D. degree in Computer Vision and Image Processing. His research interests include 3-D reconstruction and 3-D movement analysis using Genetic Algorithms.

About the Author: FRANK MARZANI received his Ph.D. degree from the University of Burgundy, Dijon, France in 1998. He is currently an associate professor at the University of Burgundy and a member of the laboratory LE2I. His research interests include motion estimation, human motion analysis and image processing applied to 3-D reconstruction.

About the Author: JEAN-MARIE BILBAULT was born on March 6, 1955 in Compiegne, France. He graduated as an engineer from Ecole Centrale de Paris in 1978, and received the Ph.D. degree in Dijon in 1980. He is currently a full professor at the University of Burgundy in Dijon, France, where his present research includes nonlinear electronics and Genetic Algorithms related to image processing.
