Elsevier

Pattern Recognition

Volume 43, Issue 10, October 2010, Pages 3338-3347

A novel iterative shape from focus algorithm based on combinatorial optimization

https://doi.org/10.1016/j.patcog.2010.05.029

Abstract

Shape from focus (SFF) is a technique to estimate the depth and 3D shape of an object from a sequence of images obtained at different focus settings. In this paper, SFF is presented as a combinatorial optimization problem. The proposed algorithm tries to find the combination of pixel frames that produces the maximum focus measure computed over the pixels lying on those frames. To reduce the high computational complexity, a local search method is proposed. After an initial depth map solution of the object is estimated, a neighborhood is defined, and an intermediate image volume is generated from the neighborhood. The updated depth map solution is found from the intermediate image volume. This update process continues until the amount of improvement is negligible. The results of the proposed SFF algorithm show significant improvements over existing SFF methods in both the accuracy of the depth map estimation and the computational complexity.

Introduction

In computer vision, the techniques to recover the three-dimensional (3D) geometry of a scene or an object from a collection of images are known as shape-from-X, where X denotes the cue used to infer the shape, such as stereo, motion, shading, or texture. Shape-from-focus/defocus (SFF/SFD) deals with the recovery of shape from multiple images of the same scene that are captured with different geometries of the imaging device. SFD techniques [1], [2], [3], [4] estimate the depth of a point by measuring the amount of blur (error in focus) from one or two images. SFF, in contrast, requires searching for the focus setting that gives the best focus at each point [5], [6], [7]. Therefore, in SFF, every point of an object needs to be well focused in some image frame in the collection of images. SFF achieves better shape quality than SFD at the cost of higher computational complexity. In this paper, we focus the discussion on SFF.

In SFF, a sequence of images of an unknown object is acquired by changing the level of the object focus. The change in the level of focus is made either by varying the focus value of the camera or by varying the distance of the object from the lens. The acquired image sequence is a three-dimensional image volume in which the rows and columns of each image frame are the first and second dimensions and the image frames along the optical (or temporal) axis form the third dimension. In the image space, each object point is gradually focused until it attains maximum focus and is then gradually blurred along the temporal axis. A focus measure operator is then applied to compute the focus value over a small region around every pixel in the image volume. The focus value increases as the image sharpness or contrast increases, and it attains its maximum value at the best focused point. For each point (ith row, jth column), which corresponds to an object point, the image frame that exhibits the maximum sharpness value along the third dimension is determined. This image frame carries the information about the distance v between the lens and the image detector when it was taken. Then, by the thin lens formula, the distance D of an object point from the lens is given as

$$D = \frac{vf}{v - f}$$

where f is the focal length of the lens. The collection of depth information at each location (ith row, jth column) constitutes the depth map of an object. In addition, the collection of the gray level (or color) values of the best focused image frame at each point (ith row, jth column) constitutes the all-in-focus image. Hence, in SFF, the measure of focus or sharpness is the most crucial factor for the quality of the final depth estimation. Traditionally, the focus measure is applied on each 2D image frame of the image volume. However, for an object with complex geometry, the images acquired from a lens with limited depth of field have different focus levels. Hence, the focus value of a focus measure operator on a 2D image frame does not represent the accurate focus level at the pixel.
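The traditional pipeline described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the focus measure used here (a modified Laplacian summed over a small window), the border clamping, the window size `half`, and all function names are assumptions chosen for the example.

```python
from typing import List

Image = List[List[float]]   # one frame, indexed [row][col]
Volume = List[Image]        # image volume, indexed [frame][row][col]


def modified_laplacian(img: Image, i: int, j: int) -> float:
    """Absolute second differences along rows and columns (borders clamped)."""
    rows, cols = len(img), len(img[0])
    c = img[i][j]
    up, down = img[max(i - 1, 0)][j], img[min(i + 1, rows - 1)][j]
    left, right = img[i][max(j - 1, 0)], img[i][min(j + 1, cols - 1)]
    return abs(2 * c - up - down) + abs(2 * c - left - right)


def focus_value(img: Image, i: int, j: int, half: int = 1) -> float:
    """Assumed focus measure: modified Laplacian summed over a small window."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for r in range(max(i - half, 0), min(i + half, rows - 1) + 1):
        for c in range(max(j - half, 0), min(j + half, cols - 1) + 1):
            total += modified_laplacian(img, r, c)
    return total


def depth_map(volume: Volume, half: int = 1) -> List[List[int]]:
    """At each (i, j), pick the frame number with the maximum focus value."""
    rows, cols = len(volume[0]), len(volume[0][0])
    return [[max(range(len(volume)),
                 key=lambda k: focus_value(volume[k], i, j, half))
             for j in range(cols)] for i in range(rows)]


def all_in_focus(volume: Volume, depth: List[List[int]]) -> Image:
    """Collect the pixel value of the best focused frame at each (i, j)."""
    return [[volume[depth[i][j]][i][j] for j in range(len(depth[0]))]
            for i in range(len(depth))]


def object_distance(v: float, f: float) -> float:
    """Thin lens formula 1/f = 1/D + 1/v, solved for D = v*f / (v - f)."""
    return v * f / (v - f)
```

In this sketch the depth map holds frame numbers; converting a frame number to a metric depth would additionally require the detector distance v at which that frame was taken, which is then passed to `object_distance` together with the focal length f.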

In this paper, SFF is modeled as a combinatorial optimization problem. The optimal depth map solution of an object is considered to be the combination of pixel frame numbers that gives the maximum focus measure. Trying all combinations of the pixel frames requires high computational time. To reduce the computational complexity, a local search algorithm is proposed. First, the initial depth map solution is obtained by applying a focus measure and then finding the frame number that maximizes the focus value along the optical axis. At each point (ith row, jth column), the neighborhood is defined from the initial solution by taking several preceding and following frames with respect to the initial depth. The intermediate image volume is obtained by collecting the pixel values of the neighborhood at each point. The updated solution is found from the intermediate image volume. This update process continues until the convergence criterion is met. The process of building the intermediate image volume has the effect of aligning the curved object patch, corresponding to the focused image surface, perpendicular to the optical axis. Therefore, applying the focus measure on the intermediate image volume gives a more accurate focus level at each pixel.
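The local search loop outlined above might look roughly like the following Python sketch. It is written under several assumptions that the text does not specify: the focus measure is a single-pixel modified Laplacian, frame indices are clamped at the ends of the volume, ties are broken toward the current depth estimate, and the convergence criterion is simply "the depth map no longer changes" (with a hypothetical `max_iters` cap). All names are illustrative.

```python
from typing import List

Image = List[List[float]]      # one frame, indexed [row][col]
Volume = List[Image]           # image volume, indexed [frame][row][col]
DepthMap = List[List[int]]     # frame number per pixel


def focus_value(img: Image, i: int, j: int) -> float:
    """Assumed focus measure: modified Laplacian at (i, j), borders clamped."""
    rows, cols = len(img), len(img[0])
    c = img[i][j]
    up, down = img[max(i - 1, 0)][j], img[min(i + 1, rows - 1)][j]
    left, right = img[i][max(j - 1, 0)], img[i][min(j + 1, cols - 1)]
    return abs(2 * c - up - down) + abs(2 * c - left - right)


def initial_depth(volume: Volume) -> DepthMap:
    """Initial solution: frame number maximizing focus along the optical axis."""
    rows, cols = len(volume[0]), len(volume[0][0])
    return [[max(range(len(volume)), key=lambda k: focus_value(volume[k], i, j))
             for j in range(cols)] for i in range(rows)]


def intermediate_volume(volume: Volume, depth: DepthMap, radius: int) -> Volume:
    """Frame n holds, at each (i, j), the pixel of frame depth[i][j] + n - radius."""
    last = len(volume) - 1
    rows, cols = len(depth), len(depth[0])
    return [[[volume[min(max(depth[i][j] + n - radius, 0), last)][i][j]
              for j in range(cols)] for i in range(rows)]
            for n in range(2 * radius + 1)]


def best_offset(inter: Volume, i: int, j: int, radius: int) -> int:
    """Best frame of the intermediate volume; ties prefer the current estimate."""
    return max(range(len(inter)),
               key=lambda n: (focus_value(inter[n], i, j), -abs(n - radius)))


def local_search(volume: Volume, radius: int = 1, max_iters: int = 10) -> DepthMap:
    """Update the depth map until it stops changing or max_iters is reached."""
    last = len(volume) - 1
    depth = initial_depth(volume)
    for _ in range(max_iters):
        inter = intermediate_volume(volume, depth, radius)
        updated = [[min(max(depth[i][j] + best_offset(inter, i, j, radius) - radius,
                            0), last)
                    for j in range(len(depth[0]))] for i in range(len(depth))]
        if updated == depth:   # assumed convergence criterion
            break
        depth = updated
    return depth
```

Because each frame of the intermediate volume gathers pixels from different original frames according to the current depth estimate, the focus measure is evaluated on a surface that follows the estimated shape rather than on a flat image plane, which is the alignment effect described in the paragraph above.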

This article is organized as follows. In Section 2, we give a brief overview of the SFF techniques. In Section 3, the proposed SFF algorithm is described. A comparison with previous SFF methods is presented in Section 4. Finally, Section 5 concludes our work.

Section snippets

Previous works

A focus measure is defined as a quantity to locally evaluate the sharpness of a pixel. It takes a small local neighborhood and computes the sharpness of a chosen center pixel. Since each object point has different surface characteristics and geometry, the focus measure values of the same object point from different optical settings are compared. A variety of focus measures have been proposed in the spatial domain and the transformed domains [5], [8], [9], [10]. Among them, Sum modified Laplacian

Motivation

SFF attempts to search for the surface in an image volume that gives the best focus measure. Fig. 1 illustrates a small 3D image volume of an object and its corresponding depth map. The combination of pixel frame numbers of the depth map in Fig. 1 produces the maximum focus measure computed over the pixels on these image frames. Therefore, the SFF problem can be thought of as the one of choosing the combination of the pixel frames in an image volume so that the focus measure is maximized, which is

Experimental setup

In this section, the proposed algorithm is analyzed and compared with previous SFF methods. The proposed SFF method based on local search (SFF-LS) searches for the optimal depth map using the initial estimates of shape (FIS). Therefore, the proposed technique was compared with other well-known methods, i.e., SFF based on FIS (SFF-FIS) [7] and SFF based on dynamic programming (SFF-DP) [14], which use the FIS concept in their algorithms. In addition, the proposed SFF algorithm was compared with

Conclusions

In this paper, the SFF problem is presented as a combinatorial optimization problem. To reduce the computational complexity, a local search algorithm is proposed. Based on the initial estimate of the object depth map, a temporary image volume is created by taking into account several image frames preceding and following the best focused image frames corresponding to the initial depth map. The depth map is updated by applying the focus measure on the temporary image volume. Pixels on each image frames

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MEST) (No. 2009-0083733).

About the Author—SEONG-O Shim received the B.S. degree in electrical engineering from Ajou University, Suwon, Korea, in 1999, and the M.S. degree in Mechatronics from Gwangju Institute of Science & Technology, Gwangju, Korea, in 2001.

From 2003 to 2007, he was with LG Electronics DTV Labs, Seoul, Korea, working on research and development of digital TV. He is currently doing research at the Signal and Image Processing Lab in the School of Information & Mechatronics at Gwangju Institute of Science and Technology, Korea. His research interests include image processing, 3D shape recovery, content-based image retrieval (CBIR), and medical imaging.

About the Author—Tae-Sun Choi received the B.S. degree in electrical engineering from the Seoul National University, Seoul, Korea, in 1976, the M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology, Seoul, Korea, in 1979, and the Ph.D. degree in electrical engineering from the State University of New York at Stony Brook, in 1993.

He is currently a Professor in the Department of Mechatronics at Gwangju Institute of Science and Technology, Gwangju, Korea. His research interests include image processing, machine/robot vision, and visual communications.