Pattern Recognition

Volume 46, Issue 3, March 2013, Pages 1002-1011

Multi-focus image fusion based on the neighbor distance

https://doi.org/10.1016/j.patcog.2012.09.012

Abstract

The effective measurement of a pixel's sharpness is a key factor in multi-focus image fusion. In this paper, a gray image is considered as a two-dimensional surface, and the neighbor distance, deduced from the oriented distance in differential geometry, is used as a measure of a pixel's sharpness, where the smooth image surface is restored by kernel regression. Based on the deduced neighbor distance filter, we construct a multi-scale image analysis framework and propose a multi-focus image fusion method based on the neighbor distance. The experiments demonstrate that the proposed method is superior to conventional image fusion methods in terms of several objective evaluation indexes, such as spatial frequency, standard deviation, and average gradient.

Highlights

► We propose a multi-focus image fusion method based on the neighbor distance.
► The neighbor distance can effectively measure the sharpness of image pixels with different focus settings.
► The neighbor distance is deduced from the oriented distance in differential geometry.
► The smooth image surface is fitted by kernel regression.

Introduction

Due to the limited depth-of-focus of optical lenses in imaging cameras, it is impossible to capture an image in which all contained objects appear sharp. Only the objects within the depth of field are sharp, while other objects are blurred. A popular way to obtain an image with every object in focus is image fusion, in which one acquires a series of pictures with different focus settings and fuses them to create a new, improved image that describes the scene better than any individual source image. Up to now, image fusion has been successfully applied in many fields, such as military affairs, medical imaging, remote sensing, and digital cameras [1].

As we know, the objective of multi-focus image fusion is to produce an image that contains all relevant objects in focus by extracting and synthesizing the focused objects of the source images. The basic assumption of multi-focus image fusion is that a focused object is sharper than an unfocused one, and that sharpness is linked to some easily computed information measures. During the last decade, a number of sharpness measures for multi-focus image fusion have been proposed. Basically, these measures fall into two categories. The first comprises the spatial domain-based measures, which estimate sharpness directly from intensity values. The other comprises the frequency domain-based measures, which assume that sharpness is indicated by the high frequency sub-band coefficients of the source image's multi-scale decomposition.

The commonly used spatial domain-based measures include [2]: variance, energy of image gradient (EOG), Tenenbaum's algorithm, energy of Laplacian (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Huang and Jing [2] assessed these measures against several objective standards; their experimental results show that SML and EOL provide better performance than the other sharpness measures. The common scheme of multi-focus image fusion methods with spatial domain-based sharpness measures is the block-based fusion scheme, which first divides the source images into blocks or regions, then computes each block's sharpness, and finally selects the sharper blocks from the source images by some selection method, such as a pulse-coupled neural network [3], an artificial neural network [4], a genetic algorithm [5], or a support vector machine [6], or simply copies the sharper blocks into the fused image [2]. In these methods, the effectiveness of the sharpness measure strongly depends on the block size and the segmentation algorithm. If a divided block is partly clear and partly blurry, its sharpness is imprecise and block effects may arise, because the blurry part may be selected into the fused image for the sake of the segmented part's integrity. Such block effects significantly compromise the quality of the fused image.

With a multi-resolution transform, an image can be decomposed into low frequency and high frequency sub-band coefficients. By construction, the high frequency sub-band coefficients indirectly represent the gray value variation between a pixel and its neighbors, and hence can also indicate a pixel's sharpness. The basic scheme of multi-focus image fusion methods based on a multi-resolution transform is to perform a multi-resolution decomposition on each source image, integrate the decompositions into a composite representation, and finally reconstruct the fused image by performing the inverse multi-resolution transform. Relative to block-based fusion methods, multi-resolution transform-based fusion methods successfully overcome the block effects mentioned above, because high frequency coefficients, rather than pixels or blocks in the spatial domain, are selected to compose the fused image. The commonly used multi-scale decomposition transforms for image fusion include the Laplacian pyramid (LAP) [7], the filter-subtract-decimate hierarchical pyramid (FSD) [8], the gradient pyramid (GRP) [9], and the ratio of low-pass pyramid (RAP) [10]. Owing to advantages over the pyramid transforms, such as localization and directionality, discrete wavelet transform (DWT)-based fusion methods [11] are generally superior to pyramid-based fusion methods. However, because of the underlying down-sampling process, the DWT is shift-variant; consequently, DWT-based fusion methods are shift-dependent, which is undesirable since different fusion results are obtained once the input images are mis-registered [12]. To overcome this disadvantage of the DWT, Hill et al. [13] introduced the shift-invariant and directionally selective dual-tree complex wavelet transform to image fusion, and Zhang et al. [12] proposed a fusion algorithm based on the shiftable complex directional pyramid transform. Besides the above spatial and frequency domain-based clarity measures, Sheng et al. [14] used the support value of a support vector machine to measure the clarity of multi-focus images, and proposed a support value transform (SVT)-based image fusion method.
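To make the spatial domain-based category concrete, the following Python sketch implements two of the measures named above, SML and SF, following their textbook definitions; the step of the modified Laplacian is an illustrative choice, not a setting taken from [2].

    import numpy as np

    def sum_modified_laplacian(block, step=1):
        # SML: sum over the block of the modified Laplacian
        # |2f(x,y) - f(x-s,y) - f(x+s,y)| + |2f(x,y) - f(x,y-s) - f(x,y+s)|.
        f = block.astype(np.float64)
        s = step
        ml = (np.abs(2*f[s:-s, s:-s] - f[:-2*s, s:-s] - f[2*s:, s:-s])
              + np.abs(2*f[s:-s, s:-s] - f[s:-s, :-2*s] - f[s:-s, 2*s:]))
        return ml.sum()

    def spatial_frequency(block):
        # SF: root of the mean squared horizontal and vertical differences.
        f = block.astype(np.float64)
        rf = np.mean((f[:, 1:] - f[:, :-1])**2)  # row (horizontal) term
        cf = np.mean((f[1:, :] - f[:-1, :])**2)  # column (vertical) term
        return np.sqrt(rf + cf)

In a block-based scheme, either score would be computed per block and the block with the larger score copied into the fused image.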

With the two spatial coordinates forming two of three dimensions and the gray value forming the third, a gray image can be seen as a two-dimensional surface, called a gray image surface or image surface, in three-dimensional space. Fig. 1 shows two common test images with different focus settings, where image (a) focuses on the big clock and image (b) focuses on the small clock. The image surfaces in (a) and (b) correspond to the same block belonging to the big clock. As shown in Fig. 1, when the big clock is focused, the selected block is clearer and its corresponding surface is sharper; when the big clock is unfocused, the selected block is blurrier and its corresponding surface is flatter. Hence, the degree of surface curvature is related to the clarity of an image block.

In differential geometry, the surface curvatures (Gaussian curvature and mean curvature [15]) measure how much the surface curves at a particular point. Surface curvature has been used as a measure of image texture [16], but the high computational complexity and the large value range (from −∞ to +∞) of curvature restrict its other applications in image processing. In differential geometry, the oriented distance between a point and its neighbor, shown in Fig. 2, measures how the surface bends in a given direction (Δu,Δv) at a particular point (u,v). When (Δu,Δv) is given, the larger the oriented distance, the more the surface curves in the direction (Δu,Δv) at (u,v). Therefore, it is rational to use the oriented distance to measure the image surface curvature at a point, and consequently the oriented distance is suitable for measuring a pixel's clarity.
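Although the formal definition is deferred to Section 2, the oriented distance of Fig. 2 admits a standard reading in differential geometry: for a smooth surface r(u,v) with unit normal n, the signed distance from the neighboring surface point to the tangent plane at (u,v) satisfies, to second order,

    d(Δu,Δv) = ⟨r(u+Δu, v+Δv) − r(u,v), n⟩ ≈ (1/2)(L Δu² + 2M ΔuΔv + N Δv²),

where L, M and N are the coefficients of the second fundamental form; the first-order terms vanish because r_u and r_v lie in the tangent plane. This classical approximation is offered only as orientation, since the paper's exact definition is not reproduced in this snippet.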

In an image, a pixel surrounded by eight pixels has eight neighbor directions, so the corresponding point on the image surface has eight oriented distances. In this work, we call the sum of the eight oriented distances the neighbor distance, and use it to measure the clarity of image pixels with different focus settings. We can thus extract the focused objects of the source images and integrate them into the fused image. The rest of this work is organized as follows. In Section 2, we introduce the definitions of the oriented distance and the neighbor distance; the neighbor distance filter is also deduced in that section. In Section 3, we propose a neighbor distance transform and describe a multi-focus image fusion method based on it. The experimental results are given in Section 4. Finally, conclusions are drawn in Section 5.
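As a rough illustration of this construction, the sketch below computes a per-pixel neighbor distance map under the assumption that each oriented distance is the absolute deviation of a neighbor's gray value from the local tangent plane; the paper instead derives its filter from a kernel-regression fit of the smooth surface (Section 2), so the actual coefficients will differ.

    import numpy as np

    def neighbor_distance_map(img):
        # At each pixel, sum over the 8 neighbors the absolute deviation of
        # the neighbor's gray value from the first-order (tangent plane)
        # prediction f(u,v) + fu*du + fv*dv. Assumption-level sketch only;
        # borders wrap here, where a real implementation would pad.
        f = img.astype(np.float64)
        fu, fv = np.gradient(f)            # finite-difference surface slopes
        nd = np.zeros_like(f)
        for du in (-1, 0, 1):
            for dv in (-1, 0, 1):
                if du == 0 and dv == 0:
                    continue
                neighbor = np.roll(np.roll(f, -du, axis=0), -dv, axis=1)
                nd += np.abs(neighbor - (f + fu*du + fv*dv))
        return nd

Larger values of this map correspond to more sharply curved, and hence presumably better focused, regions of the image surface.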

Section snippets

Neighbor distance and neighbor distance filter

This section is organized as follows. In the first subsection, we introduce the oriented distance and the neighbor distance defined on a smooth surface. Because the oriented distance is a property of a smooth surface and the digital gray image is sampled from a smooth image surface, we restore the smooth image surface by non-parametric regression in the second subsection. The neighbor distance filter is also deduced in this section.
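A minimal sketch of the surface restoration step, assuming zeroth-order (Nadaraya-Watson) kernel regression with a Gaussian kernel; the paper's choice of regression order, kernel and bandwidth is not visible in this snippet.

    import numpy as np

    def nw_kernel_regression(img, bandwidth=1.0, radius=2):
        # Nadaraya-Watson estimate: each pixel is replaced by a weighted
        # average of its neighbors, with Gaussian weights over spatial
        # distance. On a regular pixel grid this reduces to normalized
        # Gaussian filtering.
        f = img.astype(np.float64)
        pad = np.pad(f, radius, mode='reflect')
        num = np.zeros_like(f)
        den = 0.0
        h, w = f.shape
        for du in range(-radius, radius + 1):
            for dv in range(-radius, radius + 1):
                wgt = np.exp(-(du*du + dv*dv) / (2.0 * bandwidth**2))
                num += wgt * pad[radius+du : radius+du+h,
                                 radius+dv : radius+dv+w]
                den += wgt
        return num / den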

Multi-focus image fusion based on neighbor distance filter

Similar to the variance, EOG, SML, etc., the neighbor distance is constructed in the spatial domain, so we could choose the block-based fusion scheme as our fusion scheme. However, block-based fusion methods suffer from block effects, their performance strongly depends on the block size, and choosing a suitable block size for fusion is not a trivial question. Hence, we choose the multi-resolution transform-based fusion scheme. By the derivation of the neighbor distance, the normalized…
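The snippet above is truncated, but the scheme it motivates can be sketched generically: decompose each source into smooth approximations and detail layers, select detail coefficients by magnitude, and rebuild. The hedged Python sketch below reuses the kernel-regression smoother from the previous sketch; the paper's actual neighbor distance transform and selection rule may differ in both decomposition and rule.

    import numpy as np
    # reuses nw_kernel_regression from the sketch above

    def fuse_multiscale(a, b, levels=3):
        # Generic multi-scale choose-max fusion: split each source into a
        # smooth approximation plus a detail layer per level, keep the
        # detail coefficient with the larger magnitude, average the
        # coarsest approximations, then rebuild by summation.
        a = a.astype(np.float64)
        b = b.astype(np.float64)
        details = []
        for _ in range(levels):
            sa, sb = nw_kernel_regression(a), nw_kernel_regression(b)
            da, db = a - sa, b - sb
            details.append(np.where(np.abs(da) >= np.abs(db), da, db))
            a, b = sa, sb                 # recurse on the approximations
        fused = (a + b) / 2.0             # coarsest layer: simple average
        for d in reversed(details):
            fused += d
        return fused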

Experimental results

In this section, we test the proposed image fusion method based on the neighbor distance filter on the popular multi-focus test images (book, clock, desk, lab and pepsi) shown in Fig. 5. Rockinger's Matlab toolbox is used as the reference implementation of the LAP, FSD, GRP and RAP methods. The non-subsampled contourlet toolbox is used as the reference for the non-subsampled contourlet transform…
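The abstract names spatial frequency, standard deviation and average gradient as evaluation indexes; since the paper's exact normalizations are not shown in this snippet, the following sketch follows the common textbook definitions.

    import numpy as np

    def evaluation_indexes(fused):
        # No-reference fusion metrics: larger values indicate more detail.
        f = fused.astype(np.float64)
        std = f.std()                                 # standard deviation
        rf = np.mean((f[:, 1:] - f[:, :-1])**2)
        cf = np.mean((f[1:, :] - f[:-1, :])**2)
        sf = np.sqrt(rf + cf)                         # spatial frequency
        gx = f[1:, 1:] - f[:-1, 1:]                   # vertical differences
        gy = f[1:, 1:] - f[1:, :-1]                   # horizontal differences
        avg_grad = np.mean(np.sqrt((gx**2 + gy**2) / 2.0))  # average gradient
        return {'std': std, 'sf': sf, 'avg_grad': avg_grad}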

Conclusion

This work presents a new multi-focus image fusion method based on the neighbor distance. The neighbor distance is deduced from the oriented distance in differential geometry. Based on the experiments presented in Section 4, we conclude that the neighbor distance can effectively measure the clarity of pixels in multi-focus images, and that the proposed image fusion method can extract the clearer pixels and integrate them into the resulting image.

Acknowledgments

We thank the associate editor and the anonymous reviewers for their helpful and constructive suggestions. This work is supported by the Fundamental Research Funds for the Central Universities (CDJXS11182240) and the National Natural Science Foundation of China (61173129, 61173130, 61273244).


References (27)

  • J. Lewis et al., Pixel- and region-based image fusion with complex wavelets, Information Fusion (2007).
  • J. Kong et al., Multi-focus image fusion using spatial frequency and genetic algorithm, International Journal of Computer Science and Network Security (2008).
  • S. Li et al., Fusing images with different focuses using support vector machines, IEEE Transactions on Neural Networks (2004).

Hengjun Zhao received his B.Sc. and M.Sc. degrees in Applied Mathematics from Southwest University in 2004 and 2007, respectively. He has been a doctoral student in the Department of Computer Science at Chongqing University since 2009. His research interests include pattern recognition and image processing.

    Zhaowei Shang received his Ph.D. degree in Computer Science from Xi’an Jiaotong University, China. He is presently a professor in the Department of Computer Science at Chongqing University. His research interests include wavelet theory and image processing.

Yuanyan Tang received the Ph.D. degree in Computer Science from Concordia University, Montreal, Canada. He is presently a professor in the Department of Computer Science at Chongqing University and a chair professor in the Faculty of Science and Technology, University of Macau. His current interests include wavelet theory and applications, pattern recognition, image processing, and document processing. He is the Founder and Editor-in-Chief of the International Journal on Wavelets, Multiresolution, and Information Processing (IJWMIP). Professor Y.Y. Tang is an IAPR fellow and an IEEE fellow.

Bin Fang received the B.E. degree in Electrical Engineering from Xi'an Jiaotong University, Xi'an, China, the M.S. degree in Electrical Engineering from Sichuan University, Chengdu, China, and the Ph.D. degree in Electrical Engineering from the University of Hong Kong, Hong Kong, China. He is currently a professor in the Department of Computer Science at Chongqing University. His research interests include computer vision, pattern recognition, medical image processing, biometrics applications, and document analysis.
