Unsupervised multiscale segmentation of color images

https://doi.org/10.1016/j.patrec.2006.10.001

Abstract

This paper proposes a new multiresolution technique for color image representation and segmentation, particularly suited for noisy images. A decimated wavelet transform is initially applied to each color channel of the image, and a multiresolution representation is built up to a selected scale $2^J$. Color gradient magnitudes are computed at the coarsest scale $2^J$, and an adaptive threshold is used to remove spurious responses. An initial segmentation is then computed by applying the watershed transform to the thresholded magnitudes, and this initial segmentation is projected to finer resolutions using inverse wavelet transforms and contour refinements, until the full resolution $2^0$ is achieved. Finally, a region merging technique is applied to combine adjacent regions with similar colors. Experimental results show that the proposed technique produces results comparable to other state-of-the-art algorithms for natural images, and performs better for noisy images.

Introduction

Image segmentation consists of partitioning an image into isolated regions, such that each region shares common properties and represents a different object. Such a task is typically the first step in more advanced vision systems, in which object representation and recognition are needed. Although isolating different objects in a scene may be easy for humans, it is still surprisingly difficult for computers. An additional problem arises when dealing with color images, due to the variety of representations (color spaces) that can be used to characterize color similarity.

Several authors have tackled the problem of color image segmentation, using a variety of approaches, such as active contours, clustering, wavelets and watersheds, among others. Some of these techniques are reviewed next.

Sapiro (1997) proposed a framework for object segmentation in vector-valued images, called color snakes. As in the original snakes formulation for monochromatic images, color snakes produce smooth contours, but they require a manual initialization and may face convergence problems.

Comaniciu and Meer (2002) proposed a unified approach for color image denoising and segmentation based on the mean shift. A kernel in the joint spatial-range domain is used to filter image pixels in the CIELUV color space, and filtered pixels are clustered to obtain segmented objects. Although this technique presents good results, it requires a manual selection of spatial ($h_s$) and color ($h_r$) bandwidths, and optionally a minimum area parameter ($M$) for region merging.

Liapis et al. (2004) proposed a wavelet-based algorithm for image segmentation based on color and texture properties. A multichannel scale/orientation decomposition using wavelet frame analysis is performed for texture feature selection, and histograms in the CIELAB color space are used for color feature extraction. Two labelling algorithms are proposed to obtain the final segmentation results based on either or both features. This technique also achieves good segmentation results for complex natural images, but the number of different color–texture classes must be selected by the user, which may not be easy to define in practical applications. Ma and Manjunath (2000) and Deng and Manjunath (2001) also proposed approaches for image segmentation based on color and texture. In (Ma and Manjunath, 2000), a predictive coding model was created to identify the direction of change in color and texture at each image location at a given scale, and object boundaries are detected where propagated “edge flows” meet. In JSEG (Deng and Manjunath, 2001), a color quantization scheme is initially applied to simplify the image. Then, local windows are used to compute J-images, which return high values near object boundaries and low values in their interior. Finally, a multiscale region growing procedure is applied to obtain the final segmentation. It is important to note that JSEG is intended to be an unsupervised segmentation method, meaning that it is free of user-defined parameters.

Nock and Nielsen (2003, 2004, 2005) proposed fast segmentation techniques based on statistical properties of color images. These approaches take into account expected homogeneity and separability properties of image objects to obtain the final segmentation through region merging. In particular, the techniques described in (Nock and Nielsen, 2003, 2004) are unsupervised and well suited for noisy images, while the method presented in (Nock and Nielsen, 2005) requires some user assistance.

Other authors have combined watershed segmentation with multiresolution image representations. Scheunders and Sijbers (2002) used a non-decimated wavelet transform to build a multiscale color edge map, which was filtered by a multivalued anisotropic diffusion. The watershed transform is then applied at each scale, and a hierarchical region merging procedure is applied to connect segmented regions at different scales. Despite the denoising power of both wavelet transform and anisotropic diffusion, the experimental results presented in their paper indicate a considerably large number of segmented regions. Vanhamel et al. (2003) also explored multiscale image representations and watersheds for color image segmentation. In their approach, the scale-space is based on a vector-valued diffusion scheme, and color gradients in the YUV color space are computed at each scale. After applying the watershed transform, the dynamics of contours in scale-space are used to validate detected contours. Results presented in the paper are visually pleasing, but noisy images were not tested. Kazanov (2004) proposed a multiscale watershed-based approach for detecting both small and large objects, focused on scanned pages of color magazines. For detecting small objects, a small-support edge detector is used. For larger objects, a multiscale version of the gradient estimator is computed. One potential problem of this technique is its high sensitivity to noise/texture, due to the use of small-support edge detectors.

Although this literature review was mostly focused on techniques involving wavelets, watersheds or multiresolution analysis, there are several other recent competitive approaches for color image segmentation (Makrogiamis et al., 2003; Chen et al., 2004; Nikolaev and Nikolayev, 2004; Marfil et al., 2004; Navon et al., 2005). Also, most authors have not considered the problem of noisy color image segmentation, especially for large amounts of noise contamination.

This work extends the procedure proposed in (Jung, 2003) to the multiscale segmentation of color images, with several improvements: the formulation of a statistical model for gradient magnitudes of color images using joint information of color channels, the automatic thresholding of color gradient magnitudes based on a posteriori probabilities, and the inclusion of a similarity metric in the CIELAB color space for region merging, based on perceived contrast between colors. Note that the approach in (Jung, 2003) can only be applied to monochromatic images.
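To make the idea of a CIELAB-based color similarity concrete, the following minimal sketch converts 8-bit RGB triplets to CIELAB and measures their Euclidean distance (the CIE76 color difference). It assumes sRGB input and a D65 white point, and the function names are hypothetical; the contrast-based metric actually used for region merging is not reproduced here.

```python
import math

# Reference white for the D65 illuminant (assumed; the source does not specify one).
WHITE_D65 = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triplet to CIELAB (L*, a*, b*)."""
    # Undo the sRGB gamma to obtain linear RGB in [0, 1].
    lin = []
    for value in rgb:
        c = value / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin

    # Linear RGB to CIE XYZ (standard sRGB/D65 matrix).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Cube root with the linear segment CIELAB uses near zero.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx = f(x / WHITE_D65[0])
    fy = f(y / WHITE_D65[1])
    fz = f(z / WHITE_D65[2])
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e(rgb1, rgb2):
    """Euclidean distance between two sRGB colors in CIELAB (CIE76)."""
    return math.dist(srgb_to_lab(rgb1), srgb_to_lab(rgb2))
```

Because CIELAB is approximately perceptually uniform, a single distance threshold in this space behaves more consistently across hues than the same threshold applied directly to RGB values.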

In the proposed approach, a decimated wavelet transform (WT) is initially applied to each color channel, producing a multiresolution image representation up to a selected scale $2^J$. A color gradient magnitude image is computed at the coarsest resolution $2^J$, and an adaptive threshold is used to remove spurious responses. The watershed transform is applied to the thresholded magnitudes, obtaining an initial segmentation. The inverse wavelet transform (IWT) is then used to project this initial segmentation to finer scales, until the full resolution image is achieved. Finally, a region merging procedure based on CIELAB color distances is applied to obtain the final segmentation. As discussed throughout this manuscript, the denoising power of the WT combined with the probabilistic edge estimator effectively reduces the well-known oversegmentation problem of the watershed transform, even for images with significant noise contamination. Also, the same set of default parameters produces good results for most images, indicating that the proposed technique can be used for unsupervised image segmentation.
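The final region merging stage can be illustrated with a small sketch. It assumes that each region is summarized by its mean CIELAB color and pixel count, that adjacency is given as pairs of integer labels, and that a plain Euclidean distance with a fixed threshold decides merges. The function name, the greedy closest-pair order and the threshold are illustrative assumptions, not the merging criterion of the paper.

```python
import math


def merge_regions(means, areas, adjacency, threshold):
    """Greedily merge adjacent regions with similar mean colors.

    means:     dict label -> (L, a, b) mean color of the region
    areas:     dict label -> number of pixels in the region
    adjacency: iterable of (label, label) pairs of neighbouring regions
    threshold: maximum color distance allowed for a merge (assumed parameter)
    Returns a dict mapping every original label to its final label.
    """
    means, areas = dict(means), dict(areas)
    neighbours = {label: set() for label in means}
    for u, v in adjacency:
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    parent = {label: label for label in means}

    while True:
        # Find the closest pair of adjacent regions below the threshold.
        best = None
        for u in neighbours:
            for v in neighbours[u]:
                if u < v:
                    d = math.dist(means[u], means[v])
                    if d < threshold and (best is None or d < best[0]):
                        best = (d, u, v)
        if best is None:
            break
        _, u, v = best

        # Merge v into u: area-weighted mean color, then rewire the neighbours.
        total = areas[u] + areas[v]
        means[u] = tuple((areas[u] * cu + areas[v] * cv) / total
                         for cu, cv in zip(means[u], means[v]))
        areas[u] = total
        for w in neighbours.pop(v):
            neighbours[w].discard(v)
            if w != u:
                neighbours[w].add(u)
                neighbours[u].add(w)
        del means[v], areas[v]
        parent[v] = u

    def find(label):
        # Follow merge links until the surviving region is reached.
        while parent[label] != label:
            label = parent[label]
        return label

    return {label: find(label) for label in parent}
```

Only adjacent regions are candidates, so two regions with nearly identical colors stay separate when another region lies between them.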

The remainder of this paper is organized as follows. Section 2 provides a brief review of wavelets and watersheds. The proposed method is described in Section 3, and experimental results are provided in Section 4. Finally, conclusions are drawn in the final section.

Section snippets

Decimated wavelet transforms

In a few words, the WT of an intensity image up to the scale $2^J$ is a set of detail subimages $W_{2^j}^h$, $W_{2^j}^v$, $W_{2^j}^d$, for $j = 1, \ldots, J$, and a smoothed and downsampled version $A_{2^J}$ of the original image, called in this work the approximation image. According to Mallat’s pyramid algorithm (Mallat, 1989), such subimages can be obtained by a combination of convolutions with low-pass and high-pass filters associated with a mother wavelet followed by downsamplings. It is important to notice that the original image
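As a concrete illustration of one level of a decimated transform, the sketch below uses the Haar wavelet, whose low-pass and high-pass filters reduce to sums and differences over 2×2 blocks. The choice of Haar, the function names, and the naming of the three detail subimages are assumptions made for illustration; the filters used in the paper are not specified in this excerpt.

```python
def haar_dwt2(image):
    """One level of a decimated 2-D Haar transform.

    image is a list of rows with even dimensions. Returns (A, H, V, D):
    the approximation and the horizontal, vertical and diagonal detail
    subimages, each with half the rows and half the columns of the input.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    A, H, V, D = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2.0)  # low-pass along rows and columns
            h_row.append((p + q - s - t) / 2.0)  # top minus bottom: horizontal edges
            v_row.append((p - q + s - t) / 2.0)  # left minus right: vertical edges
            d_row.append((p - q - s + t) / 2.0)  # diagonal differences
        A.append(a_row)
        H.append(h_row)
        V.append(v_row)
        D.append(d_row)
    return A, H, V, D


def haar_idwt2(A, H, V, D):
    """Invert haar_dwt2, recovering the image at twice the resolution."""
    image = []
    for a_row, h_row, v_row, d_row in zip(A, H, V, D):
        top, bottom = [], []
        for a, h, v, d in zip(a_row, h_row, v_row, d_row):
            top += [(a + h + v + d) / 2.0, (a + h - v - d) / 2.0]
            bottom += [(a - h + v - d) / 2.0, (a - h - v + d) / 2.0]
        image += [top, bottom]
    return image
```

Applying `haar_dwt2` again to the returned approximation `A` yields the next coarser level, so a representation up to scale $2^J$ takes $J$ applications.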

The proposed algorithm

The proposed algorithm can be summarized into the following steps:

  (1) select a desired scale $2^J$, and compute the WT of each color channel up to this scale;

  (2) compute and threshold the color gradient magnitude of the approximation image at the coarsest resolution;

  (3) apply the watershed transform to the thresholded magnitudes, and obtain an image representation at scale $2^J$;

  (4) project the image representation to the full resolution $2^0$, using the IWT;

  (5) apply a contrast-based region merging technique.
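Steps (1) and (2) can be sketched in a few lines under simplifying assumptions. The code below builds the coarse approximation of each channel by repeated 2×2 averaging (a rescaled Haar low-pass), combines per-channel central differences into a single color gradient magnitude, and suppresses weak responses with a fixed fraction of the peak magnitude. The gradient definition and especially the threshold are stand-ins chosen for illustration; they are not the statistical model or the a posteriori threshold of the proposed method, and steps (3) to (5) are not covered.

```python
import math


def approximate(channel, levels):
    """Coarse approximation by repeated 2x2 averaging and downsampling."""
    for _ in range(levels):
        channel = [
            [(channel[r][c] + channel[r][c + 1]
              + channel[r + 1][c] + channel[r + 1][c + 1]) / 4.0
             for c in range(0, len(channel[0]) - 1, 2)]
            for r in range(0, len(channel) - 1, 2)
        ]
    return channel


def color_gradient_magnitude(channels):
    """Joint gradient magnitude over all channels (assumed definition).

    Per-channel central differences, clamped at the borders, are combined
    as the square root of the summed squared derivatives.
    """
    rows, cols = len(channels[0]), len(channels[0][0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for ch in channels:
                gx = ch[r][min(c + 1, cols - 1)] - ch[r][max(c - 1, 0)]
                gy = ch[min(r + 1, rows - 1)][c] - ch[max(r - 1, 0)][c]
                total += gx * gx + gy * gy
            magnitude[r][c] = math.sqrt(total)
    return magnitude


def threshold_magnitudes(magnitude, fraction=0.2):
    """Zero out responses below a fraction of the peak (stand-in threshold)."""
    cut = fraction * max(max(row) for row in magnitude)
    return [[m if m >= cut else 0.0 for m in row] for row in magnitude]


def coarse_edges(channels, levels, fraction=0.2):
    """Steps (1) and (2): coarse approximation, gradient, thresholding."""
    coarse = [approximate(ch, levels) for ch in channels]
    return threshold_magnitudes(color_gradient_magnitude(coarse), fraction)
```

Computing the gradient on the approximation image rather than at full resolution means that each coarse pixel already averages $4^J$ original pixels, which is where the noise suppression in this stage comes from.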

These steps are

Experimental results

This section presents results of the proposed technique (called Waveseg) applied to several images containing both natural and artificial noise. We also compared Waveseg with three state-of-the-art segmentation methods, namely Mean-shift (Comaniciu and Meer, 2002), JSEG (Deng and Manjunath, 2001) and SRM (Nock and Nielsen, 2004). For Mean-shift, we manually selected the required parameters $h_s$, $h_r$ and $M$ to obtain good visual results, following the guidelines provided in (Comaniciu and Meer, 2002).

Discussion and conclusions

In this work, a multiresolution color segmentation algorithm was presented. The WT is applied up to a selected scale $2^J$, and color gradient magnitudes are computed at the coarsest resolution. An adaptive threshold based on statistical properties of gradient magnitudes is estimated, and watersheds are applied to obtain an initial segmentation at the coarsest resolution $2^J$. The initial segmentation is then projected to finer resolutions using the IWT, until the final segmentation at scale $2^0$ is

Acknowledgement

This work was developed in collaboration with HP Brazil R&D and the Brazilian research agency CNPq. The author would like to thank the anonymous reviewers for their valuable contributions.

References (34)

  • R.S. Berns. The science of digitizing paintings for color-accurate image archives: A review. J. Imaging Sci. Technol. (2001)
  • G. Casella et al. Statistical Inference. (1990)
  • Chen, J., Pappas, T., Mojsilovic, A., Rogowitz, B., 2004. Perceptually-tuned multiscale color–texture segmentation. In:...
  • H. Cheng et al. A hierarchical approach to color image segmentation using homogeneity. IEEE Trans. Image Process. (2000)
  • D. Comaniciu et al. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Machine Intell. (2002)
  • Y. Deng et al. Unsupervised segmentation of color–texture regions in images and video. IEEE Trans. Pattern Anal. Machine Intell. (2001)
  • C. Fuh et al. Hierarchical color image region segmentation for content-based image retrieval system. IEEE Trans. Image Process. (2000)