Global salient information maximization for saliency detection

https://doi.org/10.1016/j.image.2011.10.004

Abstract

In this paper, a new method for saliency detection is proposed. Based on the defined features of the salient object, we solve the problem of saliency detection from three aspects. First, from the view of global information, we partition the image into two clusters, namely a salient component and a background component, by employing Principal Component Analysis (PCA) and k-means clustering. Second, the maximal salient information is used to locate the saliency and eliminate noise. Third, we enhance the saliency of the salient regions while weakening the background regions. Finally, the saliency map is obtained from these three aspects. Experimental results show that the proposed method achieves better results than the state-of-the-art methods. The method can also be applied to graph-based salient object segmentation.

Highlights

► Saliency detection based on our defined features of the salient object. ► Partition the image into two components by PCA and k-means clustering. ► The maximal salient information and salient information enhancement are applied. ► The proposed method can achieve better performance than the compared methods. ► This proposed method can be applied for graph based image segmentation.

Introduction

Humans and other animals cannot pay attention to more than one or a very few items simultaneously. Attention facilitates learning and survival by enabling organisms to focus their limited perceptual and cognitive resources on the most pertinent subset of the available sensory data. This subset can be called the saliency of the sensory data: the state or quality by which an item, such as an object, a person, or a pixel, stands out relative to its neighbors [1]. The saliency usually represents the most valuable information within a large amount of sensory data in the fields of computer vision, design, graphics, and human-computer interaction. In computer vision, substantial progress [2], [3] has been made in the psychophysics of visual attention, and many computational models [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24] of visual attention have been proposed over the years. Visual attention is useful for applications such as complex scene understanding [4], [6], [7], [8], [9], [10], [11], object detection [20], tracking [17], [19], retargeting [21], and recognition [24]. Salient object detection is one of the most popular applications of visual attention.

Most models of visual attention for saliency detection are biologically inspired and based on a bottom-up computational model [25]. They are usually used to detect the approximate position of the salient region or object, and they can be classified into four categories.

The first category consists of methods with a graph representation. In an early report [26], random walks are employed on the image lattice to compute the visual saliency. Harel et al. [18] extend this method by proposing a better dissimilarity measure to model the transition probability between two vertices. Wang et al. [27] utilize a fully connected graph structure to simulate the cortical neuron connections, and derive a new visual saliency measure called Site Entropy Rate.

The second category consists of methods based on information maximization. In these methods, information is considered the driving force behind attentive sampling, and the visual saliency is measured by the rarity of features. Bruce et al. [17] employ the self-information of sparse features to measure saliency. Mancas et al. [22] describe a rarity-based visual attention model, built on the theory of self-information, that approximates human perception by visualizing its gradual discovery of the visual environment. Hou et al. [19] introduce the incremental coding length (ICL) to allocate different amounts of energy to features according to their rarity, under the assumption that salient features can offer entropy gain. However, the salient information obtained in this way is usually local, which limits how accurately the salient regions can be detected. The resulting saliency map typically represents only the approximate position of the salient regions or objects, while a lot of salient information is ignored.
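The rarity idea behind this category can be illustrated with a minimal sketch. It is not the model of any of the cited works: it assumes a grayscale image with values in [0, 255], uses quantized intensity as the only feature, and scores each pixel by the self-information -log2 p(v) of its value. The function name `self_information_map` and the `bins` parameter are hypothetical.

```python
import numpy as np

def self_information_map(gray, bins=32):
    """Score each pixel by -log2 p(v), where v is its quantized intensity.

    Assumption: `gray` holds intensities in [0, 255]. Real models use much
    richer features (e.g. sparse codes) than raw intensity.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Quantize intensities into `bins` levels.
    idx = np.clip((gray / 256.0 * bins).astype(int), 0, bins - 1)
    counts = np.bincount(idx.ravel(), minlength=bins)
    prob = counts / counts.sum()
    # Every bin that is looked up occurs in the image, so prob > 0 there.
    return -np.log2(prob[idx])

# Toy image: a uniform background with a small bright patch.
image = np.full((8, 8), 20.0)
image[3:5, 3:5] = 200.0
info = self_information_map(image)
```

In the toy image the patch covers 4 of 64 pixels, so its pixels receive a higher score than the background. Because the probability is estimated from a histogram of individual values, the score says nothing about the spatial extent of an object, which is the limitation noted above.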

The third category consists of methods based on the center-surround mechanism. These methods model the center-surround mechanism of primary visual cortical cells. A biologically plausible visual saliency model [20] has been proposed, which implies that visual perception relies on a linear measure of similarity over intensity, color, and orientation. However, Gao et al. [28], [29] find that this conflicts with the well-known properties of high-level human judgments of similarity. Gao et al. therefore propose a discriminant center-surround hypothesis using mutual information, which can provide optimal solutions for many other saliency problems in computer vision.

The fourth category consists of methods based on machine learning. Liu et al. [30] combine pixel-based saliency measurements in a CRF and derive a binary segmentation separating the object from the background. Alexe et al. [31] design an objectness measure under a Bayesian framework and explicitly train it to distinguish windows containing an object from background windows.

Recently, a new type of method [21] has been proposed for context-aware saliency detection. This method aims at detecting the image regions which represent the scene. It differs from previous definitions, whose goal is either to identify fixation points or to detect the dominant object. Cheng et al. [13] propose a histogram-based contrast method (HC) to measure saliency. HC-maps assign pixel-wise saliency values based simply on color separation from all other image pixels, producing full-resolution saliency maps. They also incorporate spatial relations to produce region-based contrast (RC) maps that improve on the HC-maps.
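To make the HC idea concrete, the following is a rough sketch rather than the implementation of [13]. It assumes an 8-bit RGB image, quantizes each channel to a few levels, and scores each quantized color by its frequency-weighted distance to all other colors in the image. Measuring distance in quantized RGB, as well as the names `histogram_contrast` and `levels`, are simplifying assumptions made here for illustration.

```python
import numpy as np

def histogram_contrast(image, levels=8):
    """Color-histogram contrast saliency, normalized to [0, 1].

    Assumption: `image` is an H x W x 3 array with values in 0..255.
    """
    img = np.asarray(image, dtype=np.int64)
    h, w, _ = img.shape
    q = img * levels // 256  # quantize each channel to 0..levels-1
    codes = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
    uniq, inverse, counts = np.unique(
        codes.ravel(), return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)
    freq = counts / counts.sum()
    # Recover the quantized (r, g, b) triple of every distinct color.
    colors = np.stack(
        [uniq // (levels * levels), (uniq // levels) % levels, uniq % levels],
        axis=1,
    ).astype(np.float64)
    # Saliency of a color: sum over all colors of frequency times distance.
    dist = np.linalg.norm(colors[:, None, :] - colors[None, :, :], axis=2)
    sal = dist @ freq
    span = sal.max() - sal.min()
    sal = (sal - sal.min()) / span if span > 0 else np.zeros_like(sal)
    return sal[inverse].reshape(h, w)

# Toy image: blue background with a small red patch.
demo = np.zeros((10, 10, 3), dtype=np.uint8)
demo[..., 2] = 255
demo[4:6, 4:6] = (255, 0, 0)
hc_map = histogram_contrast(demo)
```

All pixels of the same quantized color receive the same score, which is why the contrast can be computed over the color histogram instead of over every pixel pair.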

From the above analysis, we can see that most of the methods produce a saliency map that covers only parts of the salient region or object. The context-aware method [21] can produce a saliency map that describes the whole salient region or object, but it also introduces a lot of false alarms. The challenge of saliency detection is therefore to acquire the salient region or object as accurately as possible, not only the approximate position of the saliency. Solving this problem is one of the most important goals of scene understanding, which involves interpreting the whole image by recognizing all the objects of interest within it together with their spatial extent or shape.

In this paper, we propose a novel method for saliency detection. First, a definition of the features of salient objects is introduced. Second, we present the method of global salient information maximization, which obtains the salient information from three aspects: (1) detecting the global salient component with a PCA-based method; (2) extracting the maximal salient information to eliminate the noise of saliency detection; and (3) enhancing the salient information. Finally, the optimal saliency map is obtained.
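Aspect (1) can be sketched as follows. The snippets available here do not give the paper's exact features or procedure, so this is only a generic illustration of combining PCA with two-cluster k-means: per-pixel color vectors are projected onto their principal components and split into two clusters. Treating the smaller cluster as the salient candidate, and all function names, are assumptions made for this sketch.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project row vectors onto their leading principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def two_means(points, n_iter=20):
    """Plain Lloyd iterations for k = 2 with a deterministic start."""
    first = points[:, 0]
    # Start from the two extremes along the first principal component.
    centers = np.stack([points[first.argmin()], points[first.argmax()]])
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels

def partition_image(image):
    """Return a boolean mask of the candidate salient component."""
    h, w, c = image.shape
    feats = image.reshape(-1, c).astype(np.float64)
    labels = two_means(pca_project(feats, n_components=min(2, c)))
    # Assumption: the smaller cluster is the salient candidate.
    salient = np.argmin(np.bincount(labels, minlength=2))
    return (labels == salient).reshape(h, w)

# Toy image: bluish background with a 3 x 3 reddish object.
toy = np.zeros((10, 10, 3))
toy[...] = (10, 10, 200)
toy[3:6, 3:6] = (220, 30, 30)
mask = partition_image(toy)
```

On the toy image the mask marks exactly the 3 x 3 object. On real images such a partition is noisy, which is presumably why the method follows it with aspects (2) and (3).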

This paper is organized into six sections. The next section introduces the features of the salient object used in our paper. Section 3 describes the proposed theory for finding the salient objects. Section 4 provides experimental results of the proposed approach, including qualitative and quantitative comparisons with 11 previous saliency detection methods. In Section 5, we apply the proposed saliency detection method to graph-based image segmentation. Finally, Section 6 draws the conclusion of this paper.

Section snippets

The feature of salient object

The goal of our method is to detect salient information. Previous works mainly focused on finding the approximate position of the saliency. Because neither the specific position nor the size of the salient object is known after such detection, it is a challenge for these methods to achieve accurate saliency detection. There are different applications [4], [32] for the saliency detection methods, and there are many

The proposed saliency detection method

In this section, we propose an algorithm that extracts the salient region or object with features (a)–(c). The PCA-based method, which is used to extract the principal component (i.e., background) and the minor component (salient objects) of the image, can extract the salient object with features (a) and (b). The salient object with feature (c) can be refined by the information maximization based method, which can be used to eliminate the noise of saliency

Experimental results

In this section, to evaluate the performance of the proposed method, we compare it qualitatively with 11 state-of-the-art methods in three cases. A quantitative evaluation is also obtained by comparing precision and recall curves on the database.
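A common way to obtain such curves, sketched here under assumptions, is to binarize each saliency map at every threshold from 0 to 255 and compare the result with a binary ground-truth mask. The snippet does not state the exact protocol used in the paper, so the threshold sweep and the name `precision_recall_curve` are illustrative only.

```python
import numpy as np

def precision_recall_curve(saliency, ground_truth, thresholds=range(256)):
    """Precision and recall of `saliency >= t` for each threshold t.

    Assumption: `saliency` is scaled to 0..255 and `ground_truth` is a
    binary mask of the salient object.
    """
    sal = np.asarray(saliency)
    gt = np.asarray(ground_truth, dtype=bool)
    precisions, recalls = [], []
    for t in thresholds:
        pred = sal >= t
        tp = np.logical_and(pred, gt).sum()
        # Convention: an empty prediction counts as precision 1.
        precisions.append(tp / pred.sum() if pred.sum() else 1.0)
        recalls.append(tp / gt.sum() if gt.sum() else 0.0)
    return np.array(precisions), np.array(recalls)

# Toy example: a correct 2 x 2 detection plus one weak false alarm.
gt_mask = np.zeros((6, 6), dtype=bool)
gt_mask[2:4, 2:4] = True
sal_map = np.zeros((6, 6))
sal_map[2:4, 2:4] = 255
sal_map[0, 0] = 100
precision, recall = precision_recall_curve(sal_map, gt_mask)
```

For a whole database, the per-image values at each threshold would typically be averaged before plotting precision against recall.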

Graph based salient object segmentation

As described in Section 3, the proposed GSIM method produces good segmentation results for most of the images in the testing database. However, due to the linear property of PCA, there are some false and missed detections. Therefore, we propose a graph-based salient object segmentation method (GGSIM) to improve the segmentation results of GSIM.

There are many traditional methods of saliency detection which are used in unsupervised object segmentation. For example, Ma and

Conclusion

In this paper, we propose a novel method for saliency detection which is based on the features of the salient object. This method extracts the global salient information with a PCA-based method. We then use two methods, maximal salient information extraction and salient information enhancement, to eliminate the noise and enhance the salient information, respectively. Graph-based salient object segmentation is also proposed to extract the salient object. The experimental

Acknowledgement

This work was partially supported by NSFC (Nos. 60972109, 61101091 and 61173121), the Program for New Century Excellent Talents in University (NCET-08-0090), the Fundamental Research Funds for the Central Universities (No. E022050205), and Sichuan Province Science Foundation for Youths (No. 2010JQ0003).

References (41)

  • C. Rother et al.

    GrabCut: interactive foreground extraction using iterated graph cuts

    ACM Transactions on Graphics

    (2004)
  • L. Ladický et al.

    What, where & how many? Combining object detectors and CRFs

  • J. Han et al.

    Unsupervised extraction of visual attention objects in color images

    IEEE Transactions on Circuits and Systems for Video Technology, USA

    (2006)
  • R. Achanta et al.

    Frequency-tuned salient region detection

  • M.-M. Cheng et al.

    Global contrast based salient region detection

  • Y. Zhai et al.

    Visual attention detection in video sequences using spatiotemporal cues

  • R. Achanta et al.

    Salient region detection and segmentation

  • X. Hou et al.

    Saliency detection: a spectral residual approach

  • N. Bruce et al.

    Saliency based on information maximization

  • J. Harel et al.

    Graph-based visual saliency
