Global salient information maximization for saliency detection

doi:10.1016/j.image.2011.10.004

Signal Processing: Image Communication

Volume 27, Issue 3, March 2012, Pages 238-248

https://doi.org/10.1016/j.image.2011.10.004

Abstract

In this paper, a new method for saliency detection is proposed. Based on a defined set of features of the salient object, we address the problem of saliency detection from three aspects. Firstly, from the viewpoint of global information, we partition the image into two clusters, namely a salient component and a background component, by employing Principal Component Analysis (PCA) and k-means clustering. Secondly, the maximal salient information is used to find the position of the saliency and eliminate the noise. Thirdly, we enhance the saliency of the salient regions while weakening the background regions. Finally, the saliency map is obtained from these three aspects. Experimental results show that the proposed method achieves better results than the state-of-the-art methods. The method can also be applied to graph-based salient object segmentation.

Highlights

► Saliency detection is based on our defined features of the salient object.
► The image is partitioned into two components by PCA and k-means clustering.
► Maximal salient information extraction and salient information enhancement are applied.
► The proposed method achieves better performance than the compared methods.
► The proposed method can be applied to graph-based image segmentation.

Introduction

Humans and other animals cannot pay attention to more than one or a very few items simultaneously. Selective attention facilitates learning and survival by enabling organisms to focus their limited perceptual and cognitive resources on the most pertinent subset of the available sensory data. This subset can be called the saliency of the sensory data, that is, the state or quality by which an item (such as an object, a person, or a pixel) stands out relative to its neighbors [1]. Usually, the saliency represents the most valuable information within a large amount of sensory data in the fields of computer vision, design, graphics, and human-computer interaction. In computer vision, substantial progress [2], [3] has been made in the psychophysics of visual attention, and many computational models [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24] of visual attention have been proposed over the years. Visual attention is useful for applications such as complex scene understanding [4], [6], [7], [8], [9], [10], [11], object detection [20], tracking [17], [19], retargeting [21], and recognition [24]. Salient object detection is one of the most popular applications of visual attention.

Most models of visual attention for saliency detection are biologically inspired and based on a bottom-up computational model [25]. They are usually used to detect the approximate position of the salient region or object, and they can be classified into four categories.

The first category comprises methods with a graph representation. In an early report [26], random walks on the image lattice are employed to compute visual saliency. Harel et al. [18] extend this method by proposing a better dissimilarity measure to model the transition probability between two vertices. Wang et al. [27] utilize a fully connected graph structure to simulate cortical neuron connections, yielding a new visual saliency measure called Site Entropy Rate.
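The random-walk idea can be illustrated with a small sketch. It is not the formulation of [26] or [18]: the edge weight (feature dissimilarity damped by a Gaussian of spatial distance), the one-dimensional node positions, and the function name `random_walk_saliency` are all assumptions made for illustration. Saliency is read off as the stationary distribution of the walk, which concentrates on nodes that differ from their surroundings.

```python
import math

def random_walk_saliency(features, positions, sigma=2.0, iters=200):
    """Stationary distribution of a lazy random walk on a fully connected graph."""
    n = len(features)
    # Assumed edge weight: feature dissimilarity, damped by spatial distance.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = (positions[i] - positions[j]) ** 2
                w[i][j] = abs(features[i] - features[j]) * math.exp(-d2 / (2 * sigma ** 2))
    # Row-normalize into transition probabilities; isolated nodes stay in place.
    p = []
    for i in range(n):
        s = sum(w[i])
        if s > 0:
            p.append([w[i][j] / s for j in range(n)])
        else:
            p.append([1.0 if j == i else 0.0 for j in range(n)])
    # Lazy walk (stay with probability 0.5) avoids oscillation on bipartite graphs
    # and has the same stationary distribution as the original walk.
    pi = [1.0 / n] * n
    for _ in range(iters):
        stepped = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        pi = [0.5 * a + 0.5 * b for a, b in zip(pi, stepped)]
    return pi

# One node differs from four similar neighbors on a line.
saliency = random_walk_saliency([0.1, 0.1, 0.9, 0.1, 0.1], [0, 1, 2, 3, 4])
```

In this toy example the walk spends half of its time on the odd node, so that node receives the highest saliency value.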

The second category comprises methods based on information maximization. In these methods, information is considered the driving force behind attentive sampling, and visual saliency is measured by the rarity of features. Bruce et al. [17] employ the self-information of sparse features to measure saliency. Mancas et al. [22] describe a rarity-based visual attention model, grounded in the theory of self-information, that approximates human perception by visualizing its gradual discovery of the visual environment. Hou et al. [19] introduce the incremental coding length (ICL) to allocate different amounts of energy to features according to their rarity, under the assumption that salient features offer entropy gain. However, the salient information obtained is usually local, which limits how accurately salient regions can be detected. Moreover, the resulting saliency map typically indicates only the approximate position of the salient regions or objects, while a lot of salient information is ignored.
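The rarity principle behind this category can be shown with a minimal sketch of self-information, -log2 p(x), computed from a global intensity histogram. This is only an illustration under simplifying assumptions: the methods cited above work on sparse or learned features rather than raw gray levels, and the function name `self_information_map` and the bin count are hypothetical.

```python
import math
from collections import Counter

def self_information_map(pixels, bins=8, max_value=255):
    """Per-pixel self-information -log2 p(bin) from a global intensity histogram."""
    width = (max_value + 1) / bins
    # Quantize each intensity into one of `bins` equally wide bins.
    labels = [min(int(v / width), bins - 1) for v in pixels]
    counts = Counter(labels)
    total = len(labels)
    # Rare bins have low probability and therefore high self-information.
    return [-math.log2(counts[b] / total) for b in labels]

# Seven dark pixels and one bright pixel: the bright one is the rare event.
info = self_information_map([10] * 7 + [250])
```

Here the single bright pixel has probability 1/8 and receives 3 bits, while each dark pixel receives about 0.19 bits.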

The third category comprises methods based on the center-surround mechanism. These methods model the center-surround mechanism of primary visual cortical cells. A biologically plausible visual saliency model [20] has been proposed, which implies that visual perception relies on a linear measure of similarity in intensity, color, and orientation. However, Gao et al. [28], [29] find that this conflicts with the well-known properties of high-level human judgments of similarity. Therefore, Gao et al. propose a discriminant center-surround hypothesis using mutual information, which can provide optimal solutions for many other saliency problems in computer vision.

The fourth category comprises methods based on machine learning. Liu et al. [30] combine pixel-based saliency measurements in a CRF and derive a binary segmentation separating the object from the background. Alexe et al. [31] design an objectness measure under a Bayesian framework and explicitly train it to distinguish windows containing an object from background windows.

Recently, a new type of method [21] has been proposed for context-aware saliency detection. This method aims at detecting the image regions which represent the scene. It differs from previous definitions, whose goal is either to identify fixation points or to detect the dominant object. Cheng et al. [13] propose a histogram-based contrast (HC) method to measure saliency. HC maps assign pixel-wise saliency values based simply on color separation from all other image pixels, producing full-resolution saliency maps. They also incorporate spatial relations to produce region-based contrast (RC) maps that improve on the HC maps.
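The histogram-based contrast idea can be sketched as follows: each quantized value is scored by its frequency-weighted distance to every other value in the image. The sketch below is a simplification that uses quantized gray levels and absolute differences instead of colors in a color space, so it should be read as an illustration of the principle rather than as the HC method of [13]; the function name and the number of levels are assumptions.

```python
from collections import Counter

def histogram_contrast(pixels, levels=16, max_value=255):
    """Saliency of each pixel as its level's frequency-weighted distance to all levels."""
    width = (max_value + 1) / levels
    q = [min(int(v / width), levels - 1) for v in pixels]
    # Relative frequency of each quantized level.
    freq = {c: n / len(q) for c, n in Counter(q).items()}
    # Contrast of a level: sum over all levels of frequency times distance.
    sal = {c: sum(f * abs(c - other) for other, f in freq.items()) for c in freq}
    peak = max(sal.values())
    # Normalize to [0, 1]; a uniform image has no contrast at all.
    return [sal[c] / peak if peak > 0 else 0.0 for c in q]

# Nine black pixels and one white pixel.
contrast = histogram_contrast([0] * 9 + [255])
```

Because every pixel with the same quantized value shares one score, the cost depends on the number of histogram levels rather than on the number of pixel pairs.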

From the above analysis, we can see that most of the methods focus on obtaining a saliency map that covers only parts of the salient region or object. The context-aware method [21] can obtain a saliency map that describes the whole salient region or object. However, it also introduces a lot of false alarms. Therefore, the challenge of saliency detection is to acquire the salient region or object as accurately as possible, not only the approximate position of the saliency. Solving this problem is one of the most important goals for scene understanding, which involves interpreting the whole image by recognizing all the objects of interest within an image and their spatial extent or shape.

In this paper, we propose a novel method for saliency detection. Firstly, a definition of the features of salient objects is introduced. Secondly, we present the method of global salient information maximization, which obtains the salient information from three aspects: (1) detect the global salient component by a PCA-based method; (2) extract the maximal salient information to eliminate the noise of saliency detection; (3) enhance the salient information. Finally, the optimal saliency map is obtained.

This paper is organized into six sections. The next section introduces the features of the salient object used in our paper. Section 3 describes the proposed method for finding salient objects. Section 4 provides experimental results of the proposed approach, together with qualitative and quantitative comparisons against 11 previous saliency detection methods. In Section 5, we apply the proposed saliency detection method to graph-based image segmentation. Finally, Section 6 draws the conclusion of this paper.

Section snippets

The feature of salient object

The goal of our method is to detect salient information. Previous works focused chiefly on finding the approximate position of the saliency. Because neither the specific position nor the size of the salient object is known after saliency detection, it is a challenge for these methods to achieve accurate saliency detection. There are different applications [4], [32] for the saliency detection methods, and there are many

The proposed saliency detection method

In this section, we propose an algorithm to extract the salient region or object with features (a)–(c). The PCA-based method, which extracts the principal component (i.e., the background) and the minor component (the salient objects) of the image, can extract the salient object with features (a) and (b). The salient object with feature (c) can be refined by the information maximization based method which can be used to eliminate the noise of saliency
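Since this snippet is truncated, the sketch below is not the authors' algorithm. It only illustrates, under stated assumptions, how PCA followed by two-cluster k-means can split per-pixel feature vectors into a background group and a salient group. The assumptions are: features are given as rows of a matrix, k-means is seeded with the two extreme points along the first principal axis, and the smaller cluster is taken to be the salient one. The function name `pca_kmeans_partition` is hypothetical.

```python
import numpy as np

def pca_kmeans_partition(features, n_components=2, iters=20):
    """Split feature rows into two clusters after PCA projection.

    Returns a boolean mask that is True for the smaller cluster,
    which this sketch assumes to be the salient component.
    """
    x = np.asarray(features, dtype=float)
    x = x - x.mean(axis=0)
    # PCA via SVD: rows of vt are principal directions, sorted by variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    z = x @ vt[:n_components].T
    # Deterministic seeding: the two extreme points along the first axis.
    centers = np.stack([z[np.argmin(z[:, 0])], z[np.argmax(z[:, 0])]])
    for _ in range(iters):
        # Assign each point to its nearest center, then update the centers.
        dist = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = z[labels == k].mean(axis=0)
    counts = np.bincount(labels, minlength=2)
    return labels == counts.argmin()

# Twenty gray background pixels and four reddish pixels, with small noise.
rng = np.random.default_rng(0)
background = 0.2 + 0.01 * rng.standard_normal((20, 3))
salient = np.array([0.9, 0.1, 0.1]) + 0.01 * rng.standard_normal((4, 3))
mask = pca_kmeans_partition(np.vstack([background, salient]))
```

Treating the smaller cluster as salient is a heuristic that fails when the object fills most of the image; the paper's own criteria for features (a)–(c) are not reproduced here.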

Experimental results

In this section, we evaluate the performance of the proposed method by comparing it qualitatively with 11 state-of-the-art methods in three cases. A quantitative evaluation is also obtained by comparing precision and recall curves on the database.
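A precision and recall curve of this kind is commonly obtained by thresholding the saliency map at a range of values and comparing each binary result with a ground-truth mask. The sketch below shows that computation on flattened lists; the threshold set, the convention of reporting precision 1.0 when nothing is detected, and the function name are assumptions, and the paper's exact protocol is not given in this snippet.

```python
def precision_recall_curve(saliency, ground_truth, thresholds):
    """Precision and recall of the thresholded saliency map at each threshold."""
    curve = []
    positives = sum(ground_truth)
    for t in thresholds:
        detected = [s >= t for s in saliency]
        # True positives: detected pixels that are salient in the ground truth.
        tp = sum(1 for d, g in zip(detected, ground_truth) if d and g)
        n_detected = sum(detected)
        # Assumed convention: precision is 1.0 when nothing is detected.
        precision = tp / n_detected if n_detected else 1.0
        recall = tp / positives if positives else 0.0
        curve.append((t, precision, recall))
    return curve

# Four pixels, the first two of which are salient in the ground truth.
curve = precision_recall_curve([0.9, 0.8, 0.4, 0.1], [1, 1, 0, 0], [0.0, 0.5, 1.0])
```

Lowering the threshold raises recall at the cost of precision, which is why the full curve, rather than a single operating point, is used to compare methods.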

Graph based salient object segmentation

As mentioned in Section 3, the proposed GSIM method can produce good segmentation results for most of the images in the testing database. However, due to the linear property of PCA, there are some false and missed detections. Therefore, we propose a graph-based salient object segmentation method (GGSIM) to improve the segmentation results of GSIM.

There are many traditional methods of saliency detection which are used in unsupervised object segmentation. For example, Ma and

Conclusion

In this paper, we propose a novel method for saliency detection which is based on the features of the salient object. The method extracts the global salient information with a PCA-based method. Two further steps, maximal salient information extraction and salient information enhancement, are used to eliminate the noise and enhance the salient information, respectively. Graph-based salient object segmentation is also proposed to extract the salient object. The experimental

Acknowledgement

This work was partially supported by NSFC (Nos. 60972109, 61101091 and 61173121), the Program for New Century Excellent Talents in University (NCET-08-0090), the Fundamental Research Funds for the Central Universities (No. E022050205), and Sichuan Province Science Foundation for Youths (No. 2010JQ0003).

References (41)

H. Li et al., Saliency model based face segmentation in head-and-shoulder video sequences, Journal of Visual Communication and Image Representation, Elsevier Science (2008)
L. Itti et al., A saliency-based search mechanism for overt and covert shifts of visual attention, Vision Research (2000)
A. Bell et al., The independent components of natural scenes are edge filters, Vision Research (1997)
A. Berengolts et al., On the distribution of saliency, IEEE Transactions on Pattern Analysis and Machine Intelligence (2006)
A. Treisman et al., A feature-integration theory of attention, Cognitive Psychology (1980)
A. Yarbus, Eye Movements and Vision, Plenum, NY,...
U. Rutishauser et al., Is bottom-up attention useful for object recognition?
N. Murray et al., Saliency estimation using a non-parametric low-level vision model
J. Freixenet et al., Yet another survey on image segmentation: region and boundary information integration
S. Avidan et al., Seam carving for content-aware image resizing, ACM Transactions on Graphics (2007)
C. Rother et al., GrabCut: interactive foreground extraction using iterated graph cuts, ACM Transactions on Graphics (2004)
L. Lubor et al., What, where & how many? Combining object detectors and CRFs
J. Han et al., Unsupervised extraction of visual attention objects in color images, IEEE Transactions on Circuits and Systems for Video Technology, USA (2006)
R. Achanta et al., Frequency-tuned salient region detection
M.-M. Cheng et al., Global contrast based salient region detection
Y. Zhai et al., Visual attention detection in video sequences using spatiotemporal cues
R. Achanta et al., Salient region detection and segmentation
X. Hou et al., Saliency detection: a spectral residual approach
N. Bruce et al., Saliency based on information maximization
J. Harel et al., Graph-based visual saliency

