Global salient information maximization for saliency detection
Highlights
► Saliency detection based on our defined features of the salient object.
► Partition the image into two components by PCA and k-means clustering.
► The maximal salient information and salient information enhancement are applied.
► The proposed method can achieve better performance than the compared methods.
► This proposed method can be applied for graph based image segmentation.
Introduction
Humans and other animals cannot attend to more than one or a very few items simultaneously. Attention facilitates learning and survival by enabling organisms to focus their limited perceptual and cognitive resources on the most pertinent subset of the available sensory data. The state or quality by which this subset stands out relative to its neighbors is called saliency [1]; the salient item may be an object, a person, or a pixel. Usually, the salient part of the sensory data carries the most valuable information in a large amount of sensory data in the fields of computer vision, design, graphics, and human-computer interaction. In computer vision, substantial progress [2], [3] has been made in the psychophysics of visual attention, and many computational models [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24] of visual attention have been proposed in earlier years. Visual attention is useful for applications such as complex scene understanding [4], [6], [7], [8], [9], [10], [11], object detection [20], tracking [17], [19], retargeting [21], and recognition [24]. Salient object detection is one of the most popular applications of visual attention.
Most visual attention models for saliency detection are biologically inspired and based on a bottom-up computational model [25]. They are usually used to detect the approximate position of the salient region or object, and they can be classified into four categories.
The first category comprises methods with a graph representation. In an early report [26], random walks are employed on the image lattice to compute visual saliency. Harel et al. [18] extend this method by proposing a better dissimilarity measure to model the transition probability between two vertices. Wang et al. [27] use a fully connected graph structure to simulate cortical neuron connections and derive a new visual saliency measure called Site Entropy Rate.
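To make the graph-based idea concrete, here is a minimal Python sketch of a random walk whose transition probabilities are proportional to feature dissimilarity. It is a toy illustration only, not the formulation of [26] or [18]: the function name `random_walk_saliency`, the use of a fully connected graph over a handful of scalar features, the absolute difference as dissimilarity, and the lazy-walk update are all assumptions made for this example.

```python
def random_walk_saliency(features, iterations=200):
    """Equilibrium distribution of a random walk on a fully connected graph.

    Edge weights are absolute feature differences (an assumed dissimilarity
    measure), so the walker tends to accumulate on nodes that differ from
    most other nodes.
    """
    n = len(features)
    weights = [[abs(features[i] - features[j]) for j in range(n)] for i in range(n)]

    # Row-normalize the weights into transition probabilities.
    # A node with no dissimilar neighbor keeps the walker in place.
    transition = []
    for i in range(n):
        row_sum = sum(weights[i])
        if row_sum > 0:
            transition.append([w / row_sum for w in weights[i]])
        else:
            transition.append([1.0 if j == i else 0.0 for j in range(n)])

    # Power iteration with a lazy walk (stay put with probability 0.5).
    # The lazy walk has the same equilibrium but avoids oscillation on
    # periodic chains.
    p = [1.0 / n] * n
    for _ in range(iterations):
        stepped = [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]
        p = [0.5 * p[j] + 0.5 * stepped[j] for j in range(n)]
    return p


# Three similar nodes and one outlier: the outlier collects the most mass.
saliency = random_walk_saliency([0.1, 0.1, 0.1, 0.9])
```

In this toy case the outlier node receives half of the equilibrium mass and the three similar nodes share the rest equally.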
The second category comprises methods based on information maximization. In these methods, information is considered the driving force behind attentive sampling, and visual saliency is measured by the rarity of features. Bruce et al. [17] employ the self-information of sparse features to measure saliency. Mancas et al. [22] describe a rarity-based visual attention model, grounded in the theory of self-information, that approximates human perception by visualizing its gradual discovery of the visual environment. Hou et al. [19] introduce the incremental coding length (ICL) to allocate different amounts of energy to features according to their rarity, under the assumption that salient features offer entropy gain. However, the salient information obtained is usually local, which limits how accurately salient regions can be detected. Moreover, the resulting saliency map typically indicates only the approximate position of the salient regions or objects, while much of the salient information is ignored.
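As a rough illustration of rarity-based saliency, the following Python sketch scores each pixel by the self-information -log2 p of its feature, where p is the empirical frequency of that feature in the image. The feature used here (a quantized gray-level intensity), the function name `self_information_saliency`, and the parameters `bins` and `max_value` are assumptions chosen for brevity; the cited models use richer features such as sparse codes.

```python
import math
from collections import Counter


def self_information_saliency(image, bins=8, max_value=255):
    """Map each pixel to -log2 p(feature), with p estimated from the image.

    The feature is the pixel intensity quantized into `bins` levels
    (an assumption for this sketch, not the feature of any cited model).
    """
    quantized = [
        [min(value * bins // (max_value + 1), bins - 1) for value in row]
        for row in image
    ]
    counts = Counter(level for row in quantized for level in row)
    total = sum(counts.values())
    # Rare levels have small probability and therefore high self-information.
    return [[-math.log2(counts[level] / total) for level in row] for row in quantized]


# A uniform dark image with a single bright pixel.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 250, 10],
    [10, 10, 10, 10],
]
saliency = self_information_saliency(image)
```

The single bright pixel occurs with probability 1/16 and so carries 4 bits of self-information, while the common dark pixels carry less than 0.1 bit each.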
The third category comprises methods that model the center-surround mechanism of primary visual cortical cells. A biologically plausible visual saliency model [20] has been proposed, which implies that visual perception relies on a linear measure of similarity on intensity, color, and orientation. However, Gao et al. [28], [29] find that this conflicts with the well-known properties of high-level human judgments of similarity. Gao et al. therefore propose a discriminant center-surround hypothesis using mutual information, which can provide optimal solutions for many other saliency problems in computer vision.
The fourth category comprises methods based on machine learning. Liu et al. [30] combine pixel-based saliency measurements in a conditional random field (CRF) and derive a binary segmentation separating the object from the background. Alexe et al. [31] design an objectness measure under a Bayesian framework and explicitly train it to distinguish windows containing an object from background windows.
Recently, a new type of method [21] has been proposed for context-aware saliency detection. This method aims at detecting the image regions that represent the scene. It differs from previous definitions, whose goal is either to identify fixation points or to detect the dominant object. Cheng et al. [13] propose a histogram-based contrast method (HC) to measure saliency. HC-maps assign pixel-wise saliency values based simply on color separation from all other image pixels, producing full-resolution saliency maps. They also incorporate spatial relations to produce region-based contrast (RC) maps that improve on the HC-maps.
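The histogram-based contrast idea can be sketched in a few lines of Python: each distinct color receives a saliency value equal to its frequency-weighted distance to every other color in the image. This is a simplified illustration, not the implementation of [13]; the function name `histogram_contrast`, the Euclidean distance in RGB, and the absence of color quantization and smoothing are assumptions of this sketch.

```python
import math
from collections import Counter


def histogram_contrast(image):
    """Pixel-wise saliency from global color contrast.

    `image` is a list of rows of (r, g, b) tuples. The saliency of a color is
    the sum, over all colors in the image, of (relative frequency) times
    (distance to that color). Euclidean RGB distance is an assumption here.
    """
    pixels = [color for row in image for color in row]
    histogram = Counter(pixels)
    total = len(pixels)

    # Compute saliency once per distinct color instead of once per pixel.
    color_saliency = {
        color: sum((count / total) * math.dist(color, other)
                   for other, count in histogram.items())
        for color in histogram
    }

    # Normalize to [0, 1]; a single-color image yields an all-zero map.
    peak = max(color_saliency.values()) or 1.0
    return [[color_saliency[color] / peak for color in row] for row in image]


black, red = (0, 0, 0), (255, 0, 0)
image = [[black] * 4 for _ in range(4)]
image[1][1] = red
hc_map = histogram_contrast(image)
```

Because the histogram has far fewer entries than the image has pixels, computing saliency per color keeps the cost low, which is the practical appeal of the histogram formulation.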
From the above analysis, we can see that most methods produce a saliency map that covers only parts of the salient region or object. The context-aware method [21] can produce a saliency map that describes the whole salient region or object, but it also introduces many false alarms. The challenge of saliency detection is therefore to capture the salient region or object as accurately as possible, not only its approximate position. Solving this problem is one of the most important goals for scene understanding, which involves interpreting the whole image by recognizing all the objects of interest within an image and their spatial extent or shape.
In this paper, we propose a novel method for saliency detection. First, we introduce a definition of the features of salient objects. Second, we present the method of global salient information maximization, which obtains the salient information in three steps: (1) detect the global salient component with a PCA-based method; (2) extract the maximal salient information to eliminate noise in the saliency detection; (3) enhance the salient information. Finally, the optimal saliency map is obtained.
This paper is organized into six sections. The next section introduces the features of the salient object used in this paper. Section 3 describes the proposed method for finding salient objects. Section 4 presents experimental results for the proposed approach, together with qualitative and quantitative comparisons with 11 previous saliency detection methods. In Section 5, we apply the proposed saliency detection method to graph based image segmentation. Finally, Section 6 concludes the paper.
Section snippets
The feature of salient object
The goal of our method is to detect salient information. Previous works mainly focused on finding the approximate position of the saliency. Because neither the specific position nor the size of the salient object is known after such saliency detection, it is a challenge for these methods to achieve accurate saliency detection. There are different applications [4], [32] for the saliency detection methods, and there are many
The proposed saliency detection method
In this section we propose an algorithm to extract the salient region or object with features (a)–(c). The PCA-based method, which extracts the principal component (i.e. the background) and the minor component (the salient objects) of the image, can extract the salient object with features (a) and (b). The salient object with feature (c) can be refined by the information maximization based method which can be used to eliminate the noise of saliency
Experimental results
In this section, to evaluate the performance of the proposed method, we compare it qualitatively with 11 state-of-the-art methods in three cases. A quantitative evaluation is also obtained by comparing precision and recall curves on the database.
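For readers unfamiliar with this kind of evaluation, the sketch below shows one common way to obtain precision and recall curves: the saliency map is binarized at a series of thresholds and each binary mask is compared with a ground-truth mask. The function name `precision_recall_curve`, the threshold values, and the convention of reporting precision 1.0 when nothing is detected are assumptions of this example, not details taken from the paper.

```python
def precision_recall_curve(saliency, ground_truth, thresholds):
    """Return (threshold, precision, recall) for each threshold.

    `saliency` holds values in [0, 1]; `ground_truth` holds 0/1 labels.
    A pixel is predicted salient when its value is >= the threshold.
    """
    values = [v for row in saliency for v in row]
    labels = [bool(v) for row in ground_truth for v in row]
    positives = sum(labels)

    curve = []
    for threshold in thresholds:
        predicted = [v >= threshold for v in values]
        true_positives = sum(p and g for p, g in zip(predicted, labels))
        detected = sum(predicted)
        # Convention (assumed): precision is 1.0 when no pixel is detected.
        precision = true_positives / detected if detected else 1.0
        recall = true_positives / positives if positives else 0.0
        curve.append((threshold, precision, recall))
    return curve


saliency_map = [[0.9, 0.2],
                [0.6, 0.1]]
mask = [[1, 0],
        [1, 0]]
curve = precision_recall_curve(saliency_map, mask, [0.0, 0.5, 0.8])
```

Raising the threshold generally trades recall for precision; plotting the pairs across all thresholds gives the curve used to compare methods.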
Graph based salient object segmentation
As mentioned in Section 3, we propose the GSIM method, which produces good segmentation results for most of the images in the testing database. However, due to the linear property of PCA, there are some false and missed detections. Therefore, we propose a graph based salient object segmentation method (GGSIM) to improve the segmentation results of GSIM.
There are many traditional methods of saliency detection which are used in unsupervised object segmentation. For example, Ma and
Conclusion
In this paper, we propose a novel method for saliency detection that is based on the features of the salient object. The method extracts the global salient information with a PCA-based method. Two further steps, maximal salient information extraction and salient information enhancement, are used to eliminate the noise and to enhance the salient information, respectively. Graph based salient object segmentation is also proposed to extract the salient object. The experimental
Acknowledgement
This work was partially supported by NSFC (Nos. 60972109, 61101091 and 61173121), the Program for New Century Excellent Talents in University (NCET-08-0090), the Fundamental Research Funds for the Central Universities (No. E022050205), and Sichuan Province Science Foundation for Youths (No. 2010JQ0003).
References (41)
- et al. Saliency model based face segmentation in head-and-shoulder video sequences. Journal of Visual Communication and Image Representation, Elsevier Science (2008)
- et al. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research (2000)
- et al. The independent components of natural scenes are edge filters. Vision Research (1997)
- et al. On the distribution of saliency. IEEE Transactions on Pattern Analysis and Machine Intelligence (2006)
- et al. A feature-integration theory of attention. Cognitive Psychology (1980)
- A. Yarbus, Eye Movements and Vision, Plenum, NY,...
- et al. Is bottom-up attention useful for object recognition?
- et al. Saliency estimation using a non-parametric low-level vision model
- et al. Yet another survey on image segmentation: region and boundary information integration
- et al. Seam carving for content-aware image resizing. ACM Transactions on Graphics (2007)
- GrabCut: interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics
- What, where & how many? Combining object detectors and CRFs
- Unsupervised extraction of visual attention objects in color images. IEEE Transactions on Circuits and Systems for Video Technology, USA
- Frequency-tuned salient region detection
- Global contrast based salient region detection
- Visual attention detection in video sequences using spatiotemporal cues
- Salient region detection and segmentation
- Saliency detection: a spectral residual approach
- Saliency based on information maximization
- Graph-based visual saliency