Elsevier

Neurocomputing

Volume 124, 26 January 2014, Pages 57-62
Letters
Background contrast based salient region detection

https://doi.org/10.1016/j.neucom.2013.07.043

Abstract

This paper proposes a novel salient region detection method based on background contrast. Salient regions are widely considered to be those that are distinct from the image background. However, owing to the lack of information about the image background, existing methods measure the saliency of an image pixel or region using its contrast to local neighborhoods or to the entire image rather than to the image background. An inappropriate contrast region makes it difficult to highlight a large salient region as a whole. In this paper, we observe that the absence of eye fixations provides an important clue to the image background: regions without eye fixations are very likely to be the image background. To estimate the spatial extent of the image background, the complement of the convex hull of the eye fixations is used to represent the possible image background. We then measure the saliency of each region by computing its contrast to the estimated image background. Experimental results demonstrate that our approach outperforms previous state-of-the-art methods.

Introduction

Humans can rapidly and accurately identify salient regions in their visual fields. Simulating this ability in machines is critical for enabling them to handle visual content as humans do. Over the past decades, extensive research effort has gone into constructing computational models for detecting salient regions. Most of these methods [1], [2], [3], [4] aim to predict human eye fixations, the locations where the eye rests for a while to gather and process interesting information. Itti et al. [1] directly summed multi-scale center-surround responses of visual features to generate a saliency map. Murray et al. [2] integrated information at multiple scales by performing an inverse wavelet transform on the scale-weighted center-surround responses. Hou et al. [3] measured the saliency of each pixel in the frequency domain. Ban et al. [4] introduced psychological distance into the prediction of eye fixations. However, eye fixations are only spatially discrete points, so they cannot accurately represent the whole salient region [5], [6].

Recent years have witnessed growing interest in salient region detection, which is useful in applications such as image segmentation [7], object detection [8], and content-aware image resizing [9]. We focus on data-driven salient region detection.

It is widely agreed that salient regions are distinct from the image background [5], [6], [10]. However, due to the lack of any knowledge about the image background, current methods measure the saliency of an image pixel or region using its contrast with respect to local neighborhoods [11], [5], [6], [12] or the entire image [13], [14], [15], [16], [17]. Though these methods have achieved promising results, they still have difficulty highlighting the entire salient region uniformly (Fig. 1(1)). Moreover, in the presence of large salient objects or complex backgrounds, they may fail to highlight the salient regions correctly (Fig. 1(2)).

Though eye fixations cannot mark the whole salient region [5], [6], the absence of eye fixations provides a clue to the image background. When observing the outside world, we direct the fovea of the retina toward specific salient regions in order to fixate on them [18]. During a fixation, we obtain a detailed impression of the region. We then move, or saccade, to the next location of interest. Thus human eye fixations fall on the interesting parts of an image [19] and seldom on the image background. Regions without eye fixations are therefore very likely to be the image background.

In this paper, guided by eye fixations, we estimate the spatial extent of the image background. We then measure the saliency of each small image region using its contrast to the estimated image background. Section 2 describes the limitations of existing methods. Section 3 introduces the proposed salient region detection approach. Sections 4 and 5 present the experimental results and the conclusion, respectively.
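The two steps named above (estimating the background from eye fixations, then scoring regions by their contrast to it) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the grayscale image representation, the precomputed region labels, and in particular the contrast measure (absolute difference of mean intensity) are assumptions made only for illustration, since the excerpt does not specify them.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def cross(o: Point, a: Point, b: Point) -> int:
    """Z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; vertices are returned in a consistent winding order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull: Sequence[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of the hull from convex_hull."""
    if len(hull) < 3:
        # Simplification: a degenerate hull covers only its own vertices.
        return p in hull
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def background_contrast_saliency(
    image: Sequence[Sequence[float]],
    regions: Sequence[Sequence[int]],
    fixations: Sequence[Point],
) -> Dict[int, float]:
    """Score each labeled region by its contrast to the estimated background.

    The background estimate is the set of pixels outside the convex hull of
    the fixations. The contrast measure (absolute difference of mean
    intensity) is an illustrative assumption, not the measure from the paper.
    """
    hull = convex_hull(fixations)
    background: List[float] = []
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            label = regions[y][x]
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
            if not inside_hull(hull, (x, y)):
                background.append(value)
    if not background:
        raise ValueError("convex hull covers the whole image; no background left")
    bg_mean = sum(background) / len(background)
    return {label: abs(sums[label] / counts[label] - bg_mean) for label in sums}
```

Calling `background_contrast_saliency(image, regions, fixations)` returns one score per region label. Because the reference statistics come only from pixels outside the hull, a large object that fills most of the hull is compared against the surrounding background rather than against itself, which is the motivation given in the text.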

Section snippets

Limitation of the existing methods

According to results from perceptual research [10], contrast is the most influential factor in low-level visual saliency. Current methods detect saliency by calculating the contrast of each image region relative to a certain context. Based on the extent of that context, such methods can be classified as local methods [6], [11], [12] or global methods [13], [14], [15], [16].

Local methods estimate the saliency of each image pixel/region using its contrast with respect to a small local

Overview of the proposed approach

Owing to the conflict between limited cognitive resources and large amounts of visual input data, only some important aspects of the input are selected for further detailed processing [20]. To accomplish this task, Human Visual Attention (HVA) is employed to align a salient region with the fovea [18], which is located at the center of the retina and has the highest resolution. It involves the detection of rapid eye movements called saccades and stationary phases called fixations. During the

Experimental results

The publicly available dataset [13] is used to evaluate salient region detection methods. This dataset contains 1000 images with associated binary ground truth. As in previous work, the precision–recall curve is used as the performance metric. For a more comprehensive comparison, we further compute the AUC (Area Under the precision–recall Curve) metric. The threshold Tf used in our approach to obtain eye fixations from the saliency map is set to 60.
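As a rough illustration of the evaluation metrics mentioned above, the sketch below computes a precision–recall curve by binarizing a saliency map at a set of thresholds and then integrates it with the trapezoidal rule. The function names, the 8-bit default threshold range, and the convention that precision is 1.0 when no pixel is detected are assumptions; the excerpt does not state how the curve or the AUC were computed in the paper.

```python
from typing import Iterable, List, Sequence, Tuple


def precision_recall_curve(
    saliency: Sequence[Sequence[float]],
    truth: Sequence[Sequence[int]],
    thresholds: Iterable[float] = range(256),
) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, ordered from highest to lowest threshold.

    A pixel counts as detected when its saliency is >= the threshold, so
    recall is non-decreasing along the returned list.
    """
    pairs = [
        (s, bool(t))
        for s_row, t_row in zip(saliency, truth)
        for s, t in zip(s_row, t_row)
    ]
    positives = sum(1 for _, t in pairs if t)
    if positives == 0:
        raise ValueError("ground truth contains no salient pixels")
    curve: List[Tuple[float, float]] = []
    for thr in sorted(thresholds, reverse=True):
        detected = sum(1 for s, _ in pairs if s >= thr)
        hits = sum(1 for s, t in pairs if s >= thr and t)
        # Assumed convention: precision is 1.0 when nothing is detected.
        precision = hits / detected if detected else 1.0
        curve.append((hits / positives, precision))
    return curve


def area_under_curve(curve: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a curve whose recall values are non-decreasing."""
    return sum(
        (r1 - r0) * (p0 + p1) / 2
        for (r0, p0), (r1, p1) in zip(curve, curve[1:])
    )
```

For a dataset, one common choice is to average precision and recall over all images at each threshold before integrating; whether the paper does this is not stated in the excerpt.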

Conclusion

In this paper, a novel salient region detection method is proposed. Unlike existing methods based on local or global contrast, it uses the contrast to the image background as the saliency measure. Based on the observation that the absence of eye fixations indicates the image background, the complement of the convex hull of the eye fixations is used to represent the possible image background. Eye fixations are automatically predicted by a computational model based on local

Acknowledgments

This work is supported by the National Natural Science Foundation of China (60832010, 61100187), the Fundamental Research Funds for the Central Universities (Grant no. HIT. NSRIF. 2010046) and the China Postdoctoral Science Foundation (2011M500666).

Huiyun Jing received the B.S. degree in Computer Science and Technology from JiLin University, Changchun, China, in 2008. She received the M.E. degree in Computer Science and Technology from Harbin Institute of Technology, Harbin, China, in 2010. Since 2010, she has been studying for her Ph.D. degree in Computer Science and Technology at Harbin Institute of Technology. Her areas of interest are machine vision and image/video processing.

References (25)

  • S. Ban et al., Affective saliency map considering psychological distance, Neurocomputing (2011)
  • L. Itti et al., A saliency-based search mechanism for overt and covert shifts of visual attention, Vision Research (2000)
  • Z. Li, A saliency map in primary visual cortex, Trends in Cognitive Sciences (2002)
  • L. Itti et al., A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence (1998)
  • N. Murray, M. Vanrell, X. Otazu, C. Parraga, Saliency estimation using a non-parametric low-level vision model, in:...
  • X. Hou, L. Zhang, Saliency detection: a spectral residual approach, in: IEEE Conference on Computer Vision and Pattern...
  • T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum, Learning to detect a salient object, in: IEEE Conference on Computer Vision...
  • T. Liu et al., Learning to detect a salient object, IEEE Transactions on Pattern Analysis and Machine Intelligence (2011)
  • B.C. Ko et al., Object-of-interest image segmentation based on human attention and semantic region clustering, Journal of Optical Society of America A (2006)
  • U. Rutishauser, D. Walther, C. Koch, P. Perona, Is bottom-up attention useful for object recognition?, in: IEEE...
  • G.-X. Zhang, M.-M. Cheng, S.-M. Hu, R.R. Martin, A shape-preserving approach to image resizing, in: Computer Graphics...
  • W. Einhäuser et al., Does luminance-contrast contribute to a saliency map for overt visual attention?, European Journal of Neuroscience (2003)
Xin He received the B.S. degree from China University of Mining and Technology, Xuzhou, China, and the M.S. and Ph.D. degrees from Harbin Institute of Technology, Harbin, China, in 2005, 2008, and 2013, respectively, all in computer science and technology. He joined the National Computer network Emergency Response technical Team/Coordination Center of China (CNCERT/CC), Beijing, China, in 2013. His current research interests include computer vision, media forensics, and image/video analysis and retrieval.

Qi Han received the B.S. and M.S. degrees from Harbin Institute of Technology, Harbin, PR China, in 2002 and 2004, respectively, and received the Ph.D. degree in Computer Science and Technology from Harbin Institute of Technology in 2009. He is an ACM member. His current research fields include digital video forensics, hiding communication, and digital watermarking.

Xiamu Niu was born in China in May 1961. He received the B.S. and M.S. degrees in Communication and Electronic Engineering from Harbin Institute of Technology (HIT), Harbin, PR China, in 1982 and 1989, respectively, and received the Ph.D. degree in Instrument Science and Technology in 2000. He was an invited scientist and staff member in the Department of Security Technology for Graphics and Communication System, Fraunhofer Institute for Computer Graphics, Germany, from 2000 to 2002. He was awarded the Excellent Ph.D. Dissertation of China in 2002. He is now a Professor (doctoral advisor) and Superintendent of the Information Countermeasure Technique Institute, HIT, and Director of the Information Security Technique Research Center, HIT-ShenZhen. He is a SPIE member, ACM member, IEEE member, and advanced CIE member. He has published 3 works and more than 110 papers cited by SCI and EI. His current research fields include computer information security, hiding communication, cryptography, digital watermarking, signal processing, and image processing.
