An Improved Saliency Detection Method Using Hypergraphs on Adaptive Multiscales

Han, Feilin; Han, Aili; Hao, Jing

doi:10.1007/978-981-10-7305-2_59

Part of the book series: Communications in Computer and Information Science (CCIS, volume 773)

Included in the following conference series:

CCF Chinese Conference on Computer Vision

Abstract

We present an improved saliency detection method by means of hypergraphs on adaptive multiscales (HAM). An input image is characterized by hypergraphs in which hyperedges are used to capture contextual properties of regions. Thus, the saliency detection problem is transformed into that of finding salient vertices and hyperedges in hypergraphs. The HAM method first adaptively adjusts the ranges of pixel-values in the R, G, B channels of an input image and uses the ranges to determine adaptive scales. It then models the image as a hypergraph for each scale, in which hyperedges are clustered by means of agglomerative mean-shift. Because each hypergraph is built on an adaptive scale instead of a fixed scale, the HAM method obtains more single-scale hypergraphs and thus has higher accuracy than the previous methods. Extensive experiments on three benchmark datasets demonstrate that the HAM method improves the performance of saliency detection, especially for images with narrow ranges of pixel-values.

1 Introduction

Saliency detection, which aims to identify the most attractive region in an image, plays an important role in the field of computer vision. Its important applications include image segmentation [12], target recognition [10] and image retrieval [13]. In order to extract salient objects from an input image, many visual attention models have been proposed [4, 6, 7, 8, 9, 11].

Since Itti et al. [7] proposed a visual attention system inspired by the animal visual system, research on saliency detection has developed rapidly. Ma and Zhang [9] designed a saliency detection method based on local contrast. Harel et al. [11] gave a graph-based visual saliency model. Cheng et al. [4, 6] proposed a saliency detection method based on global contrast and spatial coherence. Li et al. [8] employed hypergraphs to capture contextual attributes for saliency detection.

The contextual hypergraph modeling method for saliency detection (CHMS) [8] captures salient objects well in most images and has higher accuracy than many previous methods. Through experiments, we find that, for images with wide ranges of pixel-values (e.g. covering almost the whole range of [0, 255]), the CHMS method always has good performance, as shown in the top three lines in Fig. 1. However, for images with narrow ranges of pixel-values (e.g. covering only the first, middle or last part of the range [0, 255]), the CHMS method usually does not perform very well, as shown in the bottom three lines in Fig. 1. Further experiments show that the hypergraphs on different scales directly affect the experimental results. For example, the images in the bottom three lines in Fig. 1 do not have enough valid single-scale hypergraphs, so the CHMS method cannot give good performance on them.

Inspired by this, we propose an improved saliency detection method using hypergraphs on adaptive multiscales (HAM), which is an improvement on the CHMS method. The HAM method adaptively adjusts the ranges of pixel-values in the R, G, B channels and detects salient objects on adaptive multiscales. As shown in Fig. 2, the HAM method provides more valid single-scale hypergraphs and better final saliency maps. The experimental results show that the HAM method improves the performance of saliency detection, especially for images with narrow ranges of pixel-values. The basic idea of adaptively adjusting the ranges of pixel-values in an image can be widely used in other applications in computer vision or artificial intelligence.

2 Contextual Hypergraph Modeling

The CHMS method uses a set of hyperedges to capture the contextual properties of superpixels, which significantly improves the performance of saliency detection. It first segments an input image into superpixels by means of the SLIC method [2], which applies k-means clustering to generate superpixels efficiently. It then clusters the superpixels on a set of fixed scales by means of agglomerative mean-shift [14], which is based on an iterative query set compression mechanism and runs in linear time. Thus, an input image I is converted to a set of hypergraphs. Each hypergraph is denoted by $G = (V, E)$, where $V = \{v_i\}$ is a set of vertices corresponding to the superpixels, and $E = \{e_j\}$ is a set of hyperedges (a hyperedge is a clique of superpixels) that satisfy $\bigcup _{e_j\in E} e_j = V$. Thus, the saliency detection problem is converted to that of finding salient vertices and hyperedges in hypergraphs on adaptive multiscales.

The saliency of a hyperedge can be determined by the gradient magnitudes of the superpixels within a narrow band along the boundary of the hyperedge. For any hyperedge $e_j$, the saliency score of $e_j$ is defined as follows.

$$\varGamma (e_j)=\omega _{e_j}\left( {\Vert I_g^{*}\bullet M_g(e_j)\Vert }_1-\rho (e_j)\right) , \tag{1}$$

where $\omega _{e_j}$ is the weight of $e_j$, $I_g^{*}$ is the binary gradient map, $M_g(e_j)$ is a binary mask indicating the superpixels within a narrow band along the boundary of the hyperedge $e_j$, $\bullet $ is the elementwise dot product operator, ${\Vert ~\Vert }_1$ is the 1-norm, and $\rho (e_j)$ is a penalty factor that is equal to the number of the superpixels in the intersection of the hyperedge $e_j$ and the boundary superpixels of the input image.
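As an illustrative sketch (not the authors' implementation), Eq. (1) can be evaluated with NumPy as follows. The function name `hyperedge_saliency`, the toy arrays, and the assumption that the binary gradient map and the band mask are stored as arrays of the same shape are ours; the weight and the penalty are simply passed in as numbers.

```python
import numpy as np

def hyperedge_saliency(grad_map, band_mask, weight, penalty):
    """Evaluate Eq. (1): weight * (||I_g* . M_g(e)||_1 - rho(e)).

    grad_map  : binary gradient map I_g* (0/1 array)
    band_mask : binary mask M_g(e) of the narrow band along the hyperedge boundary
    weight    : hyperedge weight omega_e
    penalty   : rho(e), the number of hyperedge superpixels on the image boundary
    """
    # Elementwise product keeps only the gradient responses inside the band.
    overlap = grad_map.astype(np.int64) * band_mask.astype(np.int64)
    # For a non-negative array the 1-norm is the sum of its entries.
    return weight * (overlap.sum() - penalty)

# Toy example (hypothetical values).
grad = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0]])
band = np.array([[0, 1, 1, 1],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]])
score = hyperedge_saliency(grad, band, weight=0.5, penalty=1)
print(score)  # 0.5 * (3 - 1) = 1.0
```

A hyperedge whose boundary band coincides with many gradient responses receives a high score, while the penalty lowers the score of hyperedges that touch the image border.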

The saliency of a vertex (i.e. a superpixel) is associated with the superpixel and its contexts. For any vertex $v_i$ in a hypergraph, the saliency score of $v_i$ is defined as follows.

$$HSa(v_i)=\sum _{e_j\in E}\varGamma (e_j)h(v_i,e_j), \tag{2}$$

where $\varGamma (e_j)$ is the saliency score of $e_j$, and $h(v_i,e_j)$ is the element value in the $i^{th}$ row and $j^{th}$ column in the incidence matrix H. If $v_i\in e_j$, then $h(v_i,e_j) = 1$; otherwise $h(v_i,e_j) = 0$.
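Since $h(v_i,e_j)$ is an entry of the incidence matrix H, Eq. (2) amounts to a matrix-vector product between H and the vector of hyperedge scores. The following minimal sketch illustrates this; the helper names `incidence_matrix` and `vertex_saliency` and the toy hypergraph are assumptions made for illustration only.

```python
import numpy as np

def incidence_matrix(n_vertices, hyperedges):
    """Build H with H[i, j] = 1 if vertex i belongs to hyperedge j, else 0."""
    H = np.zeros((n_vertices, len(hyperedges)), dtype=np.int64)
    for j, edge in enumerate(hyperedges):
        H[list(edge), j] = 1
    return H

def vertex_saliency(H, edge_scores):
    """Eq. (2): HSa(v_i) = sum_j Gamma(e_j) * h(v_i, e_j), for all i at once."""
    return H @ np.asarray(edge_scores, dtype=float)

# Toy hypergraph with 4 superpixels and 3 hyperedges (hypothetical values).
hyperedges = [{0, 1}, {1, 2, 3}, {3}]
H = incidence_matrix(4, hyperedges)
hsa = vertex_saliency(H, [2.0, 1.0, 0.5])
print(hsa.tolist())  # [2.0, 3.0, 1.0, 1.5]
```

Vertex 1 belongs to the first two hyperedges, so it accumulates both of their scores.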

3 Saliency Detection Using Hypergraphs on Adaptive Multiscales

The CHMS method uses a set of empirical values [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75] as the fixed scales to detect salient objects in an image, which gives good performance for most images. However, it is widely accepted that images differ greatly from one another. If all the pixel-values in an image lie in the first, middle or last part of the range [0, 255], there may be no hyperedges for some scales, which results in poor saliency detection, as shown in the last three lines in Fig. 1.

We propose an improved saliency detection method using hypergraphs on adaptive multiscales (HAM). Different from the CHMS method, our HAM method uses a set of adaptive scales instead of fixed scales. Thus, it can get more single-scale hypergraphs than the previous ones for an input image, which results in higher accuracy.

3.1 Adaptive Scaling of Pixel-Values

In order to adaptively adjust the ranges of pixel-values in the R, G, B channels, we compute a histogram of the pixel-values in each channel. The statistics are used to determine the range of pixel-values that covers more than 95% of the pixels in each channel. A range covering more than 95% of the pixels is chosen to avoid the influence of outliers [6]. The pixel-values outside the range are replaced by the nearest pixel-values inside it. Then, all the pixel-values are normalized and remapped to the range [0, 255]. For each channel i, $i\in \{R,G,B\}$, the formula for normalization and remapping is as follows.

$$\left( {\left( {\frac{I_i-{low}_i}{{high}_i-{low}_i}}\right) ^\gamma }\bullet ({high}_{i,out}-{low}_{i,out}) \right) +{low}_{i,out}, \tag{3}$$

where $I_i$ is a pixel value in channel i in the input image I, ${low}_i$ is the lower bound of pixel-values in channel i, ${high}_i$ is the upper bound of pixel-values in channel i, ${low}_{i,out}$ is the lower bound of the remapped values in channel i, ${high}_{i,out}$ is the upper bound of the remapped values in channel i, and $\gamma $ determines the shape of the mapping curve. Here, ${low}_{i,out} = 0$, ${high}_{i,out} = 255$, and $\gamma = 1$, which corresponds to a linear mapping.
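The adaptive scaling step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the text only requires a range covering more than 95% of the pixels, so taking symmetric percentile tails (2.5% on each side) is our assumption, as are the function name `adaptive_rescale` and the random test image.

```python
import numpy as np

def adaptive_rescale(image, coverage=0.95, low_out=0.0, high_out=255.0, gamma=1.0):
    """Clip each channel to a range covering `coverage` of its pixels, then apply Eq. (3)."""
    image = np.asarray(image, dtype=np.float64)
    out = np.empty_like(image)
    # Assumption: the discarded pixels are split evenly between both tails.
    tail = (1.0 - coverage) / 2.0 * 100.0
    for c in range(image.shape[-1]):
        channel = image[..., c]
        low, high = np.percentile(channel, [tail, 100.0 - tail])
        if high <= low:
            # Constant channel: nothing to stretch, keep it unchanged.
            out[..., c] = channel
            continue
        # Values outside [low, high] are replaced by the nearest bound.
        clipped = np.clip(channel, low, high)
        normalized = (clipped - low) / (high - low)
        out[..., c] = normalized ** gamma * (high_out - low_out) + low_out
    return out

# Hypothetical image whose pixel-values occupy only the middle part of [0, 255].
rng = np.random.default_rng(0)
img = rng.integers(100, 151, size=(32, 32, 3))
out = adaptive_rescale(img)
print(img.min(), img.max(), out.min(), out.max())
```

After remapping, each channel spans the full output range, which is what allows the fixed scales of the CHMS method to produce valid hypergraphs for narrow-range images.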

3.2 Construction of Hypergraphs on Adaptive Scales

The CHMS method constructs hypergraphs on a set of fixed scales by means of the Agglo-MS method. For the details, refer to [14]. Different from the CHMS method, we use a set of adaptive scales instead of fixed scales. We first adaptively adjust the ranges of pixel-values in the R, G, B channels of an image to get three remapped ranges of pixel-values, and then combine the remapped pixel-values with a set of fixed scales. Applying these two operations is equivalent to processing the original image with a set of adaptive scales. Thus, in the HAM method, we use hypergraphs on a set of adaptive scales to detect salient objects. The distance between any two pixels $x_i$ and $x_j$ on an adaptive scale $\mu $ is computed as follows.

$$dist = \left( {\frac{x_i-x_j}{\mu }}\right) ^2. \tag{4}$$
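The stated equivalence between remapping the pixel-values and changing the scale can be checked with a short derivation. As a sketch under assumptions of ours, consider a single channel with bounds $low$ and $high$, the linear case $\gamma = 1$ of Eq. (3) with output range [0, 255], and two pixel-values $x_i$, $x_j$ that both lie inside $[low, high]$; the symbols $x'$ and $\mu '$ are introduced here for illustration only. The remapped values are

$$x' = \frac{x-low}{high-low}\cdot 255,$$

so the distance between the remapped values on a fixed scale $\mu $ is

$$\left( {\frac{x'_i-x'_j}{\mu }}\right) ^2 = \left( {\frac{255\,(x_i-x_j)}{(high-low)\,\mu }}\right) ^2 = \left( {\frac{x_i-x_j}{\mu '}}\right) ^2, \qquad \mu ' = \mu \cdot \frac{high-low}{255}.$$

Under these assumptions, using the fixed scale $\mu $ on the remapped image gives the same distance as using the scale $\mu '$ on the original image, and $\mu '$ shrinks in proportion to the width of the range of pixel-values.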

The adaptive scaling of pixel-values can produce more single-scale hypergraphs for images with narrow ranges of R, G, B pixel-values (covering only the first, middle or last part of the range [0, 255]). Thus, the HAM method significantly improves the performance of saliency detection on such images. Take the three images shown in the 1st column of Fig. 2 as examples. When the HAM method is used to detect salient objects, the numbers of single-scale hypergraphs are 7, 7 and 6, respectively, as shown in Fig. 2; when the CHMS method is used, the numbers of single-scale hypergraphs are 3, 2 and 2, respectively, as shown in the last three lines in Fig. 1. Comparing the results in Fig. 2 with the last three lines in Fig. 1, it can be concluded that the HAM method performs better than the CHMS method for images with narrow ranges of pixel-values.

From further experimental results, we find that, for images with narrow ranges of pixel-values (all the pixel-values lie in the first, middle or last part of the range [0, 255]), the number of single-scale hypergraphs obtained by the HAM method is always greater than that obtained by the CHMS method; for images with wide ranges of pixel-values, the two methods always obtain a similar number of single-scale hypergraphs.

3.3 Saliency Detection Using Hypergraph on Adaptive Multiscales

The CHMS method uses hypergraphs on a set of fixed scales, without considering the color differences between input images. For any input image, it takes the empirical values [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75] as the fixed scales. In order to detect salient objects according to the color difference in an image [15], we perform an adaptive scaling of pixel-values and use hypergraphs on a set of adaptive scales to detect salient objects. The HAM method significantly improves the performance of saliency detection for images with narrow ranges of R, G, B pixel-values.

In step 5, the saliency map based on LS-SVM is obtained from one alternative method in the CHMS method [8]. In order to achieve a fair comparison, we also use this method. For further details, refer to [8].

The HAM method is an improvement on the CHMS method. Comparisons between the two methods are illustrated in Fig. 3 (more experimental results are shown in Sect. 4). For the images in the $1^{st}$ line in Fig. 3, the saliency map obtained by the HAM method contains less noise than that obtained by the CHMS method, and the boundaries of the CAUTION banner are smoother. For the images in the $2^{nd}$ line in Fig. 3, the background around the dog is usually marked as part of the salient object by the previous methods, including the CHMS method, because the dog has color features similar to those of the background; the HAM method detects the salient object better than the previous ones. For the images in the $3^{rd}$ line in Fig. 3, the salient object is a flower, but the CHMS method marks the green leaves as salient because of their large size, whereas the HAM method highlights the flower and darkens the background, which gives better performance than the CHMS method. The experimental results show that the HAM method can better capture salient objects and improve the performance of saliency detection.

4 Experimental Results and Analysis

The HAM method is compared with six state-of-the-art saliency detection methods, namely CHMS [8], GC [5], RC [15], MSS [3], FT [1] and LC [16], on the public datasets MSRA-1000 and SED-100, which are used as standard benchmarks. In addition, we use a new dataset, denoted IMNR, to evaluate the HAM method. The images in the IMNR dataset have narrow ranges of R, G, B pixel-values. Some experimental results are shown in Fig. 4. The experimental results show that the HAM method is the most competitive of the seven methods.

We use precision-recall (PR) curves and F-measures to evaluate the quantitative performance of the six state-of-the-art methods and the HAM method. The F-measure is computed as follows.

$$F_\beta = \frac{(1+\beta ^2){P}\times {R}}{{\beta ^2}P+R}, \tag{5}$$

where P is the precision rate, and R is the recall rate. Let $\beta ^2 = 0.3$.
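For concreteness, Eq. (5) with $\beta ^2 = 0.3$ can be computed as in the sketch below. How the saliency maps are binarized before precision and recall are measured is not specified here, so the fixed threshold in `precision_recall` is an assumption for illustration, as are the function names and the toy arrays.

```python
import numpy as np

def precision_recall(saliency_map, ground_truth, threshold):
    """Binarize the saliency map at `threshold` and compare it with the binary ground truth."""
    pred = np.asarray(saliency_map) >= threshold
    gt = np.asarray(ground_truth).astype(bool)
    true_pos = np.logical_and(pred, gt).sum()
    precision = true_pos / pred.sum() if pred.sum() else 0.0
    recall = true_pos / gt.sum() if gt.sum() else 0.0
    return float(precision), float(recall)

def f_measure(precision, recall, beta_sq=0.3):
    """Eq. (5): F = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    denom = beta_sq * precision + recall
    return (1.0 + beta_sq) * precision * recall / denom if denom else 0.0

# Toy 2x2 example (hypothetical values): 3 pixels predicted salient, 2 of them correct.
sal = np.array([[0.9, 0.8],
                [0.2, 0.6]])
gt = np.array([[1, 0],
               [0, 1]])
p, r = precision_recall(sal, gt, threshold=0.5)
f = f_measure(p, r)
print(round(p, 4), round(r, 4), round(f, 4))  # 0.6667 1.0 0.7222
```

With $\beta ^2 < 1$, precision is weighted more heavily than recall, so a map that over-segments the salient object is penalized more than one that misses part of it.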

The quantitative performance of all seven methods on the IMNR dataset is shown in Fig. 5. The PR curves and F-measures show that the HAM method has the best performance among the seven methods on the IMNR dataset. As can be seen from Fig. 5, the HAM method always performs better than the CHMS method. When the recall rate is greater than 0.7, the HAM method is better than all six state-of-the-art methods. Furthermore, the saliency maps obtained by the HAM method are smoother and more robust than those of the other six state-of-the-art methods.

The quantitative performance of all seven methods on the MSRA-1000 dataset is shown in Fig. 6. The PR curves and F-measures on MSRA-1000 show that the HAM method also has the best performance among the seven methods, but its advantage is not obvious, since most images in MSRA-1000 have wide ranges of pixel-values and few of them have narrow ranges of pixel-values.

The quantitative performance of the CHMS and HAM methods on the SED-100 dataset is shown in Fig. 7. The PR curves and F-measures show that the HAM method consistently performs better than the CHMS method.

5 Conclusion

The previous saliency detection methods based on hypergraphs use a set of fixed scales, without considering the color differences between input images. In order to detect salient objects according to the color difference in an image, we propose an improved saliency detection method based on hypergraphs on adaptive multiscales (HAM). Our HAM method adaptively adjusts the ranges of pixel-values in the R, G, B channels, and uses hypergraphs on a set of adaptive scales to detect salient objects. The experimental results show that our HAM method significantly improves the performance of saliency detection for images with narrow ranges of R, G, B pixel-values. For images with wide ranges of pixel-values, the performance is also improved to some extent. The basic idea of our HAM method can be widely used in other applications in computer vision or artificial intelligence.

References

1. Achanta, R., Hemami, S., Estrada, F., Süsstrunk, S.: Frequency-tuned salient region detection. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 1597–1604 (2009)
2. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)
3. Achanta, R., Süsstrunk, S.: Saliency detection using maximum symmetric surround. In: IEEE International Conference on Image Processing, pp. 2653–2656 (2010)
4. Cheng, M., Mitra, N.J., Huang, X., Torr, P.H.S.: Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 569–582 (2015)
5. Cheng, M.M., Warrell, J., Lin, W.Y., Zheng, S., Vineet, V., Crook, N.: Efficient salient region detection with soft image abstraction. In: IEEE International Conference on Computer Vision, pp. 1529–1536 (2013)
6. Cheng, M.M., Zhang, G.X., Mitra, N.J., Huang, X., Hu, S.M.: Global contrast based salient region detection. In: Computer Vision and Pattern Recognition, pp. 409–416 (2011)
7. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1254–1259 (1998)
8. Li, X., Li, Y., Shen, C., Dick, A., Hengel, A.V.D.: Contextual hypergraph modeling for salient object detection. In: IEEE International Conference on Computer Vision, pp. 3328–3335 (2013)
9. Ma, Y.F., Zhang, H.J.: Contrast-based image attention analysis by using fuzzy growing. In: Eleventh ACM International Conference on Multimedia, pp. 374–381 (2003)
10. Navalpakkam, V., Itti, L.: An integrated model of top-down and bottom-up attention for optimizing detection speed. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2049–2056 (2006)
11. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: Advances in Neural Information Processing Systems, vol. 19, pp. 545–552 (2007)
12. Wang, L., Xue, J., Zheng, N., Hua, G.: Automatic salient object extraction with contextual cue. In: IEEE International Conference on Computer Vision, pp. 105–112 (2011)
13. Wang, X.J., Ma, W.Y., Li, X.: Data-driven approach for bridging the cognitive gap in image retrieval (2004)
14. Yuan, X., Hu, B.G., He, R.: Agglomerative mean-shift clustering via query set compression. IEEE Trans. Knowl. Data Eng. 24(2), 209–219 (2011)
15. Yuan, Y., Han, A., Han, F.: Saliency detection based on non-uniform quantification for RGB channels and weights for lab channels. In: Zha, H., Chen, X., Wang, L., Miao, Q. (eds.) CCCV 2015. CCIS, vol. 546, pp. 258–266. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48558-3_26
16. Zhai, Y., Shah, M.: Visual attention detection in video sequences using spatiotemporal cues. In: ACM International Conference on Multimedia, pp. 815–824 (2006)

Acknowledgement

This work is supported by the Shandong Provincial Natural Science Foundation of China under Grant No. ZR2016FM20.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Shandong University, Weihai, China
Aili Han & Jing Hao
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Feilin Han

Corresponding author

Correspondence to Aili Han.

Editor information

Editors and Affiliations

Civil Aviation University of China, Tianjin, China
Jinfeng Yang
Tianjin University, Tianjin, China
Qinghua Hu
Nankai University, Tianjin, China
Ming-Ming Cheng
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Liang Wang
Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Huazhong University of Science and Technology, Wuhan, China
Xiang Bai
Xi’an Jiaotong University, Xi’an, China
Deyu Meng

Cite this paper

Han, F., Han, A., Hao, J. (2017). An Improved Saliency Detection Method Using Hypergraphs on Adaptive Multiscales. In: Yang, J., et al. Computer Vision. CCCV 2017. Communications in Computer and Information Science, vol 773. Springer, Singapore. https://doi.org/10.1007/978-981-10-7305-2_59

DOI: https://doi.org/10.1007/978-981-10-7305-2_59
Published: 08 December 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7304-5
Online ISBN: 978-981-10-7305-2
eBook Packages: Computer Science, Computer Science (R0)
