Saliency detection via multi-view graph based saliency optimization

doi:10.1016/j.neucom.2019.03.066

Neurocomputing

Volume 351, 25 July 2019, Pages 156-166

https://doi.org/10.1016/j.neucom.2019.03.066

Abstract

Saliency detection is an important problem in computer vision and pattern recognition. Many methods have been proposed for the saliency detection task. Among them, graph based saliency optimization is popular and has been widely studied. However, previous works have universally focused on single graph optimization, which fails to consider the multi-view feature representation of image content. In this paper, we first provide a general framework for traditional graph based saliency optimization models. Then, we extend the general framework to the multi-view case and propose a general multi-view graph based saliency optimization model. Finally, we present a particular implementation of the general model and derive an effective updating algorithm to solve it. Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed saliency model.

Introduction

Saliency detection aims to locate the most important and informative regions in an image by simulating human visual attention [1]. It is an important problem in computer vision and pattern recognition, and has been widely used in applications such as image cropping [2], [3], image segmentation [4], [5], [6], object tracking [7], object recognition [8], [9], and image compression [10], [11].

In the past few decades, many methods have been proposed for the saliency detection task. With the development of graph theory [12], graph based saliency optimization has been widely studied [13], [14], [15], [16]. However, previous works universally focus on single graph optimization. It is well known that human eyes are sensitive to multiple features, including color and texture [17]. Making full use of multiple features is therefore essential and is more consistent with the human vision mechanism. The key problem is how to integrate these features. One simple and straightforward way is to concatenate multiple features into a single feature vector and apply it to saliency detection directly [18]. Another way is to combine multiple saliency maps into a joint saliency map via a linear or nonlinear computation [19], [20].

One drawback of these traditional methods is that they ignore the potential relationships among different kinds of features. In recent years, there has been relevant research on this problem in machine learning and computer vision [21], [22], [23]. To make better use of multi-view features for representing an object, Xia et al. [21] propose a multi-view spectral embedding (MSE) method that concatenates different vectors together as a new vector. Zhang et al. [23] present a Latent Multi-view Subspace Clustering (LMSC) method that obtains a latent representation of the object and can be optimized efficiently. Inspired by these works, we propose a multi-view graph based saliency optimization method that uses different features from multiple feature spaces. The results are shown in Fig. 1. We first provide a general framework for traditional graph based saliency optimization models. Then, we extend the general framework to the multi-view case and propose our general multi-view graph based saliency optimization model. Finally, we provide a particular implementation of our general model and derive an effective updating algorithm to solve it. Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed saliency model.

Section snippets

Related work

In the past few decades, many methods have been proposed for the saliency detection task [24], [25], [26]. Generally speaking, these methods fall into two categories: top-down (task-driven) and bottom-up (data-driven). Among top-down models, with the development of deep learning, researchers have proposed many methods that obtain good saliency detection results [27], [28], [29], [30], [31] using convolutional neural networks or other networks. In this work, we focus on bottom-up

Graph based saliency optimization

Graph based optimization models have been successfully used for the saliency detection problem. Most of these models first segment the input image into $n$ non-overlapping super-pixels $S = \{s_{1}, s_{2}, \dots, s_{n}\}$ using the simple linear iterative clustering (SLIC) algorithm [33]. Then, they construct an undirected weighted graph $G(V, E)$ whose nodes $V$ represent the super-pixels $S$ and whose edges $E$ denote the relationships among super-pixels. The edge weights $W$ are defined as the similarity between the feature descriptors of
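The graph construction described above can be illustrated with a minimal sketch. The snippet does not show the exact weight formula or objective, so the Gaussian similarity kernel, the quadratic smoothness-plus-fidelity objective, and the names `gaussian_edge_weights`, `solve_saliency`, `lam` and `y` are all assumptions chosen for illustration, not the authors' formulation. Only the value $σ^{2} = 0.1$ is taken from the experimental settings.

```python
import numpy as np


def gaussian_edge_weights(features, edges, sigma2=0.1):
    """Build a symmetric weight matrix W over super-pixel nodes.

    Assumption: W[i, j] = exp(-||x_i - x_j||^2 / sigma2) on the listed edges.
    Features are expected to be scaled to a comparable range (e.g. [0, 1]).
    """
    n = features.shape[0]
    W = np.zeros((n, n))
    for i, j in edges:
        d2 = np.sum((features[i] - features[j]) ** 2)
        W[i, j] = W[j, i] = np.exp(-d2 / sigma2)
    return W


def solve_saliency(W, y, lam=1.0):
    """Minimise 0.5 * sum_ij W_ij (f_i - f_j)^2 + lam * ||f - y||^2.

    The smoothness term equals f^T L f with L = D - W, so setting the
    gradient to zero gives the linear system (L + lam * I) f = lam * y.
    """
    L = np.diag(W.sum(axis=1)) - W  # unnormalised graph Laplacian
    n = W.shape[0]
    return np.linalg.solve(L + lam * np.eye(n), lam * y)


# Toy example: four super-pixels in a chain with 1-D descriptors.
features = np.array([[0.0], [0.05], [0.9], [1.0]])
edges = [(0, 1), (1, 2), (2, 3)]
y = np.array([1.0, 0.0, 0.0, 0.0])  # hypothetical initial saliency cue
W = gaussian_edge_weights(features, edges)
f = solve_saliency(W, y)
```

In this toy case the saliency seeded at node 0 spreads strongly to node 1, whose descriptor is similar, and barely crosses the weak edge between nodes 1 and 2.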

Multi-view graph based saliency optimization

In this section, we extend the above graph based saliency optimization model to the multi-view case and propose a new general multi-view graph based saliency optimization model for the saliency detection problem.
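One generic way to move from a single graph to several views is to combine the per-view graph Laplacians with non-negative view weights before solving for the saliency vector. The sketch below shows that idea only. The objective, the fixed weights `alphas`, and the names `multiview_saliency`, `lam` and `y` are assumptions for illustration; the model and updating algorithm of this paper are not reproduced in the snippet and may differ, in particular in how the view weights are learned.

```python
import numpy as np


def laplacian(W):
    """Unnormalised graph Laplacian L = D - W of a symmetric weight matrix."""
    return np.diag(W.sum(axis=1)) - W


def multiview_saliency(Ws, y, alphas=None, lam=1.0):
    """Minimise sum_m alphas[m] * f^T L_m f + lam * ||f - y||^2.

    Ws     : list of (n, n) symmetric weight matrices, one per feature view
    y      : (n,) initial saliency cue (hypothetical input)
    alphas : per-view weights; uniform if omitted (an assumption, not learned)
    """
    n = Ws[0].shape[0]
    if alphas is None:
        alphas = np.full(len(Ws), 1.0 / len(Ws))
    alphas = np.asarray(alphas, dtype=float)
    L = sum(a * laplacian(W) for a, W in zip(alphas, Ws))
    # The stationarity condition is the linear system (L + lam * I) f = lam * y.
    return np.linalg.solve(L + lam * np.eye(n), lam * y)
```

With a single view and `alphas=[1.0]` this reduces to the usual single-graph solve, which is a convenient sanity check when experimenting with additional views.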

Multi-view feature extraction and graph construction

Given an input image $I$, we divide $I$ into $N$ super-pixels $S = \{s_{1}, s_{2}, \dots, s_{N}\}$. For each super-pixel $s_{i}$, we extract five types of visual features, namely the average of RGB values, the average of CIE LAB values, the LAB histogram, LBP and the HOG histogram [37], [39], and denote them as $\{x_{i}^{1}, x_{i}^{2}, \dots, x_{i}^{5}\}$, respectively. As suggested in other works [37], [39], the dimensions of the LAB, LBP and HOG histograms are 128, 59 and 8, respectively. For each feature, we construct a neighborhood graph $G^{m}(V, E^{m}), m = 1, 2, \dots, 5$, whose nodes represent the
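As a rough illustration of per-view feature extraction and graph construction, the sketch below averages pixel values inside each super-pixel and builds a neighborhood graph for one view. The snippet is truncated before it defines the edge set $E^{m}$, so the k-nearest-neighbor rule in feature space and the Gaussian weights are assumptions, as are the function names. The defaults `k=4` and `sigma2=0.1` echo the experimental settings, although the source does not confirm that $k$ is a neighbor count. Histogram features such as LBP and HOG are not computed here.

```python
import numpy as np


def mean_feature_per_superpixel(image, labels):
    """Average the pixel values inside each super-pixel.

    image  : (H, W, C) array, e.g. RGB or CIE LAB values
    labels : (H, W) integer array with contiguous labels 0..N-1
    """
    n = labels.max() + 1
    flat_labels = labels.ravel()
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    counts = np.bincount(flat_labels, minlength=n)
    sums = np.zeros((n, pixels.shape[1]))
    np.add.at(sums, flat_labels, pixels)  # accumulate pixel values per label
    return sums / counts[:, None]


def knn_graph(features, k=4, sigma2=0.1):
    """Symmetric k-nearest-neighbor graph with Gaussian weights (assumed form)."""
    n = features.shape[0]
    diff = features[:, None, :] - features[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a node is never its own neighbor
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), nearest.shape[1])
    cols = nearest.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)  # keep an edge if either endpoint selected it
```

Calling `knn_graph` once per feature type yields one weight matrix per view, so the views share the same nodes but can have different edge sets.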

Experiments

We set the number of features $M = 5$, the edge weight parameter $\sigma^{2} = 0.1$, and the two parameters $\beta = 4$ and $k = 4$ in all experiments. We evaluate our method on three public datasets: SED [40], SOD [41] and ASD [42]. SED has 200 images: one hundred contain only one salient object, and the other hundred contain two salient objects. SOD contains 300 images and is based on the Berkeley segmentation dataset (BSD) [43], in which the consistency score is computed by seven subjects who are asked to choose one or

Conclusion

We propose a general framework for traditional graph based saliency optimization models and extend it to the multi-view case, yielding a general multi-view graph based saliency optimization model, in contrast to previous works that focus on single graph optimization. We provide a particular implementation of the general model and solve it with an effective updating algorithm. Experimental results on several benchmark datasets

Acknowledgments

This work was sponsored by the National Natural Science Foundation of China (Nos. 61472002, 61602001, 61502006).

Yun Xiao received the B.S. degree in mathematics and applied mathematics and the M.Eng. degree in computer science from Anhui University of China in 2008 and 2011, respectively. She is currently a Lecturer and a Ph.D. student in computer science at Anhui University. Her current research interests include computer vision and saliency detection.

References (64)

W. Wang et al., A deep network solution for attention and aesthetics aware photo cropping, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
S. Buoncompagni et al., Saliency-based keypoint selection for fast object detection and matching, Pattern Recognit. Lett. (2015)
Y. Xiao et al., A prior regularized multi-layer graph ranking model for image saliency computation, Neurocomputing (2018)
A.M. Treisman et al., A feature-integration theory of attention, Cognit. Psychol. (1980)
W. Wang et al., Video salient object detection via fully convolutional networks, IEEE Trans. Image Process. (2017)
Q. Wang et al., Grab: visual saliency via novel graph model and background priors, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016)
N. Tong et al., Salient object detection via global and local cues, Pattern Recognit. (2015)
C. Yang et al., Saliency detection via graph-based manifold ranking, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2013)
W. Wang et al., Stereoscopic thumbnail creation via efficient stereo saliency detection, IEEE Trans. Visual. Comput. Graph. (2017)
L. V et al., Image segmentation with a bounding box prior, Proceedings of the IEEE International Conference on Computer Vision (2009)
W. Wang et al., Robust video object co-segmentation, IEEE Trans. Image Process. (2015)
J. Shen et al., Real-time superpixel segmentation by DBSCAN clustering algorithm, IEEE Trans. Image Process. (2016)
A. Borji et al., Adaptive object tracking by learning background context, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2012)
Z. Ren et al., Region-based saliency detection and its application in object recognition, IEEE Trans. Circuits Syst. Video Technol. (2014)
L. Itti, Automatic foveation for video compression using a neurobiological model of visual attention, IEEE Trans. Image Process. (2004)
Y. Fang et al., Saliency detection in the compressed domain for adaptive image retargeting, IEEE Trans. Image Process. (2012)
J. Bo et al., Graph-Laplacian PCA: closed-form solution and robustness, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2013)
J. Harel et al., Graph-based visual saliency, Adv. Neural Inf. Process. Syst. (2006)
W. Zhu et al., Saliency optimization from robust background detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2014)
J. Bo et al., Saliency detection via a multi-layer graph based diffusion model, Neurocomputing (2018)
X. Shen et al., A unified approach to salient object detection via low rank matrix recovery, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2012)
L. Itti et al., A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
Q. Zhao et al., Learning visual saliency by combining feature maps in a nonlinear manner using adaboost, J. Vis. (2012)
T. Xia et al., Multiview spectral embedding, IEEE Trans. Syst. Man Cybern. Part B (2010)
X. Cao et al., Self-adaptively weighted co-saliency detection via rank constraint, IEEE Trans. Image Process. (2014)
C. Zhang et al., Generalized latent multi-view subspace clustering, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
A. Borji et al., Salient object detection: a benchmark, IEEE Trans. Image Process. (2015)
T. Wang et al., A stagewise refinement model for detecting salient objects in images, Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
W. Wang et al., Revisiting video saliency: a large-scale benchmark and a new model, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
W. Wang et al., Consistent video saliency using local gradient flow optimization and global refinement, IEEE Trans. Image Process. (2015)
W. Wang et al., Correspondence driven saliency transfer, IEEE Trans. Image Process. (2016)
W. Wang et al., Saliency-aware video object segmentation, IEEE Trans. Pattern Anal. Mach. Intell. (2018)


Bo Jiang received the B.S. degree in mathematics and applied mathematics and the M.Eng. and Ph.D. degrees in computer science from Anhui University of China in 2009, 2012 and 2015, respectively. He is currently an associate professor in computer science at Anhui University. His current research interests include image feature extraction and matching, data representation and learning.

Aihua Zheng received the Ph.D. degree in computer science from the University of Greenwich, UK, in 2012. She is currently an Associate Professor at Anhui University. Her current research interests include vision based artificial intelligence and pattern recognition, person/vehicle re-identification, and moving object detection.

Aiwu Zhou received the M.Eng. degree in Computer Science in 1989 from Anhui University, Hefei, China. Since 1998, she has been an Associate Professor in the School of Computer Science and Technology at Anhui University. Her research interests include Software Engineering and Information Systems, computer vision and so on.

Amir Hussain received the B. Eng. degree and the Ph.D. degree in Electronic & Electrical Engineering from University of Strathclyde, Scotland, UK, in 1992 and 1997, respectively. He is a Professor in Computing Science, Edinburgh Napier University in Scotland, UK. His research interests include cognitive computation, machine learning and computer vision.

Jin Tang received the B.Eng. degree in automation in 1999, and the Ph.D. degree in computer science in 2007 from Anhui University, Hefei, China. Since 2012, he has been a Professor in the School of Computer Science and Technology at Anhui University. His research interests include image processing, pattern recognition, machine learning and computer vision.
