Abstract
Although fully convolutional neural networks (FCNs) have achieved good performance in salient object detection, problems remain, such as blurred boundaries and unsatisfactory performance in complex scenes. How to better integrate multi-level convolutional features therefore requires further investigation. This paper proposes a salient object detection algorithm that uses the Gram matrix and its Frobenius (F) norm to weigh the importance of each multi-level feature map, then uses these weights to fuse the multi-level prediction results recursively and generate the final saliency map. The algorithm evaluates the importance of multi-level feature maps at different depths by calculating the F norm of the Gram matrix of feature tensor slices. The multi-level feature maps are then fused according to these weights, which reduces the loss of multi-level prediction results during fusion and preserves spatial details. In addition, to obtain more accurate boundaries, deep supervision is used to optimize the salient feature maps: pixel-level supervision from the ground truth guides each layer's prediction. Experiments on five benchmark data sets demonstrate that the proposed method performs well in various scenes, especially complex ones.
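The weighting idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the division of the Gram matrix by the number of spatial positions, the sum-to-one normalization of the weights, and the running weighted average used for the recursive fusion are all assumptions made for illustration. The sketch also assumes that every level's prediction has already been resized to a common resolution.

```python
import numpy as np


def gram_frobenius_norm(features):
    """Frobenius norm of the Gram matrix of a (C, H, W) feature tensor."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)   # one row per channel slice
    gram = flat @ flat.T / (h * w)      # (C, C); the size normalization is an assumption
    return np.linalg.norm(gram, ord="fro")


def level_weights(feature_maps):
    """One weight per level, proportional to its Gram-matrix F norm (assumed scheme)."""
    norms = np.array([gram_frobenius_norm(f) for f in feature_maps])
    return norms / norms.sum()


def fuse_recursively(predictions, weights):
    """Fold same-sized predictions from the deepest level to the shallowest.

    Each step blends the running result with the next level's prediction,
    so the final map is a weighted average of all levels (assumed rule).
    """
    fused = predictions[-1]
    acc = weights[-1]
    for pred, w in zip(predictions[-2::-1], weights[-2::-1]):
        fused = (acc * fused + w * pred) / (acc + w)
        acc = acc + w
    return fused


# Hypothetical three-level example with random features and predictions.
rng = np.random.default_rng(0)
feats = [rng.standard_normal((4, 8, 8)),
         rng.standard_normal((8, 4, 4)),
         rng.standard_normal((16, 2, 2))]
preds = [rng.random((8, 8)) for _ in feats]
saliency = fuse_recursively(preds, level_weights(feats))
```

Under this fusion rule, a level whose Gram matrix has a larger F norm contributes more to the final map. A learned or softmax-based weighting could be substituted without changing the overall structure.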




Data availability
The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.
Funding
This research was supported by the National Natural Science Foundation of China (Grant No. 62172132).
Contributions
All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Shanqing, Z., Yujie, C., Yiheng, M. et al. A multi-level feature weight fusion model for salient object detection. Multimedia Systems 29, 887–895 (2023). https://doi.org/10.1007/s00530-022-01018-1
DOI: https://doi.org/10.1007/s00530-022-01018-1