Semantic feature-guided and correlation-aggregated salient object detection

Applied Intelligence

Abstract

Most current salient object detection (SOD) methods employ an encoder-decoder architecture based on fully convolutional neural networks. However, the subjective nature of the salient object detection task and the local nature of convolutional processing can cause global contextual information to be lost. In addition, feature fusion without information filtering may introduce noise and thus weaken the ability to localize salient objects. Therefore, we propose a transformer-based semantic feature-guided and correlation-aggregated salient object detection (SFC-SOD) method. Specifically, the method takes a pyramid vision transformer (PVT) as the encoder backbone to extract features and designs a top-level feature guidance (TFG) module in the decoder to explore the correlation between the highest-level features and the low-level features. The low-level features are guided along the channel dimension to strengthen their representation. Based on the features obtained from TFG, the adaptive feature fusion (AFF) module is designed to fuse the essential features of different layers efficiently, retaining critical saliency information while reducing redundant features. After feature fusion, the top-down correlation-aggregation (TCA) module is introduced to further enhance and refine the salient features: the high-level output results guide the lower-level features to establish global dependencies, thus achieving better saliency results. Extensive experiments on six widely used datasets show that the proposed SFC-SOD outperforms several state-of-the-art methods.
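
To make the idea of channel-dimension guidance concrete, the following is a minimal sketch and not the TFG module itself, whose internals the abstract does not specify. It assumes a squeeze-and-excitation style gate as a stand-in: the top-level feature is average-pooled per channel, passed through a sigmoid, and used to reweight the matching low-level channels in residual form. The function names `global_average_pool` and `channel_guidance`, the nested-list `C x H x W` layout, and the equal-channel-count requirement are all assumptions made for illustration.

```python
import math


def global_average_pool(feature):
    """Reduce a C x H x W nested-list feature map to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature
    ]


def channel_guidance(top_feature, low_feature):
    """Reweight low-level channels with gates derived from the top-level feature.

    Hypothetical simplification: both inputs must have the same number of
    channels; their spatial sizes may differ.
    """
    if len(top_feature) != len(low_feature):
        raise ValueError("top and low features must have the same channel count")

    # One sigmoid gate in (0, 1) per channel, computed from the pooled top feature.
    gates = [1.0 / (1.0 + math.exp(-v)) for v in global_average_pool(top_feature)]

    # Residual reweighting: every low-level value x becomes x + gate * x,
    # so the original low-level detail is kept and emphasized, never removed.
    return [
        [[x * (1.0 + g) for x in row] for row in channel]
        for channel, g in zip(low_feature, gates)
    ]


# Two channels: the top-level response is neutral for channel 0, strong for channel 1.
top = [[[0.0]], [[100.0]]]
low = [[[2.0, 4.0]], [[2.0, 4.0]]]
guided = channel_guidance(top, low)
```

In this toy input, the channel with the stronger top-level response receives the larger gate, so its low-level values are amplified more than those of the neutral channel. A learned module would typically place trainable layers between the pooling and the sigmoid.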

Data Availability Statement

The raw data required to reproduce these findings cannot be shared, as the data also form part of an ongoing study.

Acknowledgements

This work was supported in part by the Key Research and Development and Promotion Projects in Henan Province under Grant 212102210151, in part by the Kaifeng Science and Technology Development Program under Grant 2101006, and in part by the Postgraduate Cultivating Innovation and Quality Improvement Action Plan of Henan University under Grant SYLYC2022218.

Author information

Contributions

Jincheng Luo: writing - original draft, writing - review & editing, methodology, software. Yongjun Li: supervision, resources, project administration, conceptualization. Bo Li: software, validation. Xinru Zhang: software, validation. Chaoyue Li: software, validation. Zhimin Chenjin: software, validation. Dongming Zhang: supervision.

Corresponding author

Correspondence to Yongjun Li.

Ethics declarations

Ethical and informed consent for data used

No statement from the author regarding ethical and informed consent for the data used.

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luo, J., Li, Y., Li, B. et al. Semantic feature-guided and correlation-aggregated salient object detection. Appl Intell 53, 30169–30185 (2023). https://doi.org/10.1007/s10489-023-05141-y

  • DOI: https://doi.org/10.1007/s10489-023-05141-y
