A novel embedded cross framework for high-resolution salient object detection

Applied Intelligence

Abstract

Salient object detection (SOD) is a fundamental research topic in computer vision and has attracted significant interest from various fields. The rapid development of the field, however, has revealed two issues. (1) The salient regions in high-resolution images differ significantly in location, structure, and edge details, which makes them difficult to recognize and delineate. (2) Traditional salient detection architectures are insensitive to targets in high-resolution feature spaces, which leads to incomplete saliency predictions. To address these limitations, this paper proposes a novel embedded cross framework with a dual-path transformer (ECF-DT) for high-resolution SOD. The framework consists of a dual-path transformer and a unit fusion module for segmenting the salient targets. Specifically, we first design a cross network as a baseline model for salient object detection. Then, the dual-path transformer is embedded into the cross network to integrate fine-grained visual contextual information and target details while suppressing the disparity of the feature space. To generate more robust feature representations, we also introduce a unit fusion module, which highlights the positive information in the feature channels and encourages saliency prediction. Extensive experiments are conducted on nine benchmark databases, and the performance of ECF-DT is compared with that of existing state-of-the-art methods. The results indicate that our method outperforms its competitors and accurately detects the targets in high-resolution images with large objects, cluttered backgrounds, and complex scenes. It achieves mean absolute errors (MAEs) of 0.017, 0.026, and 0.031 on three high-resolution public databases. Moreover, it reaches S-measure scores of 0.909, 0.876, 0.936, 0.854, 0.929, and 0.826 on six low-resolution public databases.
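
For readers unfamiliar with the MAE figures reported in the abstract, the metric is conventionally the mean absolute difference between a predicted saliency map and its binary ground-truth mask, with both scaled to [0, 1]. The following is a minimal sketch under that standard definition, not the authors' evaluation code. The function name `mean_absolute_error` and the toy 2×2 maps are illustrative assumptions.

```python
from typing import Sequence


def mean_absolute_error(pred: Sequence[Sequence[float]],
                        gt: Sequence[Sequence[float]]) -> float:
    """Mean absolute error between a saliency map and its ground truth.

    Both inputs are 2-D grids of equal shape with values in [0, 1].
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = 0.0
    count = 0
    # Accumulate the per-pixel absolute difference over the whole map.
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must not be empty")
    return total / count


# Toy example (hypothetical values, not data from the paper).
prediction = [[0.9, 0.1], [0.8, 0.0]]
ground_truth = [[1.0, 0.0], [1.0, 0.0]]
mae = mean_absolute_error(prediction, ground_truth)
```

Lower values are better: a perfect prediction gives 0, and the toy maps above give roughly 0.1. Benchmark scores are typically averaged over every image in a database.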

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported by the Natural Science Foundation Guide Project of Liaoning Province (No. 2022BS105), the Fundamental Research Funds for Technical Study of Ministry of Public Security of China (No. 2023JSYJC23), the Public Security Theory and Soft Science Foundation of Ministry of Public Security of China (No. 2023LL21), and the Fundamental Research Funds of Criminal Investigation Police University of China (No. D2022056).

Author information

Contributions

Baoyu Wang: Conceptualization, Methodology, Software, Investigation, Data Curation, Writing-Original Draft; Mao Yang: Formal analysis, Project administration, Funding acquisition; Pingping Cao: Methodology, Project administration, Supervision; Yan Liu: Validation, Supervision.

Corresponding author

Correspondence to Mao Yang.

Ethics declarations

Ethical and informed consent for data used

The data used in this paper are from publicly available datasets and do not violate any ethical guidelines.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, B., Yang, M., Cao, P. et al. A novel embedded cross framework for high-resolution salient object detection. Appl Intell 55, 277 (2025). https://doi.org/10.1007/s10489-024-06073-x

  • DOI: https://doi.org/10.1007/s10489-024-06073-x
