A novel embedded cross framework for high-resolution salient object detection

Wang, Baoyu; Yang, Mao; Cao, Pingping; Liu, Yan

doi:10.1007/s10489-024-06073-x

Published: 07 January 2025

Applied Intelligence, volume 55, article number 277 (2025)

Baoyu Wang¹,², Mao Yang¹, Pingping Cao² & Yan Liu³

Abstract

Salient object detection (SOD) is a fundamental research topic in computer vision and has attracted significant interest from various fields. The rapid development of saliency detection, however, has revealed two issues. (1) The salient regions in high-resolution images exhibit significant differences in location, structure, and edge details, which makes them difficult to recognize and depict. (2) The traditional saliency detection architecture is insensitive to targets in high-resolution feature spaces, which leads to incomplete saliency predictions. To address these limitations, this paper proposes a novel embedded cross framework with a dual-path transformer (ECF-DT) for high-resolution SOD. The framework consists of a dual-path transformer and a unit fusion module for partitioning the salient targets. Specifically, we first design a cross network as a baseline model for salient object detection. Then, the dual-path transformer is embedded into the cross network with the objective of integrating fine-grained visual contextual information and target details while suppressing the disparity of the feature space. To generate more robust feature representations, we also introduce a unit fusion module, which highlights the positive information in the feature channels and encourages saliency prediction. Extensive experiments are conducted on nine benchmark databases, and the performance of the ECF-DT is compared with that of other existing state-of-the-art methods. The results indicate that our method outperforms its competitors and accurately detects the targets in high-resolution images with large objects, cluttered backgrounds, and complex scenes. It achieves MAEs of 0.017, 0.026, and 0.031 on three high-resolution public databases. Moreover, it reaches S-measure rates of 0.909, 0.876, 0.936, 0.854, 0.929, and 0.826 on six low-resolution public databases.
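For readers unfamiliar with the MAE figures quoted in the abstract, the following is a minimal sketch of how mean absolute error is commonly computed for saliency maps: the average per-pixel absolute difference between a predicted map and its ground-truth mask, both scaled to [0, 1]. The function name, the nested-list representation, and the toy values are assumptions made for illustration; this is not the authors' evaluation code, and the paper's exact protocol (for example, resizing or normalization) is not given in this preview.

```python
from typing import Sequence

Map2D = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map2D, gt: Map2D) -> float:
    """Average per-pixel absolute difference between two maps in [0, 1].

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Both maps must have the same number of rows and the same row lengths.
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")

    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1

    if count == 0:
        raise ValueError("maps must not be empty")
    return total / count


# Toy 2x2 example with made-up values (not results from the paper).
prediction = [[0.9, 0.1], [0.8, 0.0]]
ground_truth = [[1.0, 0.0], [1.0, 0.0]]
toy_mae = mean_absolute_error(prediction, ground_truth)
```

Lower values are better. A dataset-level MAE is typically the mean of this per-image value over all test images.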

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported by the Natural Science Foundation Guide Project of Liaoning Province (No. 2022BS105), the Fundamental Research Funds for Technical Study of Ministry of Public Security of China (No. 2023JSYJC23), the Public Security Theory and Soft Science Foundation of Ministry of Public Security of China (No. 2023LL21), and the Fundamental Research Funds of Criminal Investigation Police University of China (No. D2022056).

Author information

Authors and Affiliations

The Key Laboratory of Modern Power System Simulation and Control & Renewable Energy Technology, Ministry of Education (Northeast Electric Power University), Jilin, 132012, China
Baoyu Wang & Mao Yang
College of Basic Education and Research, Criminal Investigation Police University of China, Shenyang, 110854, China
Baoyu Wang & Pingping Cao
Faculty of Robot Science and Engineering, Northeastern University, Shenyang, 110819, China
Yan Liu

Contributions

Baoyu Wang: Conceptualization, Methodology, Software, Investigation, Data Curation, Writing-Original Draft; Mao Yang: Formal analysis, Project administration, Funding acquisition; Pingping Cao: Methodology, Project administration, Supervision; Yan Liu: Validation, Supervision.

Corresponding author

Correspondence to Mao Yang.

Ethics declarations

Ethical and informed consent for data used

The data used in this paper are from publicly available datasets and do not violate any ethical guidelines.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, B., Yang, M., Cao, P. et al. A novel embedded cross framework for high-resolution salient object detection. Appl Intell 55, 277 (2025). https://doi.org/10.1007/s10489-024-06073-x

Accepted: 13 November 2024
Published: 07 January 2025
DOI: https://doi.org/10.1007/s10489-024-06073-x
