Abstract
Salient object detection (SOD) is a fundamental research topic in computer vision and has attracted significant interest from various fields. However, the rapid development of saliency detection has also revealed two issues. (1) The salient regions in high-resolution images differ significantly in location, structure, and edge details, which makes them difficult to recognize and delineate. (2) Traditional saliency detection architectures are insensitive to targets in high-resolution feature spaces, which leads to incomplete saliency predictions. To address these limitations, this paper proposes a novel embedded cross framework with a dual-path transformer (ECF-DT) for high-resolution SOD. The framework consists of a dual-path transformer and a unit fusion module for segmenting the salient targets. Specifically, we first design a cross network as a baseline model for salient object detection. Then, the dual-path transformer is embedded into the cross network to integrate fine-grained visual contextual information and target details while suppressing the disparity of the feature space. To generate more robust feature representations, we also introduce a unit fusion module, which highlights the positive information in the feature channels and encourages saliency prediction. Extensive experiments are conducted on nine benchmark datasets, and the performance of ECF-DT is compared with that of existing state-of-the-art methods. The results indicate that our method outperforms its competitors and accurately detects targets in high-resolution images with large objects, cluttered backgrounds, and complex scenes. It achieves MAEs of 0.017, 0.026, and 0.031 on three high-resolution public datasets. Moreover, it reaches S-measure scores of 0.909, 0.876, 0.936, 0.854, 0.929, and 0.826 on six low-resolution public datasets.
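For readers unfamiliar with the MAE figures quoted above, the following is a minimal sketch of how mean absolute error is commonly computed in the SOD literature: the average per-pixel absolute difference between a predicted saliency map and its binary ground-truth mask, with both scaled to [0, 1]. The function name `mean_absolute_error` and the toy values are hypothetical, and the exact evaluation protocol used for ECF-DT is not stated in this section, so this should be read as an illustration of the metric rather than the authors' evaluation code.

```python
def mean_absolute_error(pred, gt):
    """Average per-pixel |pred - gt| for two equally sized 2-D maps in [0, 1]."""
    # Both maps must have the same number of rows and the same row lengths.
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must not be empty")
    return total / count


# Toy 2x2 example with hypothetical values (not taken from the paper).
pred = [[0.9, 0.1], [0.8, 0.0]]  # predicted saliency, already scaled to [0, 1]
gt = [[1.0, 0.0], [1.0, 0.0]]    # binary ground-truth mask
toy_mae = mean_absolute_error(pred, gt)
```

Lower values are better: an MAE of 0 means the prediction matches the mask at every pixel. A dataset-level score is usually obtained by averaging the per-image values.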
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the Natural Science Foundation Guide Project of Liaoning Province (No. 2022BS105), the Fundamental Research Funds for Technical Study of Ministry of Public Security of China (No. 2023JSYJC23), the Public Security Theory and Soft Science Foundation of Ministry of Public Security of China (No. 2023LL21), and the Fundamental Research Funds of Criminal Investigation Police University of China (No. D2022056).
Author information
Contributions
Baoyu Wang: Conceptualization, Methodology, Software, Investigation, Data Curation, Writing-Original Draft; Mao Yang: Formal analysis, Project administration, Funding acquisition; Pingping Cao: Methodology, Project administration, Supervision; Yan Liu: Validation, Supervision.
Ethics declarations
Ethical and informed consent for data used
The data used in this paper are from publicly available datasets and do not violate any ethical guidelines.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, B., Yang, M., Cao, P. et al. A novel embedded cross framework for high-resolution salient object detection. Appl Intell 55, 277 (2025). https://doi.org/10.1007/s10489-024-06073-x
DOI: https://doi.org/10.1007/s10489-024-06073-x