RGB-T salient object detection via CNN feature and result saliency map fusion

Xu, Chang; Li, Qingwu; Zhou, Mingyu; Zhou, Qingkai; Zhou, Yaqin; Ma, Yunpeng

doi:10.1007/s10489-021-02984-1

RGB-T salient object detection via CNN feature and result saliency map fusion

Published: 24 January 2022

Volume 52, pages 11343–11362, (2022)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Chang Xu¹,
Qingwu Li ORCID: orcid.org/0000-0003-3224-9831¹,
Mingyu Zhou¹,
Qingkai Zhou¹,
Yaqin Zhou¹ &
…
Yunpeng Ma¹

1220 Accesses
15 Citations
1 Altmetric
Explore all metrics

Abstract

Thermal infrared sensors have unique advantages under the conditions of insufficient illumination, complex scenarios, or occluded appearances. RGB-T salient object detection methods integrate the complementary advantages of visual and thermal modalities to capture salient objects more accurately. Considering the characteristics of visual and thermal images, we combine CNN feature and result saliency map fusion methods to achieve RGB-T salient object detection. First, a two-stream encoder-decoder network is proposed to handle the different saliency cues within RGB-T images. Specifically, the global attention module introduces the complementary saliency cues within thermal images to visual images, thereby ensuring the consistency of salient object locations. Subsequently, the two-stream decoder module gradually fuses the high-level salient object location cues with low-level detail saliency cues to obtain single-modality saliency maps. Then, saliency maps are fused and refined by the proposed result saliency map fusion method to achieve the final saliency map with high precision. In this way, the salient object is segmented with the fine boundary, and the noise inside the salient object is effectively suppressed. Experimental results demonstrate the effectiveness of each component within the CNN feature and result saliency map fusion methods. The proposed method facilitates desirable complementation for RGB-T images and performs favorably against state-of-the-art methods, especially in the challenges of low illumination, cluttered background, and low contrast.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Interactive context-aware network for RGB-T salient object detection

Article 08 February 2024

Modal complementary fusion network for RGB-T salient object detection

Article 05 August 2022

Cross-Collaboration Weighted Fusion Network for RGB-T Salient Detection

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

References

Marchesotti L, Cifarelli C, Csurka G A framework for visual saliency detection with applications to image thumbnailing. In: Proceedings of the IEEE International Conference on Computer vision(ICCV), pp 2232–2239
Qin X, He S, Yang X, Dehghan M, Qin Q, Martin J (2018) Accurate outline extraction of individual building from very high-resolution optical images. IEEE Geosci Remote Sens Lett 15(11):1775–1779
Article Google Scholar
Borji A, Cheng M-M, Jiang H, Li J (2015) Salient object detection: A benchmark. IEEE Trans Image Process 24(12):5706–5722
Rutishauser U, Walther D, Koch C, Perona P (2004) Is bottom-up attention useful for object recognition?. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern recognition(CVPR), vol 2, pp II–II
Itti L (2004) Automatic foveation for video compression using a neurobiological model of visual attention. IEEE Trans Image Process 13(10):1304–1318
Guo C, Zhang L (2010) A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE Trans Image Process 19(1):185–198
Article MathSciNet Google Scholar
Zhao C, Huang Y, Qiu S (2019) Infrared and visible image fusion algorithm based on saliency detection and adaptive double-channel spiking cortical model. Infrared Phys Technol 102:102976
Article Google Scholar
Minghui S, Liu L, Yuanxi P, Tian J, Li J (2019) Infrared and visible images fusion based on redundant directional lifting-based wavelet and saliency detection. Infrared Phys Technol 101:45–55
Article Google Scholar
Chattopadhay A, Sarkar A, Howlader P, Balasubramanian VN (2018) Grad-cam++ Generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp 839–847
Liu N, Han J, Yang M (2018) Picanet: learning pixel-wise contextual attention for saliency detect ion. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 3089–3098
Gao Y, Li C, Zhu Y, Tang J, He T (2019) Deep adaptive fusion network for high performance rgbt tracking. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, pp 91–99, 10
Zimmermann C, Welschehold T, Dornhege C (2018) Wolfram Burgard, and Thomas Brox. 3d human pose estimation in rgbd images for robotic task learning. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp 1986–1992
Ji Y, Zhang H, Zhang Z, Liu M (2021) Cnn-based encoder-decoder networks for salient object detection A comprehensive review and recent advances. Inf Sci 546:835–857
Article MathSciNet Google Scholar
Li C, Cong R, Kwong S, Hou J, Fu H, Zhu G, Zhang D, Huang Q (2021) Asif-net: Attention steered interweave fusion network for rgb-d salient object detection. IEEE Trans Cybern 51(1):88–100
Article Google Scholar
Tu Z, Xia T, Li C, Wang X, Ma Y, Tang J (2019) Rgb-t image saliency detection via collaborative graph learning. IEEE Trans Multimed 22(1):160–173, 06
Bai X, Yu Z, Zhou F, Xue B (2015) Quadtree-based multi-focus image fusion using a weighted focus-measure. Inf Fusion 22:105–118, 03
Zhang L (2008) In situ image segmentation using the convexity of illumination distribution of the light sources
Liu Z, Zhang X, Luo S, Meur OL (2014) Superpixel-based spatiotemporal saliency detection. IEEE Trans Circ Syst Video Technol 24(9):1522–1540
Article Google Scholar
Wang Q, Zheng W, Piramuthu R (2016) Grab: visual saliency via novel graph model and background priors. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 535–543
Ren J, Gong X, Yu L, Zhou W, Yang MY (2015) Exploiting global priors for rgb-d saliency detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pp 25–32
Zhu W, Liang S, Wei Y, Sun J (2014) Saliency optimization from robust background detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 2814–2821
Yang C, Zhang L, Lu H, Ruan X, Yang M-H (2013) Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 3166–3173
Li H, Lu H, Lin Z, Shen X, Price B (2015) Inner and inter label propagation: salient object detection in the wild. IEEE Trans Image Process 24(10):3176–3186
Article MathSciNet Google Scholar
Li C, Cong R, Piao Y, Xu Q, Loy CC (2020) Rgb-d salient object detection with cross-modality modulation and selection.. In: Proceedings of the European Conference on Computer Vision (ECCV). Springer International Publishing, Cham, pp 225–241
Qin X, Zhang Z, Huang C, Gao C, Dehghan M, Jagersand M (2019) Basnet: Boundary-aware salient object detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 7471– 7481
Wang W, Lai Q, Fu H, Shen J, Ling H, Yang R (2021) Salient object detection in the deep learning era An in-depth survey. IEEE Trans Pattern Anal Mach Intell 1:1–1
Google Scholar
Zhao J, Liu J, Fan D, Cao Y, Yang J, Cheng M-M (2019) Egnet: Edge guidance network for salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision(ICCV), pp 8778–8787
Zhao T, Wu X (2019) Pyramid feature attention network for saliency detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 3080–3089
Liu N, Ni Z, Han J (2020) Learning selective self-mutual attention for rgb-d saliency detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 13753–13762
Fan D, Cheng M, Liu Y, Li T, Borji A (2017) Structure-measure: a new way to evaluate foreground maps. In: Proceedings of the IEEE International Conference on Computer Vision(ICCV), pp 4558–4567
Zhang J, Fan D, Dai Y, Anwar S, Saleh FS, Zhang T, Barnes N (2020) Uc-net: uncertainty inspired rgb-d saliency detection via conditional variational autoencoders. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 8579–8588
Zhang M, Ren W, Piao Y, Rong Z, Lu H (2020) Select, supplement and focus for rgb-d saliency detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 3469–3478, 06
Zhang Z, Lin Z, Xu J, Jin W-D, Lu S-P, Fan D-P (2021) Bilateral attention network for rgb-d salient object detection. IEEE Trans Image Process 30:1949–1961
Article Google Scholar
Qu L, He S, Zhang J, Tian J, Tang Y, Yang Q (2017) Rgbd salient object detection via deep fusion. IEEE Trans Image Process 26(5):2274–2285
Article MathSciNet Google Scholar
Fan D-P, Lin Z, Zhang Z, Zhu M, Cheng M-M (2020) Rethinking rgb-d salient object detection: Models, data sets, and large-scale benchmarks. IEEE Trans Neural Netw Learn Syst 32(5):2075–2089
Article Google Scholar
Liu Z, Shi S, Duan Q, Zhang W, Zhao P (2019) Salient object detection for rgb-d image by single stream recurrent convolution neural network. Neurocomputing 363(07):46–57
Google Scholar
Zhao X, Zhang L, Pang Y, Lu H, Zhang L (2020) A single stream network for robust and real-time rgb-d salient object detection. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 646–662
Chen H, Li Y, Su D (2019) Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for rgb-d salient object detection. Pattern Recogn 86:376–385
Article Google Scholar
Fu K, Fan D-P, Ji G-P, Zhao Q (2020) Jl-dcf: Joint learning and densely-cooperative fusion framework for rgb-d salient object detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 3052–3062
Fan D-P, Zhai A, Borji Y, Yang J, Shao L (2020) Bbs-net: Rgb-d salient object detection with a bifurcated backbone strategy network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 275–292
Wang N, Gong X (2019) Adaptive fusion for rgb-d salient object detection. IEEE Access 7:55277–55284
Article Google Scholar
Guo J, Ren T, Bei J (2016) Salient object detection for rgb-d image via saliency evolution. In: Proceedings of the IEEE International Conference on Multimedia and Expo(ICME), pp 1–6
Dinh P-H (2021) Combining gabor energy with equilibrium optimizer algorithm for multi-modality medical image fusion. Biomed Signal Process Control 68:102696
Article Google Scholar
Dinh P-H (2021) A novel approach based on three-scale image decomposition and marine predators algorithm for multi-modal medical image fusion. Biomed Signal Process Control 67:102536
Article Google Scholar
Dinh P-H (2021) A novel approach based on grasshopper optimization algorithm for medical image fusion. Expert Syst Appl 171:114576
Article Google Scholar
Dinh P-H (2021) Multi-modal medical image fusion based on equilibrium optimizer algorithm and local energy functions. Appl Intell:04
Abualigah L, Diabat A, Mirjalili S, Elaziz MA, Gandomi AH (2021) The arithmetic optimization algorithm. Comput Methods Appl Mech Eng 376:113609
Article MathSciNet Google Scholar
Ahmadianfar I, Bozorg-Haddad O, Chu X (2020) Gradient-based optimizer: a new metaheuristic optimization algorithm. Inf Sci 540:131–159
Article MathSciNet Google Scholar
Singh VK, Kumar N Soft: salient object detection based on feature combination using teaching-learning-based optimization. Signal Image and Video Processing
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 3431–3440
Gong A, Huang L, Shi J, Liu C (2021) Unsupervised rgb-t saliency detection by node classification distance and sparse constrained graph learning. Appl Intell:05
Ma Y, Sun D, Meng Q, Ding Z, Li C (2017) Learning multiscale deep features and svm regressors for adaptive rgb-t saliency detection. In: 2017 10th International Symposium on Computational Intelligence and Design (ISCID), pp 389–392, 12
Tu Z, Li Z, Li C, Lang Y, Tang J (2021) Multi-interactive siamese decoder for rgbt salient object detection
Zhang Q, Huang N, Yao L, Zhang D, Shan C, Han J (2020) Rgb-t salient object detection via fusing multi-level cnn features. IEEE Trans Image Process 29:3321–3335
Article Google Scholar
Lazebnik S., Schmid C., Ponce J (2006) Beyond bags of features Spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), vol 2, pp 2169–2178
Zhu Z, Xu M, Bai S, Huang T, Bai X (2019) Asymmetric non-local neural networks for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision(ICCV), pp 593–602, 10
Yu C, Liu Y, Gao C, Shen C, Sang N (2020) Representative graph neural network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 379–396
Nasir Baig M, Riaz M, Ghafoor A, Siddiqui AM (2016) Image dehazing using quadtree decomposition and entropy-based contextual regularization. IEEE Signal Process Lett 23(6):853–857
Article Google Scholar
Sullivan GJ, Baker RL (1991) Efficient quadtree coding of images and video. In: [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing, pp 2661–2664
Tu Z, Ma Y, Li Z, Li C, Xu J, Liu Y (2020) Rgbt salient object detection: A large-scale dataset and benchmark
Wang G, Li C, Ma Y, Zheng A, Tang J, Luo B (2018) Rgb-t saliency detection benchmark: Dataset, baselines, analysis and a novel approach. In: Image and Graphics Technologies and Applications, pp 359–369
Shi J, Yan Q, Xu L, Jia J (2016) Hierarchical image saliency detection on extended cssd. IEEE Trans Pattern Anal Mach Intell 38(4):717–729
Article Google Scholar
Zhu J-Y, Park P, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In 2017 IEEE International Conference on Computer Vision (ICCV)
Peng H, Li B, Ling H, Hu W, Xiong W, Stephen J. (2017) Maybank: Salient object detection via structured matrix decomposition. IEEE Trans Pattern Anal Mach Intell 39(4):818–832
Article Google Scholar
Woo S, Park J, Lee J-Y, Kweon IN (2018) Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), volume 11211 LNCS, pp 3–19
Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) Eca-net: Efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 11531–11539, 06
Piao Y, Rong Z, Zhang M, Ren W, Lu H (2020) A2dele: adaptive and attentive depth distiller for efficient rgb-d salient object detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), pp 9057–9066
Zhang J, Sclaroff S, Lin Z, Shen X, Price B, Mech R (2015) Minimum barrier salient object detection at 80 fps. In: Proceedings of the IEEE International Conference on Computer Vision(ICCV), pp 1404–1412
Xiao X, Zhou Y, Gong Y-J (2019) Rgb-’d’ saliency detection with pseudo depth. IEEE Trans Image Process 28(5):2126– 2139
Article MathSciNet Google Scholar
Fan D-P, Lin Z, Zhang Z, Zhu Mx, Cheng M-M (2021) Rethinking rgb-d salient object detection: Models, data sets, and large-scale benchmarks. IEEE Trans Neural Netw Learn Syst 32(5):2075–2089
Article Google Scholar
Li G, Liu Z, Chen M, Bai Z, Lin W, Ling H (2021) Hierarchical alternate interaction network for rgb-d salient object detection. IEEE Trans Image Process 30:3528–3542
Article Google Scholar

Download references

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (grant number 62001156) and by the Key Research and Development Plan of Jiangsu Province (grant numbers BE2019036 and BE2020092).

Author information

Authors and Affiliations

College of Internet of Things Engineering, Hohai University, Changzhou, Jiangsu, 213022, China
Chang Xu, Qingwu Li, Mingyu Zhou, Qingkai Zhou, Yaqin Zhou & Yunpeng Ma

Authors

Chang Xu
View author publications
You can also search for this author in PubMed Google Scholar
Qingwu Li
View author publications
You can also search for this author in PubMed Google Scholar
Mingyu Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Qingkai Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Yaqin Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Yunpeng Ma
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Qingwu Li.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Xu, C., Li, Q., Zhou, M. et al. RGB-T salient object detection via CNN feature and result saliency map fusion. Appl Intell 52, 11343–11362 (2022). https://doi.org/10.1007/s10489-021-02984-1

Download citation

Accepted: 05 November 2021
Published: 24 January 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s10489-021-02984-1

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

RGB-T salient object detection via CNN feature and result saliency map fusion

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Interactive context-aware network for RGB-T salient object detection

Modal complementary fusion network for RGB-T salient object detection

Cross-Collaboration Weighted Fusion Network for RGB-T Salient Detection

Explore related subjects

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of Interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now