Abstract
Thermal infrared sensors offer unique advantages under insufficient illumination, in complex scenes, and when object appearance is occluded. RGB-T salient object detection methods integrate the complementary strengths of the visual and thermal modalities to capture salient objects more accurately. Considering the characteristics of visual and thermal images, we combine CNN feature fusion with result-level saliency map fusion to achieve RGB-T salient object detection. First, a two-stream encoder-decoder network is proposed to handle the different saliency cues within RGB-T images. Specifically, a global attention module introduces the complementary saliency cues of the thermal image into the visual image, thereby keeping salient object locations consistent. Subsequently, the two-stream decoder module gradually fuses high-level salient object location cues with low-level detail cues to obtain single-modality saliency maps. Then, these saliency maps are fused and refined by the proposed result saliency map fusion method to obtain a high-precision final saliency map. In this way, the salient object is segmented with a fine boundary, and noise inside the salient object is effectively suppressed. Experimental results demonstrate the effectiveness of each component of the CNN feature fusion and result saliency map fusion methods. The proposed method exploits the complementarity of RGB-T images and performs favorably against state-of-the-art methods, especially under the challenges of low illumination, cluttered background, and low contrast.
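To make the result-level stage concrete, the following is a minimal sketch of fusing two single-modality saliency maps into one. It is not the fusion method proposed in the paper, whose details are not given in the abstract. The weighting rule (each pixel weighted by its distance from 0.5, so the more decisive modality dominates), the function name `fuse_saliency_maps`, and the toy values are all assumptions made for illustration.

```python
from typing import List

Map = List[List[float]]  # saliency values in [0, 1], stored row by row


def fuse_saliency_maps(rgb_map: Map, thermal_map: Map, eps: float = 1e-6) -> Map:
    """Fuse two single-modality saliency maps pixel by pixel.

    Hypothetical rule (not the paper's method): each pixel is weighted by
    its distance from 0.5, so the more decisive modality dominates.
    """
    if len(rgb_map) != len(thermal_map):
        raise ValueError("maps must have the same height")
    fused: Map = []
    for rgb_row, thermal_row in zip(rgb_map, thermal_map):
        if len(rgb_row) != len(thermal_row):
            raise ValueError("maps must have the same width")
        row = []
        for s_rgb, s_t in zip(rgb_row, thermal_row):
            # eps keeps the denominator non-zero when both values equal 0.5
            w_rgb = abs(s_rgb - 0.5) + eps
            w_t = abs(s_t - 0.5) + eps
            row.append((w_rgb * s_rgb + w_t * s_t) / (w_rgb + w_t))
        fused.append(row)
    return fused


# Toy 2x2 example: the RGB map is uncertain (as in low light),
# while the thermal map is decisive.
rgb = [[0.5, 0.6], [0.4, 0.5]]
thermal = [[0.9, 0.1], [0.95, 0.05]]
fused = fuse_saliency_maps(rgb, thermal)
```

Because every fused value is a convex combination of the two inputs, the output stays in [0, 1]. Where the RGB response is ambiguous, the fused value follows the thermal map, which mirrors the low-illumination case described above.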
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (grant number 62001156) and by the Key Research and Development Plan of Jiangsu Province (grant numbers BE2019036 and BE2020092).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Xu, C., Li, Q., Zhou, M. et al. RGB-T salient object detection via CNN feature and result saliency map fusion. Appl Intell 52, 11343–11362 (2022). https://doi.org/10.1007/s10489-021-02984-1
DOI: https://doi.org/10.1007/s10489-021-02984-1