RGB-T salient object detection via CNN feature and result saliency map fusion

Applied Intelligence

Abstract

Thermal infrared sensors have unique advantages under insufficient illumination, in complex scenes, and when object appearance is occluded. RGB-T salient object detection methods combine the complementary strengths of the visual and thermal modalities to capture salient objects more accurately. Considering the characteristics of visual and thermal images, we combine CNN feature fusion with result saliency map fusion to achieve RGB-T salient object detection. First, a two-stream encoder-decoder network is proposed to handle the different saliency cues within RGB-T images. Specifically, a global attention module introduces the complementary saliency cues of the thermal image into the visual image, thereby keeping the salient object locations consistent. Subsequently, the two-stream decoder module gradually fuses the high-level salient object location cues with low-level detail cues to obtain single-modality saliency maps. Then, these saliency maps are fused and refined by the proposed result saliency map fusion method to obtain a high-precision final saliency map. In this way, the salient object is segmented with a fine boundary, and noise inside the salient object is effectively suppressed. Experimental results demonstrate the effectiveness of each component of the CNN feature fusion and result saliency map fusion methods. The proposed method exploits the complementarity of RGB-T images and performs favorably against state-of-the-art methods, especially under low illumination, cluttered backgrounds, and low contrast.
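To make the idea of result-level fusion concrete, the following is a minimal sketch of how two single-modality saliency maps could be merged pixel by pixel. It is not the fusion method proposed in the article, whose details the abstract does not give. The function name `fuse_saliency_maps`, the confidence measure (distance of a saliency value from 0.5), and the `eps` parameter are all assumptions chosen for illustration.

```python
import numpy as np


def fuse_saliency_maps(sal_rgb, sal_thermal, eps=1e-8):
    """Fuse two saliency maps with values in [0, 1] into a single map.

    Illustrative only: each pixel is a weighted average of the two inputs,
    where the weight is a hypothetical confidence score, namely how far the
    saliency value lies from the undecided value 0.5.
    """
    sal_rgb = np.asarray(sal_rgb, dtype=np.float64)
    sal_thermal = np.asarray(sal_thermal, dtype=np.float64)
    if sal_rgb.shape != sal_thermal.shape:
        raise ValueError("saliency maps must have the same shape")

    # Assumed confidence: values near 0 or 1 are treated as more reliable.
    conf_rgb = np.abs(sal_rgb - 0.5)
    conf_thermal = np.abs(sal_thermal - 0.5)
    total = conf_rgb + conf_thermal

    # Where both maps are undecided, fall back to a plain average.
    w_rgb = np.where(total > eps, conf_rgb / np.maximum(total, eps), 0.5)

    fused = w_rgb * sal_rgb + (1.0 - w_rgb) * sal_thermal
    return np.clip(fused, 0.0, 1.0)


if __name__ == "__main__":
    rgb = [[0.9, 0.5], [0.1, 0.5]]
    thermal = [[0.9, 1.0], [0.5, 0.5]]
    print(fuse_saliency_maps(rgb, thermal))
```

In this sketch a pixel follows whichever modality is more decisive, which loosely mirrors the motivation stated above: the thermal map can take over where the visual map is uninformative, for example under low illumination.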

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (grant number 62001156) and by the Key Research and Development Plan of Jiangsu Province (grant numbers BE2019036 and BE2020092).

Author information

Corresponding author

Correspondence to Qingwu Li.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, C., Li, Q., Zhou, M. et al. RGB-T salient object detection via CNN feature and result saliency map fusion. Appl Intell 52, 11343–11362 (2022). https://doi.org/10.1007/s10489-021-02984-1

  • DOI: https://doi.org/10.1007/s10489-021-02984-1