MBDNet: Mitigating the “Under-Training Issue” in Dual-Encoder Model for RGB-d Salient Object Detection

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14090)

Abstract

Existing RGB-D salient object detection methods generally rely on a dual-encoder structure to extract RGB and depth features. However, we observe that the encoders in such models are often not trained adequately enough to learn strong feature representations. We name this problem the “under-training issue”. To mitigate it, we propose a multi-branch decoding network (MBDNet). MBDNet introduces additional supervised decoding branches to form a multi-branch decoding (MBD) structure, which facilitates the training of the encoders and enhances their feature representations. Specifically, to ensure that the introduced supervision is effective and to improve the performance of the additional decoding branches, we design an adaptive multi-scale decoding (AMSD) module. We also design a multi-branch feature aggregation (MBFA) module that aggregates the multi-branch features in MBD to further improve detection accuracy. In addition, we design an enhancement complement fusion (ECF) module for multi-modality feature fusion. Extensive experiments demonstrate that MBDNet outperforms other state-of-the-art methods and mitigates the “under-training issue”.
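
The idea of supervising additional decoding branches can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss function, the branch weighting, or the internals of AMSD, MBFA, and ECF, so the sketch assumes a plain binary cross-entropy per branch, equal weights, and hypothetical branch names (`fused`, `rgb_branch`, `depth_branch`). The helper names `bce` and `multi_branch_loss` are likewise illustrative.

```python
import math
from typing import Dict, Sequence

# A 2-D saliency map with values in [0, 1] (nested sequences keep the sketch dependency-free).
Map = Sequence[Sequence[float]]


def bce(pred: Map, target: Map, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between a predicted map and a ground-truth mask."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def multi_branch_loss(preds: Dict[str, Map], target: Map,
                      weights: Dict[str, float]) -> float:
    """Weighted sum of per-branch losses; every branch sees the same ground truth.

    ASSUMPTION: BCE and per-branch scalar weights are illustrative choices,
    not taken from the paper.
    """
    return sum(weights[name] * bce(pred, target) for name, pred in preds.items())


# Hypothetical predictions from a main (fused) decoder and two auxiliary branches.
target = [[1.0, 0.0], [0.0, 1.0]]
preds = {
    "fused": [[0.9, 0.1], [0.2, 0.8]],
    "rgb_branch": [[0.7, 0.3], [0.4, 0.6]],
    "depth_branch": [[0.6, 0.4], [0.5, 0.5]],
}
weights = {"fused": 1.0, "rgb_branch": 1.0, "depth_branch": 1.0}  # assumed equal

total = multi_branch_loss(preds, target, weights)
print(f"total multi-branch loss: {total:.4f}")
```

In a setup like this, the auxiliary terms send gradient to each encoder through its own branch as well as through the fused decoder, which is the intuition behind using extra supervision to counter under-trained encoders.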


Acknowledgment

This work is supported by the National Natural Science Foundation of China under Grant No. 62076058.

Corresponding author

Correspondence to Gang Yang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, S., Yang, G., Zhang, Y., Xu, Q., Wang, Y. (2023). MBDNet: Mitigating the “Under-Training Issue” in Dual-Encoder Model for RGB-d Salient Object Detection. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_9

  • DOI: https://doi.org/10.1007/978-981-99-4761-4_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4760-7

  • Online ISBN: 978-981-99-4761-4

  • eBook Packages: Computer Science, Computer Science (R0)
