Bi-directional Interaction and Dense Aggregation Network for RGB-D Salient Object Detection

  • Conference paper
MultiMedia Modeling (MMM 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14554))

Abstract

RGB-D salient object detection (SOD), which aims to detect the most prominent regions in images by jointly modeling RGB and depth information, has attracted much attention recently. However, existing methods explore cross-modality information from RGB images and depth maps without considering the potential coupling correlation between them. This may lead to insufficient learning from the two modalities and can even introduce conflicts due to their decoupled representations. In this paper, we therefore propose a novel framework called Bi-directional Interaction and Dense Aggregation Network (BIDANet) for RGB-D salient object detection. First, we design depth-guided enhancement (DGE) and RGB-induced style transfer (RST) modules that allow the depth map and the RGB image to learn information from each other through a bi-directional interaction network. Second, we adopt adaptive cross-modal fusion (ACF) to flexibly integrate the learned multi-modal features. Finally, we propose a dense aggregation network (DAN) to effectively aggregate cross-stage outcomes and generate accurate saliency predictions. Extensive experiments on five widely used datasets demonstrate that the proposed BIDANet achieves superior performance compared with 14 state-of-the-art methods.

Author information

Corresponding author

Correspondence to Jing Xu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yi, K., Tang, H., Bai, H., Wang, Y., Xu, J., Li, P. (2024). Bi-directional Interaction and Dense Aggregation Network for RGB-D Salient Object Detection. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14554. Springer, Cham. https://doi.org/10.1007/978-3-031-53305-1_36

  • DOI: https://doi.org/10.1007/978-3-031-53305-1_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53304-4

  • Online ISBN: 978-3-031-53305-1

  • eBook Packages: Computer Science, Computer Science (R0)
