Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12373)

Abstract

Most existing RGB-D saliency detection methods adopt symmetric two-stream architectures for learning discriminative RGB and depth representations. However, one question is often overlooked: whether RGB and depth data really need to fit into the same network. In this paper, we propose an asymmetric two-stream architecture that takes into account the inherent differences between RGB and depth data for saliency detection. First, we design a flow ladder module (FLM) for the RGB stream to fully extract global and local information while maintaining the saliency details. This is achieved by constructing four detail-transfer branches, each of which preserves detail information and receives global location information from the representations of the other vertically parallel branches in an evolutionary way. Second, we propose a novel depth attention module (DAM) to ensure that depth features with high discriminative power in location and spatial structure are effectively utilized when combined with RGB features in challenging scenes. The depth features can also discriminatively guide the RGB features via the proposed DAM to precisely locate salient objects. Extensive experiments demonstrate that our method achieves superior performance over 13 state-of-the-art RGB-D approaches on 7 datasets. Our code will be publicly available.
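
The abstract does not specify the internals of the DAM, so the following is only a generic illustration of the idea that depth features can guide RGB features through attention. It is a minimal NumPy sketch under stated assumptions, not the authors' module: the function name `depth_guided_attention`, the use of mean pooling with sigmoid gates, and the residual connection are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def depth_guided_attention(rgb_feat, depth_feat):
    """Reweight RGB features with gates derived from depth features.

    Hypothetical illustration only (not the paper's DAM).
    Both inputs are arrays of shape (C, H, W).
    """
    # Channel gate: global average pooling over the spatial axes -> shape (C,)
    channel_gate = sigmoid(depth_feat.mean(axis=(1, 2)))
    # Spatial gate: average over the channel axis -> shape (H, W)
    spatial_gate = sigmoid(depth_feat.mean(axis=0))
    # Broadcast both gates over the RGB feature map
    attended = rgb_feat * channel_gate[:, None, None] * spatial_gate[None, :, :]
    # Residual connection (an assumption) keeps the original RGB response
    return rgb_feat + attended


# Example usage with random feature maps of matching shape
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 16, 16))
depth = rng.standard_normal((8, 16, 16))
fused = depth_guided_attention(rgb, depth)
print(fused.shape)
```

In this sketch the depth stream contributes no features of its own to the output; it only decides which channels and locations of the RGB features are emphasized, which is one simple way to realize an asymmetric role for the two modalities.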

S. X. Fei and J. Liu contributed equally.

Acknowledgement

This work was supported by the Science and Technology Innovation Foundation of Dalian (2019J12GX034), the National Natural Science Foundation of China (61976035), and the Fundamental Research Funds for the Central Universities (DUT19JC58, DUT20JC42).

Author information

Corresponding author

Correspondence to Yongri Piao.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, M., Fei, S.X., Liu, J., Xu, S., Piao, Y., Lu, H. (2020). Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12373. Springer, Cham. https://doi.org/10.1007/978-3-030-58604-1_23

  • DOI: https://doi.org/10.1007/978-3-030-58604-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58603-4

  • Online ISBN: 978-3-030-58604-1

  • eBook Packages: Computer Science (R0)