Abstract
The traditional U-Net framework generates multi-level features through successive convolution and pooling operations, and then decodes the saliency cue through progressive upsampling and skip connections. The multi-level features are generated from the same input source but differ considerably from one another. In this paper, we explore the complementarity among multi-level features and decode them with a bidirectional gated recurrent unit (Bi-GRU). Since multi-level features differ in size, we first propose a scale adjustment module that organizes them into sequential data with the same number of channels and the same resolution. The core unit of the Bi-GRU, SAGRU, is then devised based on self-attention, which can effectively fuse the history and the current input. Based on the designed SAGRU, we further present a bidirectional decoding fusion module, which decodes the multi-level features in both bottom-up and top-down manners. The proposed bidirectional gated recurrent decoding network is applied to RGB-D salient object detection, which leverages the depth map as complementary information. Concretely, we put forward a depth guided residual module to enhance the color feature. Experimental results demonstrate that our method outperforms state-of-the-art methods on six popular benchmarks. Ablation studies also verify that each module plays an important role.
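The decoding idea can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each level is a 1-D feature vector, that nearest-neighbour resampling stands in for the scale adjustment module, and that a standard GRU cell (Cho et al. 2014) stands in for the self-attention based SAGRU. All names (`resample`, `GRUCell`, `bidirectional_decode`) and the random weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(size, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(size)] for _ in range(size)]


def resample(feature, length):
    """Nearest-neighbour resampling so every level has the same length.

    Assumption: a stand-in for the scale adjustment module.
    """
    n = len(feature)
    return [feature[i * n // length] for i in range(length)]


class GRUCell:
    """Standard GRU cell, used here instead of SAGRU (assumption)."""

    def __init__(self, size, rng):
        # One input matrix (W) and one hidden matrix (U) per gate.
        self.Wz, self.Uz = rand_matrix(size, rng), rand_matrix(size, rng)
        self.Wr, self.Ur = rand_matrix(size, rng), rand_matrix(size, rng)
        self.Wn, self.Un = rand_matrix(size, rng), rand_matrix(size, rng)

    def step(self, x, h):
        # Update gate z and reset gate r both look at the input and the history.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        # Candidate state built from the input and the reset-gated history.
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.Wn, x), matvec(self.Un, gated_h))]
        # z close to 1 favours the candidate, z close to 0 keeps the history.
        return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def bidirectional_decode(sequence, forward_cell, backward_cell):
    """Run one pass per direction over the levels and concatenate the results."""
    size = len(sequence[0])
    h_fwd = [0.0] * size
    for x in sequence:  # first level to last level
        h_fwd = forward_cell.step(x, h_fwd)
    h_bwd = [0.0] * size
    for x in reversed(sequence):  # last level to first level
        h_bwd = backward_cell.step(x, h_bwd)
    return h_fwd + h_bwd


rng = random.Random(0)
# Three toy "levels" with different sizes, as produced by an encoder.
levels = [[0.1] * 8, [0.5, -0.2, 0.3, 0.9], [0.7, -0.4]]
sequence = [resample(feature, 4) for feature in levels]
fused = bidirectional_decode(sequence, GRUCell(4, rng), GRUCell(4, rng))
print(len(fused))
```

Each direction sees the same levels in a different order, so the two final states summarize the sequence from opposite ends. A real model would operate on multi-channel feature maps with learned convolutional weights rather than random matrices.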






Acknowledgment
This work is supported by the Natural Science Foundation of Anhui Province (1908085MF182) and the University Natural Science Research Project of Anhui Province (KJ2019A0034).
Ethics declarations
Conflict of Interests
We declare that we have no conflict of interest.
Cite this article
Liu, Z., Wang, Y., Zhang, Z. et al. BGRDNet: RGB-D salient object detection with a bidirectional gated recurrent decoding network. Multimed Tools Appl 81, 25519–25539 (2022). https://doi.org/10.1007/s11042-022-12799-y
DOI: https://doi.org/10.1007/s11042-022-12799-y