
Dual-stream encoded fusion saliency detection based on RGB and grayscale images

Published in: Multimedia Tools and Applications

Abstract

Existing deep-learning-based saliency detection algorithms do not extract image features sufficiently, and the extracted features are fused only during decoding. As a result, the edges of the detected salient objects are unclear and their interiors are rendered non-uniformly. To address these problems, this paper proposes a dual-stream encoded fusion saliency detection method based on RGB and grayscale images. First, an interactive dual-stream encoder is constructed to extract feature information from the grayscale stream and the RGB stream. Second, a multi-level fusion strategy is used to obtain more effective multi-scale features, which are expanded and refined in the decoding stage by linear transformation combined with hybrid attention. Finally, a hybrid weighted loss function is proposed so that the model's predictions maintain high accuracy at both the pixel level and the region level. Experimental results on six public datasets show that the proposed method produces clearer salient-object edges and more uniform salient-object interiors, while having a more lightweight model size.
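As a minimal illustration of the kind of loss the abstract describes, the following PyTorch sketch combines a pixel-level binary cross-entropy term with a region-level soft-IoU term; the specific terms, the `alpha`/`beta` weights, and the `hybrid_loss` name are assumptions made for illustration, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def iou_loss(pred, target, eps=1e-6):
    # Region-level term: 1 minus the soft intersection-over-union between
    # the predicted saliency map and the ground-truth mask.
    inter = (pred * target).sum(dim=(2, 3))
    union = (pred + target - pred * target).sum(dim=(2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()

def hybrid_loss(pred, target, alpha=1.0, beta=1.0):
    # Pixel-level term: binary cross-entropy averaged over all pixels.
    bce = F.binary_cross_entropy(pred, target)
    # Weighted combination of pixel-level and region-level supervision.
    return alpha * bce + beta * iou_loss(pred, target)

# Dummy usage: a batch of two single-channel 224x224 saliency maps.
pred = torch.sigmoid(torch.randn(2, 1, 224, 224))    # model output in (0, 1)
target = (torch.rand(2, 1, 224, 224) > 0.5).float()  # binary ground truth
print(hybrid_loss(pred, target).item())
```

The cross-entropy term penalizes each pixel independently, while the IoU-style term rewards spatially coherent, uniformly filled regions, which matches the abstract's stated goal of keeping predictions accurate at both the pixel level and the region level.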




Acknowledgments

This work was supported by the Major Science and Technology Project of Henan Province [221100110500], the Science and Technology Projects of Henan Province [222102320380, 222102110194, 212102210161], and the National Key Research and Development Project [2019YFB1311000].

Author information


Corresponding author

Correspondence to Tao Xu.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest regarding the publication of this paper. The datasets generated and/or analyzed during the present study are available from the corresponding author on reasonable request.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Weishuo Zhao contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Xu, T., Zhao, W., Chai, H. et al. Dual-stream encoded fusion saliency detection based on RGB and grayscale images. Multimed Tools Appl 82, 47327–47346 (2023). https://doi.org/10.1007/s11042-023-15217-z


