Spatial attention-guided deformable fusion network for salient object detection

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Most salient object detection methods adopt a U-shaped architecture as their underlying structure. Although they achieve promising performance, they struggle to detect salient objects with non-rigid shapes and arbitrary sizes. Moreover, features are passed directly to the decoder without discrimination or active selection, which leaves prominent features underutilized. To address these issues, we propose a spatial attention-guided deformable fusion network for salient object detection, which consists of a contour enhancement module (CEM), a spatial attention-guided deformable fusion module (SADFM), and a gate module (GM). Specifically, the CEM is designed to obtain global features, with the aim of reducing the loss of high-level features during transfer. The SADFM uses spatial attention to guide deformable convolution so that global, high-level, and low-level features are aggregated adaptively. Furthermore, the GM refines the initial fused features and predicts the salient regions accurately. Experiments on five public datasets verify the effectiveness of our method.
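
The idea of letting a spatial attention map decide, per pixel, how much of each feature level to keep can be illustrated with a small NumPy sketch. This is not the authors' SADFM or GM: the deformable convolution is omitted entirely, the learned convolution that would normally produce the attention map is replaced by a simple sum of channel-wise average and max maps, and all function names (`spatial_attention`, `upsample_nearest`, `attention_gated_fusion`) are hypothetical and introduced only for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feat):
    """Map a (C, H, W) feature tensor to a (1, H, W) attention map in (0, 1).

    Assumption: a trained network would mix the pooled maps with a learned
    convolution; summing them here is only a stand-in.
    """
    avg_map = feat.mean(axis=0, keepdims=True)  # channel-wise average pooling
    max_map = feat.max(axis=0, keepdims=True)   # channel-wise max pooling
    return sigmoid(avg_map + max_map)


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) tensor by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)


def attention_gated_fusion(low, high):
    """Blend low-level and high-level features with a spatial attention gate.

    low:  (C, H, W) fine-resolution features.
    high: (C, h, w) coarse-resolution features, with H an integer multiple of h.
    The gate is computed from the upsampled high-level features and forms a
    per-pixel convex combination of the two inputs.
    """
    factor = low.shape[1] // high.shape[1]
    high_up = upsample_nearest(high, factor)
    gate = spatial_attention(high_up)           # shape (1, H, W), broadcast over C
    return gate * low + (1.0 - gate) * high_up


# Example with random features: 4 channels, 8x8 low-level and 4x4 high-level maps.
rng = np.random.default_rng(0)
low_feat = rng.standard_normal((4, 8, 8))
high_feat = rng.standard_normal((4, 4, 4))
fused = attention_gated_fusion(low_feat, high_feat)
```

The fused tensor has the same shape as the low-level input. Because the gate lies strictly between 0 and 1, each output value is a weighted mix of the two feature levels at that pixel rather than a plain sum or concatenation.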

Data availability statement

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding

This work was supported by the National Key Research and Development Program of China (Grant no. 2022ZD0160402) and the National Natural Science Foundation of China (Grant nos. 62071323 and 62176178). We gratefully acknowledge the support from Shanghai Artificial Intelligence Laboratory.

Author information

Contributions

Conceptualization, A-PY; methodology, A-PY and Y-L; software, Y-L and S-MC; validation, Y-L and S-MC; formal analysis, A-PY and Y-L; investigation, A-PY and Y-L; resources, A-PY and Y-L; data curation, Y-L and S-MC; writing—original draft preparation, Y-L; writing—review and editing, A-PY, J-LC, Z-J and Y-WP; visualization, Y-L; supervision, A-PY; project administration, A-PY; funding acquisition, A-PY, Z-J and Y-WP. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Aiping Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Communicated by B. Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, A., Liu, Y., Cheng, S. et al. Spatial attention-guided deformable fusion network for salient object detection. Multimedia Systems 29, 2563–2573 (2023). https://doi.org/10.1007/s00530-023-01152-4

  • DOI: https://doi.org/10.1007/s00530-023-01152-4
