FPANet: Feature-enhanced position attention network for semantic segmentation

Xu, Haixia; Wang, Shuailong; Huang, Yunjia; Zhou, Wei; Chen, Qi; Zhang, Dongbo

doi:10.1007/s00138-021-01246-x

Original Paper
Published: 25 September 2021

Volume 32, article number 119, (2021)
Machine Vision and Applications

Haixia Xu ORCID: orcid.org/0000-0001-8587-7044¹,
Shuailong Wang¹,
Yunjia Huang¹,
Wei Zhou¹,
Qi Chen¹ &
…
Dongbo Zhang¹

Abstract

Attention mechanisms help capture contextual information in visual tasks. This paper proposes a feature-enhanced position attention network (FPANet) for semantic segmentation, built on the FCN framework. On top of a dilated FCN, we design a feature integration module that aggregates context over local features by expanding the receptive field and using multiscale representations. This module strengthens a position attention module, which models spatial interdependencies over features. Together the two form a feature-enhanced position attention module that makes features more discriminative for semantic segmentation. Experimental comparisons show that the proposed FPANet achieves higher segmentation accuracy than other state-of-the-art models on the PASCAL VOC 2012 and Cityscapes datasets.
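To make "models spatial interdependencies over features" concrete, the following is a minimal sketch of generic position (spatial self-) attention. It is not the FPANet module, whose details are not available in this preview. It assumes that the feature map is flattened to a list of positions, that similarity is a plain dot product with no learned query, key, or value projections, and that the attended context is added back through a scalar `gamma`. The names `position_attention` and `gamma` are hypothetical and chosen for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def position_attention(features, gamma=1.0):
    """Generic position attention over a flattened feature map.

    features: list of N position vectors, each a list of C channel values.
    gamma:    hypothetical scalar weight on the attended context.
    Returns (output features, N x N attention weights).
    """
    n = len(features)
    # Pairwise similarity between positions (assumption: raw dot product,
    # no learned projections).
    scores = [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
              for fi in features]
    # Each row becomes a distribution over all positions.
    weights = [softmax(row) for row in scores]
    output = []
    for i, fi in enumerate(features):
        # Context for position i: attention-weighted sum of all positions.
        context = [sum(weights[i][j] * features[j][c] for j in range(n))
                   for c in range(len(fi))]
        # Residual connection: original feature plus scaled context.
        output.append([x + gamma * ctx for x, ctx in zip(fi, context)])
    return output, weights


# Three positions with two channels; the first two positions are identical.
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
out, attn = position_attention(feats)
```

In this toy input, the first position attends equally to itself and to the identical second position, and less to the dissimilar third one, which is the sense in which similar features at distant locations reinforce each other.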

Notes

https://doi.org/10.1109/CVPR.2016.90

Acknowledgements

This work was supported by the Science and Technology Plan Project of Hunan Province (2016TP1020) and the open fund project of the Hunan Provincial Key Laboratory of Intelligent Information Processing and Application for Hengyang Normal University (IIPA20K04), and in part by the Joint Fund for Regional Innovation and Development of NSFC (U19A2083).

Author information

Authors and Affiliations

School of Automation and Electronic Information, XiangTan University, Xiangtan, China
Haixia Xu, Shuailong Wang, Yunjia Huang, Wei Zhou, Qi Chen & Dongbo Zhang

Corresponding author

Correspondence to Haixia Xu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, H., Wang, S., Huang, Y. et al. FPANet: Feature-enhanced position attention network for semantic segmentation. Machine Vision and Applications 32, 119 (2021). https://doi.org/10.1007/s00138-021-01246-x

Received: 03 February 2021
Revised: 23 August 2021
Accepted: 27 August 2021
Published: 25 September 2021
DOI: https://doi.org/10.1007/s00138-021-01246-x
