Fully convolutional network with attention modules for semantic segmentation

Huang, Yunjia; Xu, Haixia

doi:10.1007/s11760-020-01828-8

Original Paper
Published: 02 January 2021

Volume 15, pages 1031–1039, (2021)
Signal, Image and Video Processing

Abstract

The fully convolutional network (FCN) is a powerful end-to-end model for semantic segmentation. However, because it predicts pixel by pixel, it enforces only weak consistency within a category. This paper proposes a fully convolutional network with attention modules for semantic segmentation. Building on the FCN framework, a post-processing attention module and a skip-layer attention module are introduced to strengthen the relevance among pixels. The post-processing attention module calculates the similarity among pixels to obtain global information. The skip-layer attention module combines semantic information from a deep, coarse layer with contour information from a shallow, fine layer to produce features with high resolution and strong semantic information. The network is optimized with a loss function given by the cross-entropy between the estimated probabilities and the labels. Extensive experiments demonstrate that the proposed approach outperforms DeepLab and other models in mean IoU at moderate computational complexity.
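
The three computational ideas named in the abstract (similarity among pixels, a cross-entropy loss, and mean IoU) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the module layout, so the dot-product similarity with softmax weighting below is an assumption in the style of non-local attention, and the function names `pixel_attention`, `cross_entropy`, and `mean_iou` are hypothetical.

```python
import math


def pixel_attention(features):
    """Mix each pixel's feature vector with all others, weighted by similarity.

    features: list of N feature vectors (one per pixel), each a list of C floats.
    Returns N vectors; each is a softmax-weighted average of all input vectors.
    ASSUMPTION: dot-product similarity; the abstract does not name the measure.
    """
    out = []
    for q in features:
        # Similarity between this pixel and every pixel in the image.
        scores = [sum(a * b for a, b in zip(q, k)) for k in features]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Global context: weighted sum of all pixel features.
        out.append([sum(w * k[c] for w, k in zip(weights, features))
                    for c in range(len(q))])
    return out


def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class over all pixels.

    probs: list of per-pixel class-probability lists; labels: list of class ids.
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def mean_iou(pred, target, num_classes):
    """Average per-class intersection-over-union, skipping absent classes."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # ignore classes in neither prediction nor label
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1, giving 7/12 (about 0.583). The attention sketch costs O(N²) in the number of pixels, which is why such modules are usually applied to low-resolution feature maps.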

Acknowledgements

This work was supported in part by the Joint Fund for Regional Innovation and Development of NSFC (U19A2083), by the Science and Technology Plan Project of Hunan Province (2016TP1020), and by the open fund project of the Hunan Provincial Key Laboratory of Intelligent Information Processing and Application for Hengyang Normal University (IIPA20K04).

Author information

Authors and Affiliations

School of Automation and Electronic Information, Xiangtan University, Xiangtan, China
Yunjia Huang & Haixia Xu

Corresponding author

Correspondence to Haixia Xu.

About this article

Cite this article

Huang, Y., Xu, H. Fully convolutional network with attention modules for semantic segmentation. SIViP 15, 1031–1039 (2021). https://doi.org/10.1007/s11760-020-01828-8

Received: 20 July 2020
Revised: 23 October 2020
Accepted: 23 November 2020
Published: 02 January 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s11760-020-01828-8
