Lightweight and efficient asymmetric network design for real-time semantic segmentation

Zhang, Xiu-Ling; Du, Bing-Ce; Luo, Zhao-Ci; Ma, Kai

doi:10.1007/s10489-021-02437-9

Published: 06 May 2021

Applied Intelligence, volume 52, pages 564–579 (2022)

Abstract

With the increasing demand for application scenarios such as autonomous driving and drone aerial photography, achieving the best trade-off between segmentation accuracy and inference speed while reducing the number of parameters has become a challenging problem. In this paper, a lightweight and efficient asymmetric network (LEANet) for real-time semantic segmentation is proposed to address this problem. Specifically, LEANet adopts an asymmetric encoder-decoder architecture. In the encoder, a depth-wise asymmetric bottleneck module with separation and shuffling operations (SS-DAB module) is proposed to jointly extract local and context information. In the decoder, a pyramid pooling module based on channel-wise attention (CA-PP module) is proposed to aggregate multi-scale context information and guide feature selection. Without any pre-training or post-processing, LEANet achieves 71.9% and 67.5% mean Intersection over Union (mIoU) at 77.3 and 98.6 frames per second (FPS) on the Cityscapes and CamVid test sets, respectively. These experimental results show that LEANet achieves an optimal trade-off between segmentation accuracy and inference speed with only 0.74 million parameters.
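The accuracy figures in the abstract are reported as mIoU. As a reminder of what that metric measures, the following is a minimal sketch of the standard per-class IoU average computed from a confusion matrix. It is not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the `ignore_index` parameter are assumptions made for illustration, and classes absent from both prediction and ground truth are simply skipped.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean Intersection over Union over flat lists of integer class labels.

    Illustrative helper only (hypothetical name and signature).
    """
    # confusion[t][p] counts pixels with ground-truth class t predicted as p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels marked as "void" in the ground truth
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # class c pixels predicted as another class
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # other classes predicted as c
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)

    # Average the per-class IoU values; return 0.0 if no class was observed.
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5. On a real benchmark the confusion matrix would be accumulated over every image in the test set before the per-class ratios are taken.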

Acknowledgements

This work is supported by the National Key Research and Development Program of China under Grant 2018YFB1702300. The authors thank their teachers and classmates for their guidance and help with this paper.

Author information

Authors and Affiliations

Engineering Research Center of the Ministry of Education for Intelligent Control System and Intelligent Equipment, Yanshan University Qinhuangdao, Qinhuangdao, China
Xiu-Ling Zhang, Bing-Ce Du, Zhao-Ci Luo & Kai Ma
Key Laboratory of Industrial Computer Control Engineering of Hebei Province, Yanshan University Qinhuangdao, Qinhuangdao, China
Xiu-Ling Zhang, Bing-Ce Du, Zhao-Ci Luo & Kai Ma

Corresponding author

Correspondence to Xiu-Ling Zhang.

About this article

Cite this article

Zhang, XL., Du, BC., Luo, ZC. et al. Lightweight and efficient asymmetric network design for real-time semantic segmentation. Appl Intell 52, 564–579 (2022). https://doi.org/10.1007/s10489-021-02437-9

Accepted: 15 April 2021
Issue Date: January 2022
