Abstract
With growing demand from applications such as autonomous driving and drone aerial photography, achieving the best trade-off between segmentation accuracy and inference speed while reducing the number of parameters has become a challenging problem. To address this problem, this paper proposes a lightweight and efficient asymmetric network (LEANet) for real-time semantic segmentation. LEANet adopts an asymmetric encoder-decoder architecture. In the encoder, a depth-wise asymmetric bottleneck module with separation and shuffling operations (SS-DAB module) jointly extracts local and contextual information. In the decoder, a pyramid pooling module based on channel-wise attention (CA-PP module) aggregates multi-scale context information and guides feature selection. Without any pre-training or post-processing, LEANet achieves 71.9% and 67.5% mean Intersection over Union (mIoU) at 77.3 and 98.6 frames per second (FPS) on the Cityscapes and CamVid test sets, respectively. These experimental results show that LEANet achieves an optimal trade-off between segmentation accuracy and inference speed with only 0.74 million parameters.
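For readers unfamiliar with the accuracy metric reported above, the following is a minimal sketch of how mean Intersection over Union is commonly computed from flattened per-pixel labels. It is not taken from the paper: the function names `confusion_matrix` and `mean_iou` are hypothetical, and the sketch assumes integer class labels, no ignored label, and that classes absent from both ground truth and prediction are skipped when averaging. Benchmark evaluation scripts may differ in these details.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN) over observed classes."""
    matrix = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # class-c pixels that were missed
        fp = sum(row[c] for row in matrix) - tp   # pixels wrongly labeled as c
        union = tp + fp + fn
        if union == 0:
            continue  # assumption: skip classes that never appear
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Six pixels, three classes; per-class IoUs are 1/3, 2/3 and 1/2.
    truth = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    print(mean_iou(truth, pred, num_classes=3))
```

In the toy example, the three per-class IoUs average to 0.5, which corresponds to an mIoU of 50% in the notation used in the abstract.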
Acknowledgements
This work is supported by the National Key Research and Development Program of China under Grant 2018YFB1702300. The author thanks the teachers and classmates for their guidance and help with this paper.
Cite this article
Zhang, XL., Du, BC., Luo, ZC. et al. Lightweight and efficient asymmetric network design for real-time semantic segmentation. Appl Intell 52, 564–579 (2022). https://doi.org/10.1007/s10489-021-02437-9
DOI: https://doi.org/10.1007/s10489-021-02437-9