ELANet: Effective Lightweight Attention-Guided Network for Real-Time Semantic Segmentation

Yi, Qingming; Dai, Guoshuai; Shi, Min; Huang, Zunkai; Luo, Aiwen

doi:10.1007/s11063-023-11145-z

Published: 06 January 2023

Neural Processing Letters, volume 55, pages 6425–6442 (2023)

Qingming Yi¹,³, Guoshuai Dai¹, Min Shi¹, Zunkai Huang² & Aiwen Luo¹

Abstract

Deep neural networks have greatly advanced semantic segmentation. However, most existing networks pursue higher accuracy at the cost of heavy computation and large parameter counts, which resource-constrained edge devices cannot afford. To achieve an appropriate trade-off between computing efficiency and segmentation accuracy, we propose an effective lightweight attention-guided network (ELANet) for real-time semantic segmentation based on an asymmetrical encoder–decoder framework. In the encoding phase, we combine atrous convolution and depth-wise convolution to design two types of effective context guidance blocks that learn contextual semantic information. In the decoder, a refined feature fusion module with a dual attention-guided fusion (DAF) unit exploits features from different levels. Without any pretraining, we evaluate the multi-attention ELANet through extensive experiments on the Cityscapes dataset at an input resolution of 512 × 1024, where it achieves 75.4% mIoU and an inference speed of 83 FPS with only 0.76 M parameters and 10.34 GFLOPs on a single 3090 GPU. The code is publicly available at https://github.com/DGS666/ELANet.
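
To illustrate why pairing depth-wise convolution with atrous (dilated) convolution suits a lightweight encoder, the following is a minimal arithmetic sketch. It is not the authors' context guidance block: the channel width of 64 and the helper names `conv_params` and `effective_kernel` are hypothetical and chosen only for illustration. It counts the weights of a standard 3 × 3 convolution against a depth-wise plus point-wise pair, and shows how dilation widens the spatial extent of a kernel without adding weights.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Number of learnable weights in a 2-D convolution with a k x k kernel."""
    assert c_in % groups == 0 and c_out % groups == 0
    n = (c_in // groups) * c_out * k * k
    return n + (c_out if bias else 0)


def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


channels = 64  # hypothetical width, not taken from the paper

# Standard 3 x 3 convolution: every output channel sees every input channel.
standard = conv_params(channels, channels, 3)

# Depth-wise 3 x 3 convolution: one filter per channel (groups == channels),
# followed by a 1 x 1 point-wise convolution that mixes channels.
depthwise = conv_params(channels, channels, 3, groups=channels)
pointwise = conv_params(channels, channels, 1)
separable = depthwise + pointwise

print("standard 3x3 weights:      ", standard)
print("depth-wise + point-wise:   ", separable)

# Dilation enlarges the covered area; the weight count stays the same.
for rate in (1, 2, 4, 8):
    print("dilation", rate, "-> covers", effective_kernel(3, rate), "pixels per side")
```

Under these assumptions the separable pair needs roughly an eighth of the weights of the standard convolution, while a dilated 3 × 3 kernel covers a wider context at no extra parameter cost.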

Acknowledgements

This work was supported by Guangzhou Leading Talents in Innovation and Entrepreneurship under Grant No. 2019019, the National Natural Science Foundation of China under Grant No. 62002134, Guangdong Basic and Applied Basic Research Foundation under Grant No. 2020A1515110645, the Key Laboratory of New Semiconductors and Devices of Guangdong Higher Education Institutes under Grant No. 2021KSY001, the Fundamental Research Funds for the Central Universities at Jinan University under Grant No. 21620353, and the JNU-Techtotop Joint Foundation of Postgraduates Training Base under Grant No. 82621176.

Author information

Qingming Yi and Guoshuai Dai have contributed equally to this work.

Authors and Affiliations

Department of Electronic Engineering, Jinan University, Guangzhou, 510632, China
Qingming Yi, Guoshuai Dai, Min Shi & Aiwen Luo
Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210, China
Zunkai Huang
Taidou Microelectronic Science and Technology Co., Ltd., Guangzhou, 510663, China
Qingming Yi

Corresponding author

Correspondence to Aiwen Luo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yi, Q., Dai, G., Shi, M. et al. ELANet: Effective Lightweight Attention-Guided Network for Real-Time Semantic Segmentation. Neural Process Lett 55, 6425–6442 (2023). https://doi.org/10.1007/s11063-023-11145-z

Accepted: 01 January 2023
Published: 06 January 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11063-023-11145-z
