Abstract
Semantic segmentation is a dense prediction task that places high demands on both prediction accuracy and inference speed on mobile terminals. To reduce the computational burden of the segmentation network and to supplement the spatial information missing from high-level features, an efficient feature reuse network (EFRNet) is proposed in two steps. First, a Multi-scale Bottleneck (MB) module is designed to extract multi-scale features, and a lightweight backbone is built on the MB module. Then, features of different depths are integrated through an efficient feature reuse model. Experiments on the Cityscapes dataset demonstrate that the proposed EFRNet achieves an impressive balance between speed and precision. Specifically, without any pre-trained model or post-processing, it achieves 75.58% mean IoU on the Cityscapes test set at a speed of 118 FPS on a single RTX 2080Ti GPU.
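For readers unfamiliar with the accuracy metric quoted above, mean IoU is commonly computed as the per-class intersection over union, TP / (TP + FP + FN), averaged over the classes that occur. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `mean_iou`, the flat label lists, and the `ignore_index=255` default are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over flat lists of class labels.

    Assumption: pixels whose ground-truth label equals ignore_index are
    excluded from the evaluation, and predictions lie in [0, num_classes).
    """
    # confusion[t][p] counts pixels of true class t predicted as class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and target
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and one ignored pixel
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU: {score:.4f}")
```

In the toy example the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18. Benchmark servers may accumulate the confusion matrix over the whole test set before averaging, which this sketch also does if all pixels are passed in one call.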
Change history
06 November 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11063-022-10957-9
About this article
Cite this article
Li, Y., Li, M., Li, Z. et al. EFRNet: Efficient Feature Reuse Network for Real-time Semantic Segmentation. Neural Process Lett 54, 4647–4659 (2022). https://doi.org/10.1007/s11063-022-10740-w
DOI: https://doi.org/10.1007/s11063-022-10740-w