FBRNet: a feature fusion and border refinement network for real-time semantic segmentation

  • Original Paper
  • Pattern Analysis and Applications

A Correction to this article was published on 25 April 2024

This article has been updated

Abstract

Existing semantic segmentation networks achieve high accuracy at the cost of heavy computation. Practical applications, however, require both high segmentation accuracy and high inference speed. To balance accuracy against speed, we propose a new real-time semantic segmentation network, FBRNet. To extract multi-scale semantic information more quickly, we propose a lightweight, attention-based reinforced atrous spatial pyramid pooling module (arASPP), which extracts richer and more advanced features with less computation than the original ASPP. To close the semantic gap between high- and low-level features, we propose a new feature fusion module (CSFM). It introduces a shuffling mechanism to enhance robustness and builds a contextual information enhancement module and a detail information enhancement module in parallel to facilitate information exchange between high- and low-level features, which improves the model's feature representation. Finally, we design an edge feature reinforcement module (LABRM) that brings in high-level features and fuses Laplace convolution with a spatial attention mechanism to suppress the noise in low-level features and improve the model's segmentation of target boundaries. On the Cityscapes validation and test sets, FBRNet achieves 77.63% and 75.3% mIoU, respectively, at 101.9 FPS on a single Tesla T4 GPU. It also achieves 72.4% mIoU at 89.8 FPS on the CamVid dataset and 55.2% mIoU at 100.8 FPS on the BDD100K dataset, a better balance of accuracy and speed than existing networks. The code is available at https://github.com/little5570/FBRNet.
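The abstract relies on two standard building blocks without defining them here: Laplace convolution (used inside LABRM) and the mIoU metric. The sketch below is a minimal, generic illustration of both in plain Python, not the FBRNet implementation. The 4-neighbour 3x3 kernel, the function names, and the toy inputs are assumptions made for illustration. The paper's module may use a different kernel, and benchmark mIoU is normally accumulated over a whole dataset with ignore labels excluded, which this sketch does not handle.

```python
# Illustrative sketch only; not taken from the FBRNet code base.

# Assumed 4-neighbour discrete Laplacian kernel (LABRM may use another variant).
LAPLACIAN_3X3 = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian_response(image):
    """Apply the 3x3 Laplacian to a 2D list of numbers (valid region, no padding)."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += LAPLACIAN_3X3[dr + 1][dc + 1] * image[r + dr][c + dc]
            row.append(acc)
        out.append(row)
    return out


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flat sequences of integer class labels of equal length.
    """
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Vertical step edge: flat areas give 0, the boundary gives a non-zero response.
    step = [[0, 0, 1, 1] for _ in range(4)]
    print(laplacian_response(step))
    # Flattened label maps for a two-class toy example.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

The Laplacian responds only where intensity changes, which is why it is a natural choice for highlighting object boundaries. The per-class IoU is the intersection of predicted and ground-truth pixels divided by their union, averaged over classes.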

Data availability

The Cityscapes dataset is openly available at https://www.cityscapes-dataset.com. The CamVid dataset is openly available at http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid. The BDD100K dataset is openly available at https://www.bdd100k.com. Our results on the Cityscapes test set can be seen at https://www.cityscapes-dataset.com/method-details/?submissionID=16865&back=mysubmissions.

Funding

This work was supported by the National Natural Science Foundation of China (No. 12071126) and the Scientific Research Fund of Hunan Provincial Education Department, China (23A0081).

Author information

Contributions

S.Q. contributed to conceptualization and writing (review and editing). Z.W. contributed to conceptualization, methodology, and writing (original draft). J.W. was involved in supervision and investigation. Y.F. was involved in supervision and investigation.

Corresponding author

Correspondence to ShaoJun Qu.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethical approval

Written informed consent for publication of this paper was obtained from all authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qu, S., Wang, Z., Wu, J. et al. FBRNet: a feature fusion and border refinement network for real-time semantic segmentation. Pattern Anal Applic 27, 2 (2024). https://doi.org/10.1007/s10044-023-01207-2

  • DOI: https://doi.org/10.1007/s10044-023-01207-2
