
End-to-end lane detection with convolution and transformer

Published in Multimedia Tools and Applications

Abstract

In this paper, an end-to-end lane detection method based on polynomial regression is proposed, combining CNNs and a Transformer. The Transformer's self-attention mechanism models nonlocal interactions to capture global context. An effective Global-Local training strategy is then presented to extract multi-scale features that capture richer lane information on both structure and context, especially when lane marking points are remote. The resulting multi-scale feature maps are fused under guidance from different scales. Finally, the proposed method is validated on the TuSimple benchmark, where it achieves 96.33% accuracy and runs 11.1x faster than the popular Line-CNN model in compute time.
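The self-attention step described above can be sketched in plain Python. This is a minimal illustration only, not the authors' implementation: the feature values are made up, and the learned query/key/value projections of a real Transformer are omitted so that the sketch isolates the nonlocal mixing that yields global context.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    Here Q = K = V = seq (no learned projections), so every output position
    is a softmax-weighted sum over ALL input positions -- the nonlocal
    interaction that lets a Transformer relate distant lane marking points.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in seq]
        weights = softmax(scores)
        # Weighted sum of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, seq))
                    for j in range(d)])
    return out

# Four positions along a lane with 2-D features (values illustrative only).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
mixed = self_attention(features)
```

Because every output vector attends to every input position, distant lane points influence each other in a single layer, whereas a convolution would need many stacked layers to reach the same receptive field.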



References

  1. Badue C, Guidolini R, Carneiro RV et al (2021) Self-driving cars: a survey. Expert Syst Appl 165:113816

  2. Behrendt K, Soussan R (2019) Unsupervised labeled lane marker dataset generation using maps. In: International conference on computer vision (ICCV), pp 832–839

  3. Borenstein J (1995) Control and kinematic design of multi-degree-of-freedom mobile robots with compliant linkage. IEEE Trans Robot Autom 11(1):21–35. https://doi.org/10.1109/70.345935

  4. Caltagirone L, Bellone M, Svensson L, Wahde M (2019) Lidar–camera fusion for road detection using fully convolutional neural networks. Robot Auton Syst 111:125–131. https://doi.org/10.1016/j.robot.2018.11.002

  5. Campion G, Bastin G, Dandrea-Novel B (1996) Structural properties and classification of kinematic models of wheeled mobile robots. IEEE Trans Robot Autom 12(1):47–62. https://doi.org/10.1109/70.481750

  6. Carion N, Massa F, Synnaeve G (2020) End-to-end object detection with transformers. In: European conference on computer vision. Springer, pp 213–229

  7. Chen Z, Zhang J, Tao D (2019) Progressive lidar adaptation for road detection. IEEE/CAA J Autom Sinica 6(3):693–702. https://doi.org/10.1109/JAS.2019.1911459

  8. Dai Z, Liu H, Le QV, Tan M (2021) CoAtNet: marrying convolution and attention for all data sizes. arXiv:2106.04803

  9. Neven D, De Brabandere B, Georgoulis S et al (2018) Towards end-to-end lane detection: an instance segmentation approach. In: 2018 IEEE intelligent vehicles symposium (IV). IEEE, pp 286–291

  10. Ghafoorian M, Nugteren C, Baka N et al (2018) EL-GAN: embedding loss driven generative adversarial networks for lane detection. In: Proceedings of the european conference on computer vision (ECCV) workshops, pp 256–272. https://doi.org/10.1007/978-3-030-11009-3_15

  11. Hou YN, Ma Z, Liu CX, Loy CC (2019) Learning lightweight lane detection cnns by self attention distillation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 1013–1021

  12. Huval B, Wang T, Tandon S et al (2015) An empirical evaluation of deep learning on highway driving. arXiv:1504.01716

  13. Kim J, Lee M (2014) Robust lane detection based on convolutional neural network and random sample consensus. In: International conference on neural information processing. Springer, pp 454–461

  14. Kim J, Park C (2017) End-to-end ego lane estimation based on sequential transfer learning for self-driving cars. In: 2017 IEEE Conference on computer vision and pattern recognition workshops (CVPRW). IEEE, pp 1194–1202. https://doi.org/10.1109/CVPRW.2017.158

  15. Ko Y, Lee Y, Azam S et al (2020) Key points estimation and point instance segmentation approach for lane detection. IEEE Trans Intell Transport Syst:1–10. https://doi.org/10.1109/TITS.2021.3088488

  16. Kreucher C, Lakshmanan S (1999) LANA: a lane extraction algorithm that uses frequency domain features. IEEE Trans Robot Autom 15(2):343–350. https://doi.org/10.1109/70.760356

  17. Li QQ, Chen L, Li M et al (2014) A sensor-fusion drivable-region and lane-detection system for autonomous vehicle navigation in challenging road scenarios. IEEE Trans Vehicular Technol 63(2):540–555. https://doi.org/10.1109/TVT.2013.2281199

  18. Li X, Li J, Hu XL, Yang J (2020) Line-CNN: end-to-end traffic line detection with line proposal unit. IEEE Trans Intell Transport Syst 21(1):248–258. https://doi.org/10.1109/TITS.2019.2890870

  19. Liu RJ, Yuan ZJ, Liu T, Xiong ZL (2021) End-to-end lane shape prediction with transformers. In: Proceedings of the IEEE/CVF Winter conference on applications of computer vision, pp 3694–3702. https://doi.org/10.1109/WACV48630.2021.00374

  20. Low CY, Zamzuri H, Mazlan SA (2014) Simple robust road lane detection algorithm. In: 2014 5th International conference on intelligent and advanced systems (ICIAS). IEEE. https://doi.org/10.1109/ICIAS.2014.6869550

  21. Lv H, Liu C, Zhao X et al (2019) Lane marking regression from confidence area detection to field inference. IEEE Trans Intell Vehicles 6(1):47–56. https://doi.org/10.1109/TIV.2020.3009366

  22. Pan XG, Shi JP, Luo P et al (2018) Spatial as deep: spatial CNN for traffic scene understanding. In: Thirty-second AAAI conference on artificial intelligence, pp 7276–7283

  23. Philion J (2019) Fastdraw: addressing the long tail of lane detection by adapting a sequential prediction network. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11582–11591. https://doi.org/10.1109/CVPR.2019.01185

  24. Jiang R, Klette R, Vaudrey T, Wang S (2011) Lane detection and tracking using a new lane model and distance transform. Mach Vis Appl 22(4):721–737. https://doi.org/10.1007/s00138-010-0307-7

  25. Sandler M, Howard A, Zhu ML et al (2018) Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4510–4520. https://doi.org/10.1109/CVPR.2018.00474

  26. Satzoda RK, Sathyanarayana S, Srikanthan T, Sathyanarayana S (2010) Hierarchical additive hough transform for lane detection. IEEE Embedded Syst Lett 2(2):23–26. https://doi.org/10.1109/LES.2010.2051412

  27. Tabelini L, Berriel R, Paixao TM et al (2020) Polylanenet: lane estimation via deep polynomial regression. In: 2020 25th International conference on pattern recognition (ICPR). IEEE, pp 6150–6156. https://doi.org/10.1109/ICPR48806.2021.9412265

  28. Tan M, Le QV (2021) EfficientNetV2: smaller models and faster training. arXiv:2104.00298

  29. TuSimple benchmark (2017). https://github.com/TuSimple/tusimple-benchmark

  30. Vaswani A, Shazeer N, Parmar N (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008

  31. Wang Q, Han T, Qin Z et al (2020) Multitask attention network for lane detection and fitting. IEEE Trans Neural Netw Learn Syst:1–13. https://doi.org/10.1109/TNNLS.2020.3039675

  32. Wang GJ, Wu J, He R, Tian B (2021) Speed and accuracy tradeoff for LiDAR data based road boundary detection. IEEE/CAA J Autom Sinica 8(6):1210–1220. https://doi.org/10.1109/JAS.2020.1003414

  33. Xiao DG, Yang XF, Li JF, Islam M (2020) Attention deep neural network for lane marking detection. Knowl-Based Syst 194:105584. https://doi.org/10.1016/j.knosys.2020.105584

  34. Yim YU, Oh S-Y (2003) Three-feature based automatic lane detection algorithm (TFALDA) for autonomous driving. IEEE Trans Intell Transport Syst 4(4):219–224. https://doi.org/10.1109/TITS.2003.821339

  35. Yu CQ, Gao CX, Wang JB et al (2021) BiSeNet V2: bilateral network with guided aggregation for real-time semantic segmentation. Int J Comput Vis 129(11):3051–3068. https://doi.org/10.1007/s11263-021-01515-2

  36. Yu F, Xian WQ, Chen YY et al (2018) BDD100K: a diverse driving video database with scalable annotation tooling. arXiv:1805.04687

  37. Zhang YC, Lu ZQ, Ma DD et al (2021) Ripple-GAN: lane line detection with ripple lane line detection network and Wasserstein GAN. IEEE Trans Intell Transport Syst 22(3):1532–1542. https://doi.org/10.1109/TITS.2020.2971728

Funding

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61473115), the Natural Science Foundation of Henan Province (Grant No. 202300410149), the Key Scientific Research Projects of Universities in Henan Province (Grant Nos. 20A120008 and 22A413002), the Scientific and Technological Project of Henan Province (Grant No. 212102210153), and the Aeronautical Science Foundation of China (Grant No. 20200051042003).

Author information

Contributions

Zekun Ge: Writing – original draft, Methodology, Software; Chao Ma: Conceptualization, Investigation, Writing – review & editing; Zhumu Fu: Supervision, Formal Analysis; Shuzhong Song: Supervision; Pengju Si: Software, Validation.

Corresponding author

Correspondence to Zhumu Fu.

Ethics declarations

Competing interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Chao Ma, Zhumu Fu, Shuzhong Song and Pengju Si contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ge, Z., Ma, C., Fu, Z. et al. End-to-end lane detection with convolution and transformer. Multimed Tools Appl 82, 29607–29627 (2023). https://doi.org/10.1007/s11042-023-14622-8
