
High-order deep infomax-guided deformable transformer network for efficient lane detection

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

With the development of deep learning, lane detection models based on deep convolutional neural networks have been widely used in autonomous driving systems and advanced driver assistance systems. However, in harsh and complex environments, the performance of detection models degrades greatly, owing to the difficulty of merging long-range lane points with global context and to the exclusion of important higher-order information. To address these issues, we propose a new learning model, called the Deformable Transformer with high-order Deep Infomax (DTHDI) model, to better capture lane features. Specifically, we propose a Deformable Transformer neural network based on segmentation techniques for high-accuracy detection, in which local and global contextual information is seamlessly fused and more information about the diversity of lane-line shape features is retained, yielding rich lane features. Meanwhile, we introduce a mutual information maximization approach that mines higher-order correlations among the global shape, local shape, and position of lane lines, learning more discriminative lane-line representations. In addition, we employ a row classification approach to further reduce computational complexity for robust lane-line detection. Our model is evaluated on two popular lane detection datasets, and the empirical results show that the proposed DTHDI model outperforms state-of-the-art methods.
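The row-classification idea mentioned in the abstract can be illustrated with a minimal sketch: instead of labeling every pixel, the model predicts, for each image row, which of a small set of column anchors a lane passes through, plus a background class for rows containing no lane, which is what keeps inference cheap. The code below is our own illustration of the decoding step under assumed tensor shapes, not the authors' implementation; `decode_lane_rows` and `col_anchors` are hypothetical names.

```python
import numpy as np

def decode_lane_rows(logits, col_anchors):
    """Decode per-row lane x-coordinates from row-classification logits.

    logits: (num_rows, num_cols + 1) array of raw scores; the extra
            last column is a "no lane in this row" background class.
    col_anchors: (num_cols,) array of candidate x positions.
    Returns a (num_rows,) array with the expected x position per row,
    or NaN where the background class wins.
    """
    num_cols = len(col_anchors)
    # Softmax over the location classes only (background excluded),
    # shifted by the row maximum for numerical stability.
    loc = logits[:, :num_cols]
    loc = loc - loc.max(axis=1, keepdims=True)
    probs = np.exp(loc)
    probs /= probs.sum(axis=1, keepdims=True)
    # Soft-argmax: expected x position under the row's distribution.
    xs = probs @ col_anchors
    # Rows whose highest raw logit is the background class carry no lane.
    background = logits.argmax(axis=1) == num_cols
    xs[background] = np.nan
    return xs
```

The soft-argmax gives sub-anchor precision from a coarse column grid, which is one common way such row-wise formulations trade resolution for speed.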



Data availability

All datasets used in this work are publicly available and can be downloaded from their official websites.


Funding

The work described in this paper was supported by the Open Foundation of the State Key Laboratory for Novel Software Technology at Nanjing University, P. R. China (No. KFKT2021B12). It was also supported in part by the Future Network Scientific Research Fund Project (FNSRFP-2021-YB-54), the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (17KJB520028), Tongda College of Nanjing University of Posts and Telecommunications (XK203XZ21001), the Major Science and Technology Project of Jilin Province, China (20210301030GX), and the Key Research and Development Program of Hubei Province, China (2021BAA179 and 2022BAA079). The numerical calculations in this paper were performed on the supercomputing system at the Supercomputing Center of Wuhan University.

Author information

Authors and Affiliations

Authors

Contributions

RG: conceptualization, methodology, software. SH: data curation, writing (original draft preparation). LY: supervision, writing. LZ: supervision, writing (review and editing). HR: review, editing. YY: supervision, writing (review and editing). ZY: review, editing.

Corresponding author

Correspondence to Li Zhang.

Ethics declarations

Conflicts of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gao, R., Hu, S., Yan, L. et al. High-order deep infomax-guided deformable transformer network for efficient lane detection. SIViP 17, 3045–3052 (2023). https://doi.org/10.1007/s11760-023-02525-y

