L2T-BEV: Local Lane Topology Prediction from Onboard Surround-View Cameras in Bird’s Eye View Perspective

Ye, Shanding; Li, Tao; Li, Ruihang; Pan, Zhijie

doi:10.1007/978-981-99-8435-0_29

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14427)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

High-definition maps (HD maps) serve as the foundation for autonomous vehicles, encompassing various driving scenario elements, among which lane topology is critically important for vehicle perception and planning. Existing work on lane topology extraction relies predominantly on manual processing, while automated methods are limited to road topology extraction. Recently, road representation learning from surround-view images in bird’s-eye view (BEV) has emerged, which directly predicts localized vectorized maps around the vehicle. However, these maps cannot represent the topological relationships between lanes. As a solution, we propose a novel method, L2T-BEV, which learns local lane topology maps in BEV. The method uses EfficientNet to extract features from surround-view images and then transforms these features into BEV space through Inverse Perspective Mapping (IPM). Because the IPM transformation often suffers from distortion, we add a learnable residual mapping function to the features after the transformation. Finally, we employ a transformer network with learnable positional embedding to process the fused images, generating higher-precision lane topology. We validated our method on the nuScenes dataset, and the experimental results demonstrate its feasibility and strong performance.
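The IPM step and the residual correction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single forward-facing pinhole camera with zero pitch and roll, a flat ground plane, and nearest-neighbour sampling. The names `ground_to_pixel`, `ipm_warp` and `add_residual` and all parameters are hypothetical, and `residual_fn` stands in for the learnable residual mapping, which in the paper would be a trained network.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[float]]


def ground_to_pixel(x: float, y: float, fx: float, fy: float,
                    cx: float, cy: float, cam_height: float
                    ) -> Optional[Tuple[float, float]]:
    """Project a ground point (x forward, y left, z = 0) to pixel (u, v).

    Assumes a pinhole camera looking straight ahead, mounted
    cam_height above a flat ground plane (illustrative assumption).
    """
    if x <= 0.0:
        return None  # point is behind the camera
    u = cx + fx * (-y) / x        # camera X axis points right
    v = cy + fy * cam_height / x  # ground lies cam_height below the camera
    return u, v


def ipm_warp(feature: Grid, bev_xs: List[float], bev_ys: List[float],
             fx: float, fy: float, cx: float, cy: float,
             cam_height: float, fill: float = 0.0) -> Grid:
    """Fill each BEV cell with the image feature its ground point projects to."""
    rows, cols = len(feature), len(feature[0])
    bev: Grid = []
    for x in bev_xs:
        bev_row = []
        for y in bev_ys:
            value = fill
            uv = ground_to_pixel(x, y, fx, fy, cx, cy, cam_height)
            if uv is not None:
                col, row = round(uv[0]), round(uv[1])  # nearest neighbour
                if 0 <= row < rows and 0 <= col < cols:
                    value = feature[row][col]
            bev_row.append(value)
        bev.append(bev_row)
    return bev


def add_residual(bev: Grid, residual_fn: Callable[[float], float]) -> Grid:
    """Return bev + residual_fn(bev), applied cell by cell.

    residual_fn is a placeholder for a learned correction.
    """
    return [[v + residual_fn(v) for v in row] for row in bev]
```

With a residual that returns zero everywhere, `add_residual` leaves the IPM output unchanged, which is a common starting point for a learned correction term. A practical system would typically warp multi-channel feature maps from several cameras with bilinear sampling rather than a single scalar grid.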

This work was supported by the Key Research and Development Program of Zhejiang Province in China (No. 2023C01237) and the Natural Science Foundation of China (No. U22A202101).

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Shanding Ye, Tao Li, Ruihang Li & Zhijie Pan

Corresponding author

Correspondence to Zhijie Pan.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

Cite this paper

Ye, S., Li, T., Li, R., Pan, Z. (2024). L2T-BEV: Local Lane Topology Prediction from Onboard Surround-View Cameras in Bird’s Eye View Perspective. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14427. Springer, Singapore. https://doi.org/10.1007/978-981-99-8435-0_29

DOI: https://doi.org/10.1007/978-981-99-8435-0_29
Published: 24 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8434-3
Online ISBN: 978-981-99-8435-0
eBook Packages: Computer Science, Computer Science (R0)
