Abstract
Despite the remarkable performance of real-time planners in autonomous driving, the growing exploration of Large Language Models (LLMs) has opened avenues for enhancing the interpretability and controllability of motion planning. Nevertheless, LLM-based planners still face significant challenges, including high resource consumption and long inference times, which pose substantial obstacles to practical deployment. In light of these challenges, we introduce AsyncDriver, a new asynchronous LLM-enhanced closed-loop framework in which scene-associated instruction features produced by an LLM guide real-time planners toward precise and controllable trajectory prediction. On one hand, our method highlights the ability of LLMs to comprehend and reason over vectorized scene data and a series of routing instructions, demonstrating their effective assistance to real-time planners. On the other hand, the proposed framework decouples the inference processes of the LLM and the real-time planner. By capitalizing on the asynchronous nature of their inference frequencies, our approach reduces the computational cost introduced by the LLM while maintaining comparable performance. Experiments show that our approach achieves superior closed-loop evaluation performance on nuPlan's challenging scenarios. The code and dataset are available at https://github.com/memberRE/AsyncDriver.
Y. Chen, Z. Ding, and Z. Wang contributed equally.
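The abstract's central mechanism is the frequency decoupling between the slow LLM and the fast real-time planner. The following Python sketch illustrates that control flow only; it is not the authors' implementation, and all class, method, and parameter names (e.g. `llm_interval`, `encode`, `predict`) are hypothetical. The LLM's scene-associated guidance features are cached and refreshed every few planner steps, while the planner runs at every step.

```python
# Conceptual sketch (not the AsyncDriver code) of asynchronous
# LLM-guided planning: the LLM refreshes guidance features every
# `llm_interval` planner steps; the planner runs at full frequency.

class AsyncLLMEnhancedPlanner:
    def __init__(self, llm, realtime_planner, llm_interval=4):
        self.llm = llm                    # slow, high-level reasoner (hypothetical interface)
        self.planner = realtime_planner   # fast trajectory predictor (hypothetical interface)
        self.llm_interval = llm_interval  # planner steps per LLM refresh
        self.guidance = None              # cached instruction features
        self.step = 0

    def plan(self, vectorized_scene, routing_instructions):
        # Run the expensive LLM only every few steps; intermediate
        # steps reuse the cached features, amortizing the LLM's
        # inference cost over the closed-loop planning frequency.
        if self.guidance is None or self.step % self.llm_interval == 0:
            self.guidance = self.llm.encode(vectorized_scene,
                                            routing_instructions)
        self.step += 1
        # The real-time planner runs at every step, conditioned on
        # the most recent guidance features.
        return self.planner.predict(vectorized_scene, self.guidance)
```

Under this scheme the planner's per-step latency stays constant, while the LLM's cost is spread over `llm_interval` steps; the actual refresh interval and the form of the guidance features are design choices of the framework.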
Acknowledgements
This research is supported in part by the National Science and Technology Major Project (No. 2022ZD0115502), the National Natural Science Foundation of China (No. 62122010, U23B2010), Zhejiang Provincial Natural Science Foundation of China (No. LDT23F02022F02), Beijing Natural Science Foundation (No. L231011), Beihang World TOP University Cooperation Program, and Lenovo Research.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Y., Ding, Z., Wang, Z., Wang, Y., Zhang, L., Liu, S. (2025). Asynchronous Large Language Model Enhanced Planner for Autonomous Driving. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15094. Springer, Cham. https://doi.org/10.1007/978-3-031-72764-1_2
DOI: https://doi.org/10.1007/978-3-031-72764-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72763-4
Online ISBN: 978-3-031-72764-1
eBook Packages: Computer Science; Computer Science (R0)