Abstract
Despite the remarkable performance of real-time planners in autonomous driving, the growing exploration of Large Language Models (LLMs) has opened avenues for enhancing the interpretability and controllability of motion planning. Nevertheless, LLM-based planners still face significant challenges, including high resource consumption and long inference times, which pose substantial obstacles to practical deployment. In light of these challenges, we introduce AsyncDriver, a new asynchronous LLM-enhanced closed-loop framework in which scene-associated instruction features produced by an LLM guide real-time planners toward precise and controllable trajectory prediction. On one hand, our method highlights the ability of LLMs to comprehend and reason over vectorized scene data and a series of routing instructions, demonstrating their effective assistance to real-time planners. On the other hand, the proposed framework decouples the inference processes of the LLM and the real-time planner. By capitalizing on the asynchronous nature of their inference frequencies, our approach reduces the computational cost introduced by the LLM while maintaining comparable performance. Experiments show that our approach achieves superior closed-loop evaluation performance on nuPlan's challenging scenarios. The code and dataset are available at https://github.com/memberRE/AsyncDriver.
Y. Chen, Z. Ding, and Z. Wang contributed equally.
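The abstract's central mechanism is the frequency decoupling between the slow LLM and the fast real-time planner. The following Python sketch illustrates that control flow only; it is not the authors' implementation, and all class, method, and parameter names (e.g. `llm_interval`, `encode`, `predict`) are hypothetical. The LLM's scene-associated guidance features are cached and refreshed every few planner steps, while the planner runs at every step.

```python
# Conceptual sketch (not the AsyncDriver code) of asynchronous
# LLM-guided planning: the LLM refreshes guidance features every
# `llm_interval` planner steps; the planner runs at full frequency.

class AsyncLLMEnhancedPlanner:
    def __init__(self, llm, realtime_planner, llm_interval=4):
        self.llm = llm                    # slow, high-level reasoner (hypothetical interface)
        self.planner = realtime_planner   # fast trajectory predictor (hypothetical interface)
        self.llm_interval = llm_interval  # planner steps per LLM refresh
        self.guidance = None              # cached instruction features
        self.step = 0

    def plan(self, vectorized_scene, routing_instructions):
        # Run the expensive LLM only every few steps; intermediate
        # steps reuse the cached features, amortizing the LLM's
        # inference cost over the closed-loop planning frequency.
        if self.guidance is None or self.step % self.llm_interval == 0:
            self.guidance = self.llm.encode(vectorized_scene,
                                            routing_instructions)
        self.step += 1
        # The real-time planner runs at every step, conditioned on
        # the most recent guidance features.
        return self.planner.predict(vectorized_scene, self.guidance)
```

Under this scheme the planner's per-step latency stays constant, while the LLM's cost is spread over `llm_interval` steps; the actual refresh interval and the form of the guidance features are design choices of the framework.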
Acknowledgements
This research is supported in part by the National Science and Technology Major Project (No. 2022ZD0115502), the National Natural Science Foundation of China (No. 62122010, U23B2010), Zhejiang Provincial Natural Science Foundation of China (No. LDT23F02022F02), Beijing Natural Science Foundation (No. L231011), Beihang World TOP University Cooperation Program, and Lenovo Research.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Y., Ding, Z., Wang, Z., Wang, Y., Zhang, L., Liu, S. (2025). Asynchronous Large Language Model Enhanced Planner for Autonomous Driving. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15094. Springer, Cham. https://doi.org/10.1007/978-3-031-72764-1_2
DOI: https://doi.org/10.1007/978-3-031-72764-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72763-4
Online ISBN: 978-3-031-72764-1
eBook Packages: Computer Science; Computer Science (R0)