Abstract
Path planning is essential for an agent to accomplish tasks autonomously. Traditional path planning methods fail to complete tasks constrained by temporal properties such as conditional reachability, safety, and liveness. This work presents an integrated approach that combines reinforcement learning (RL) with multi-objective optimization to solve path planning problems under temporal logic constraints. The main contributions of this paper are as follows. (1) We propose an algorithm, LCAP\(^2\), that designs extra rewards and accelerates training by solving a multi-objective optimization problem. The experimental results show that the method effectively accelerates the convergence of the lengths of the paths traversed during the agent’s training. (2) We provide a convergence theorem based on fixed-point theory and the contraction mapping theorem.
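The abstract does not state which operator the convergence theorem concerns, so the following is only a minimal sketch of the general idea behind contraction-based convergence arguments in RL, not the paper's LCAP\(^2\) algorithm or its proof. It assumes the standard Bellman optimality operator for Q-values on a hypothetical five-state line world; the discount factor, the dynamics, and all names (`step`, `bellman`, `sup_dist`) are illustrative assumptions. The sketch checks numerically that the operator shrinks the sup-norm distance between two arbitrary Q-tables by at least the discount factor, and that repeated application approaches a fixed point.

```python
import random

GAMMA = 0.9          # discount factor (assumed value, not from the paper)
N_STATES = 5         # hypothetical toy task: states 0..4 on a line
ACTIONS = (-1, +1)   # move left or move right


def step(state, action):
    """Deterministic toy dynamics: move along the line, clipped at both ends.

    Entering (or staying in) the last state yields reward 1, otherwise 0.
    """
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def bellman(q):
    """Apply the Bellman optimality operator T once to a Q-table."""
    new_q = {}
    for s in range(N_STATES):
        for a in ACTIONS:
            nxt, r = step(s, a)
            new_q[(s, a)] = r + GAMMA * max(q[(nxt, b)] for b in ACTIONS)
    return new_q


def sup_dist(q1, q2):
    """Sup-norm distance between two Q-tables over the same keys."""
    return max(abs(q1[k] - q2[k]) for k in q1)


def random_q(rng):
    """Arbitrary starting Q-table with entries in [-10, 10]."""
    return {(s, a): rng.uniform(-10.0, 10.0)
            for s in range(N_STATES) for a in ACTIONS}


rng = random.Random(0)
q1, q2 = random_q(rng), random_q(rng)

# Contraction property: ||T q1 - T q2|| <= GAMMA * ||q1 - q2||.
before = sup_dist(q1, q2)
after = sup_dist(bellman(q1), bellman(q2))
contracts = after <= GAMMA * before + 1e-12

# Iterating T from an arbitrary start approaches the unique fixed point.
q = q1
for _ in range(200):
    q = bellman(q)
residual = sup_dist(q, bellman(q))  # close to 0 near the fixed point
```

Because the operator is a contraction in the sup norm, the contraction mapping theorem guarantees a unique fixed point and convergence from any starting table; the `residual` value measures how close the iterate is to that fixed point.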
This research is supported by the National Natural Science Foundation of China under Grant Nos. 62272359 and 62172322, and by the Natural Science Basic Research Program of Shaanxi Province under Grant Nos. 2023JC-XJ-13 and 2022JM-367.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, C., Zhang, N., Duan, Z., Tian, C. (2024). An Approach to Agent Path Planning Under Temporal Logic Constraints. In: Wu, W., Tong, G. (eds) Computing and Combinatorics. COCOON 2023. Lecture Notes in Computer Science, vol 14423. Springer, Cham. https://doi.org/10.1007/978-3-031-49193-1_7
DOI: https://doi.org/10.1007/978-3-031-49193-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49192-4
Online ISBN: 978-3-031-49193-1
eBook Packages: Computer Science, Computer Science (R0)