Abstract
Automated valet parking (AVP) is one of the most advanced technologies for improving parking efficiency and security. However, in an AVP system, the traditional vehicle-side greedy search for available parking spaces is likely to yield low global efficiency and poses a high risk of collision. Therefore, in this study, a system-side cooperative approach based on deep reinforcement learning (DRL) is proposed to solve the parking space allocation problem in a large AVP environment. First, the parking space allocation problem is formulated as a Markov decision process (MDP). Then, a reward shaping method oriented to the global objective is designed. Next, because current reinforcement learning methods are difficult to apply to parking space allocation, which involves a large number of discrete actions, a cost-based action embedding method is proposed that embeds the discrete parking actions in a continuous space over which the actor can generalize. After action embedding, the deep deterministic policy gradient (DDPG) is employed as the training algorithm. The experimental results show that the proposed DRL-based cooperative approach converges on the parking space allocation problem in a large AVP system and improves global AVP efficiency more than the other parking methods do.
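The action-embedding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two cost features per space, the min-max normalization, the nearest-neighbor lookup (in the style of Dulac-Arnold et al.'s approach to large discrete action spaces), and the names `embed_spaces`, `select_space`, and `q_value` are all assumptions made for illustration. The idea shown is only that each discrete parking space becomes a point in a continuous cost space, so a continuous actor output can be mapped back to a free space.

```python
import numpy as np

def embed_spaces(costs):
    """Min-max normalize per-space cost features (hypothetical features)
    so that every parking space becomes a point in [0, 1]^d."""
    costs = np.asarray(costs, dtype=float)
    lo, hi = costs.min(axis=0), costs.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (costs - lo) / span

def select_space(proto_action, embeddings, available, q_value, k=3):
    """Map a continuous proto-action to a free parking space.

    Takes the k free spaces closest to the proto-action in the embedding
    and returns the index of the one scored highest by q_value, which is
    a stand-in for a learned critic (an assumption, not from the paper).
    """
    free = np.flatnonzero(available)
    if free.size == 0:
        raise ValueError("no free parking space")
    dists = np.linalg.norm(embeddings[free] - proto_action, axis=1)
    nearest = free[np.argsort(dists)[:k]]
    scores = [q_value(embeddings[i]) for i in nearest]
    return int(nearest[int(np.argmax(scores))])

# Hypothetical (driving cost, walking cost) for four spaces.
costs = [[10.0, 5.0], [20.0, 3.0], [30.0, 1.0], [40.0, 8.0]]
emb = embed_spaces(costs)
available = np.array([True, False, True, True])  # space 1 is occupied

# A continuous actor output, here fixed by hand for the example.
proto = np.array([0.3, 0.5])
chosen = select_space(proto, emb, available,
                      q_value=lambda e: -float(e.sum()), k=2)
```

In a DDPG setting, `proto` would come from the actor network and `q_value` from the critic; here both are hand-written placeholders so the lookup itself can be run in isolation.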
Data Availability
The data used to support the results of this study are available from the corresponding author upon request.
Acknowledgements
This work was sponsored by the National Natural Science Foundation of China (No. U1811463 and No. U21B2090) and the Fundamental Research Funds for the Central Universities, Sun Yat-sen University (No. 22qntd1713).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Xie, J., He, Z. & Zhu, Y. A DRL based cooperative approach for parking space allocation in an automated valet parking system. Appl Intell 53, 5368–5387 (2023). https://doi.org/10.1007/s10489-022-03757-0
DOI: https://doi.org/10.1007/s10489-022-03757-0