Abstract
Flight path planning is essential for missiles in long-range air-to-ground strike missions. When missile guidance and guidance handover constraints are considered and the planned path must conform to the missile motion model, the missile’s allowable flight space and flight modes are further restricted, which significantly increases the decision-making scale and difficulty of the path planning problem. This paper proposes a twin delayed deep deterministic policy gradient algorithm incorporating a genetic algorithm (GA-TD3) for missile path planning, which uses high-quality data generated by the GA to improve the TD3 training effect. First, a missile path planning model is established based on the missile’s motion equations, and the missile guidance and guidance handover constraints are stated in detail. Then, a fast path generation method is proposed that uses several leading points to generate a leading path based on optimal control theory, with the genetic algorithm used to improve the leading path quality. Finally, a deep reinforcement learning model for the missile path planning problem is established based on the TD3 framework, and the leading paths participate in leading training to improve the training effect. Simulation cases with 4 threat areas and 3 guidance platforms demonstrate the efficiency of GA-TD3. Furthermore, the influence of three factors on the algorithm’s performance is tested: the leading path quality, the number of leading paths, and the leading training cycle.
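The idea of using GA-refined leading paths as training data can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the 2-D scenario, the circular threat areas, the cost function, the GA operators, the reward shaping, and the names `refine_leading_points` and `path_to_transitions`. Straight segments between leading points stand in for the optimal-control path generation, the missile motion model and guidance handover constraints are omitted, and seeding a replay buffer is only one plausible reading of how leading paths might "participate in leading training".

```python
import math
import random
from collections import deque

# Hypothetical 2-D scenario: circular threat areas given as (x, y, radius).
THREATS = [(30.0, 20.0, 8.0), (60.0, 45.0, 10.0)]
START, GOAL = (0.0, 0.0), (100.0, 60.0)

def path_cost(points):
    """Path length plus a large penalty per leading point inside a threat area.

    Only the leading points are checked, not the segments between them.
    """
    path = [START, *points, GOAL]
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    penalty = sum(1000.0 for p in points for (cx, cy, r) in THREATS
                  if math.dist(p, (cx, cy)) < r)
    return length + penalty

def refine_leading_points(n_points=3, pop_size=30, generations=50, seed=0):
    """Tiny GA (assumed operators): truncation selection with elitism,
    uniform crossover, and Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[(rng.uniform(0, 100), rng.uniform(0, 60)) for _ in range(n_points)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=path_cost)
        elite = pop[:pop_size // 2]          # best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [(x + rng.gauss(0, 2.0), y + rng.gauss(0, 2.0))
                     for x, y in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=path_cost)

def path_to_transitions(points):
    """Convert a leading path into (state, action, reward, next_state, done)
    tuples; the reward shaping here is a placeholder."""
    path = [START, *points, GOAL]
    transitions = []
    for i, (s, s_next) in enumerate(zip(path, path[1:])):
        action = (s_next[0] - s[0], s_next[1] - s[1])
        done = i == len(path) - 2
        reward = -math.dist(s, s_next) + (100.0 if done else 0.0)
        transitions.append((s, action, reward, s_next, done))
    return transitions

# Seed the replay buffer with leading-path data before the agent's own
# exploration data arrives.
replay_buffer = deque(maxlen=10_000)
best_points = refine_leading_points()
replay_buffer.extend(path_to_transitions(best_points))
```

In a full off-policy setup, a TD3 agent would then sample minibatches from `replay_buffer` in the usual way, so early updates could draw on these higher-quality transitions.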


















Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Funding
This work is supported by the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University (Grant No. CX2022019), the National Natural Science Foundation of China (Grant Nos. 61903305 and 62073267), and the Fundamental Research Funds for the Central Universities (Grant No. HXGJXM202214).
Author information
Contributions
SX: Conceptualization, Methodology, Software, Writing—Original Draft. WB: Methodology, Investigation, Writing—Review and Editing. AZ: Conceptualization, Resources, Supervision. YW: Formal analysis, Writing—Review and Editing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Consent to participate
Not applicable.
Consent to publish
Not applicable.
Ethical approval
Not applicable.
About this article
Cite this article
Xu, S., Bi, W., Zhang, A. et al. A deep reinforcement learning approach incorporating genetic algorithm for missile path planning. Int. J. Mach. Learn. & Cyber. 15, 1795–1814 (2024). https://doi.org/10.1007/s13042-023-01998-0
DOI: https://doi.org/10.1007/s13042-023-01998-0