Abstract
In flight test engineering, the flight test task schedule strongly influences an aircraft's delivery milestones and development cost, yet dynamic events frequently occur during real flight test campaigns, disrupting schedule execution and test progress. To adaptively adjust the real-world flight test schedule, this paper proposes a deep reinforcement learning (DRL) approach to the dynamic task scheduling problem for flight tests, with flight test duration and task tardiness as objectives. First, the task scheduling characteristics of flight tests are introduced, and a mixed-integer programming (MIP) model is constructed. The problem is then formulated as a Markov decision process (MDP) with carefully designed state features, reward functions, and an action space built on heuristic rules for selecting an uncompleted flight test task and allocating it to an appropriate aircraft. Proximal policy optimization (PPO) is adopted to train and learn the optimal scheduling policy. Finally, extensive experiments verify the proposed method's effectiveness and efficiency in constructing high-quality flight test task schedules in a dynamic flight test environment.
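To make the abstract's MDP formulation concrete, the sketch below illustrates one common way such a rule-based action space can be wired: each discrete action pairs a task-selection heuristic with an aircraft-allocation heuristic, and the reward penalizes both elapsed schedule time and tardiness. All rule names, state fields, and the reward shape here are illustrative assumptions, not the paper's exact design.

```python
# Illustrative sketch (rule names, state fields, and reward shape are
# assumptions for demonstration, not the paper's exact formulation).

# Task-selection heuristics: pick the next uncompleted flight test task.
TASK_RULES = {
    "EDD": lambda tasks: min(tasks, key=lambda t: t["due"]),       # earliest due date
    "SPT": lambda tasks: min(tasks, key=lambda t: t["duration"]),  # shortest processing time
}

# Aircraft-allocation heuristics: pick the aircraft to fly the task.
AIRCRAFT_RULES = {
    "EFT": lambda fleet: min(fleet, key=lambda a: a["free_at"]),   # earliest-free aircraft
}

# A discrete action is a (task rule, aircraft rule) pairing.
ACTIONS = [(t, a) for t in TASK_RULES for a in AIRCRAFT_RULES]


def step(state, action_idx):
    """Apply one composite dispatch action; return the updated state and reward."""
    task_rule, ac_rule = ACTIONS[action_idx]
    task = TASK_RULES[task_rule](state["pending"])
    aircraft = AIRCRAFT_RULES[ac_rule](state["aircraft"])
    start = aircraft["free_at"]
    finish = start + task["duration"]
    aircraft["free_at"] = finish
    state["pending"].remove(task)
    tardiness = max(0, finish - task["due"])
    # Negative reward so the agent minimizes both completion time and tardiness.
    reward = -(finish + tardiness)
    return state, reward


state = {
    "pending": [
        {"id": 1, "duration": 3, "due": 4},
        {"id": 2, "duration": 1, "due": 2},
    ],
    "aircraft": [{"id": "A", "free_at": 0}],
}
state, reward = step(state, 0)  # action 0 = ("EDD", "EFT")
```

In a full DRL setup, a policy network would map the scheduling state features to a distribution over `ACTIONS`, and PPO would update that policy from the accumulated rewards.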
Data availability
The data are available from the corresponding author upon reasonable request.
Funding
This work was supported by the National Natural Science Foundation of China (61673270, 61973212) and the Science and Technology Program of Zhejiang Province (2022C01013).
Author information
Authors and Affiliations
Contributions
Bei Tian wrote the main manuscript text and performed the experiment; Gang Xiao contributed to the conception of the study; and Yu Shen contributed to the manuscript preparation. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tian, B., Xiao, G. & Shen, Y. A deep reinforcement learning approach for dynamic task scheduling of flight tests. J Supercomput 80, 18761–18796 (2024). https://doi.org/10.1007/s11227-024-06167-w