Abstract
In flight test engineering, the flight test task schedule strongly influences an aircraft's delivery milestones and development cost, yet dynamic events frequently occur during real flight test campaigns, disrupting schedule execution and test progress. To adaptively adjust the real-world flight test schedule, this paper proposes a deep reinforcement learning (DRL) approach to the dynamic task scheduling problem for flight tests, with flight test duration and task tardiness as objectives. First, the task scheduling characteristics of flight tests are introduced, and a mixed-integer programming (MIP) model is constructed. The problem is then formulated as a Markov decision process (MDP) with carefully designed state features, reward functions, and an action space built on heuristic rules for selecting an uncompleted flight test task and allocating it to an appropriate aircraft. Proximal policy optimization (PPO) is adopted to train and learn the optimal scheduling policy. Finally, extensive experiments verify the proposed method's effectiveness and efficiency in constructing high-quality flight test task schedules in a dynamic flight test environment.
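To make the abstract's MDP formulation concrete, the sketch below illustrates one common way such a rule-based action space can be wired: each discrete action pairs a task-selection heuristic with an aircraft-allocation heuristic, and the reward penalizes both elapsed schedule time and tardiness. All rule names, state fields, and the reward shape here are illustrative assumptions, not the paper's exact design.

```python
# Illustrative sketch (rule names, state fields, and reward shape are
# assumptions for demonstration, not the paper's exact formulation).

# Task-selection heuristics: pick the next uncompleted flight test task.
TASK_RULES = {
    "EDD": lambda tasks: min(tasks, key=lambda t: t["due"]),       # earliest due date
    "SPT": lambda tasks: min(tasks, key=lambda t: t["duration"]),  # shortest processing time
}

# Aircraft-allocation heuristics: pick the aircraft to fly the task.
AIRCRAFT_RULES = {
    "EFT": lambda fleet: min(fleet, key=lambda a: a["free_at"]),   # earliest-free aircraft
}

# A discrete action is a (task rule, aircraft rule) pairing.
ACTIONS = [(t, a) for t in TASK_RULES for a in AIRCRAFT_RULES]


def step(state, action_idx):
    """Apply one composite dispatch action; return the updated state and reward."""
    task_rule, ac_rule = ACTIONS[action_idx]
    task = TASK_RULES[task_rule](state["pending"])
    aircraft = AIRCRAFT_RULES[ac_rule](state["aircraft"])
    start = aircraft["free_at"]
    finish = start + task["duration"]
    aircraft["free_at"] = finish
    state["pending"].remove(task)
    tardiness = max(0, finish - task["due"])
    # Negative reward so the agent minimizes both completion time and tardiness.
    reward = -(finish + tardiness)
    return state, reward


state = {
    "pending": [
        {"id": 1, "duration": 3, "due": 4},
        {"id": 2, "duration": 1, "due": 2},
    ],
    "aircraft": [{"id": "A", "free_at": 0}],
}
state, reward = step(state, 0)  # action 0 = ("EDD", "EFT")
```

In a full DRL setup, a policy network would map the scheduling state features to a distribution over `ACTIONS`, and PPO would update that policy from the accumulated rewards.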
Data availability
The data are available from the corresponding author upon reasonable request.
Funding
This work was supported by the National Natural Science Foundation of China (61673270, 61973212) and the Science and Technology Program of Zhejiang Province (2022C01013).
Author information
Authors and Affiliations
Contributions
Bei Tian wrote the main manuscript text and performed the experiment; Gang Xiao contributed to the conception of the study; and Yu Shen contributed to the manuscript preparation. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tian, B., Xiao, G. & Shen, Y. A deep reinforcement learning approach for dynamic task scheduling of flight tests. J Supercomput 80, 18761–18796 (2024). https://doi.org/10.1007/s11227-024-06167-w