
A deep reinforcement learning approach for dynamic task scheduling of flight tests

Published in The Journal of Supercomputing

Abstract

In flight test engineering, the flight test task schedule strongly affects an aircraft's delivery date and development cost. In real flight test campaigns, however, dynamic events occur frequently, disrupting schedule execution and flight test progress. To adaptively adjust the real-world flight test schedule, this paper proposes a deep reinforcement learning (DRL) approach to the dynamic task scheduling problem for flight tests, with the objectives of minimizing flight test duration and task tardiness. First, the task scheduling characteristics of flight tests are introduced and a mixed-integer programming (MIP) model is constructed. The problem is then formulated as a Markov decision process (MDP) with well-designed state features, reward functions, and an action space built on heuristic rules for selecting an uncompleted flight test task and allocating it to an appropriate aircraft. Proximal policy optimization (PPO) is adopted to train and learn the scheduling policy. Finally, extensive experiments verify the proposed method's effectiveness and efficiency in constructing a high-quality flight test task schedule in a dynamic flight test environment.
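The MDP structure described in the abstract can be illustrated with a toy sketch. Everything below is a hypothetical simplification: the paper's actual state features, reward shaping, and heuristic rules are not reproduced here. Tasks are (duration, due date) pairs, the action space consists of two illustrative dispatching rules (earliest due date and shortest processing time), the selected task goes to the earliest-available aircraft, and the reward penalizes makespan growth plus tardiness.

```python
class FlightTestEnv:
    """Toy dynamic-scheduling MDP (hypothetical, not the paper's model):
    pending flight test tasks are assigned to aircraft by heuristic rules."""

    def __init__(self, tasks, n_aircraft):
        self.tasks = list(tasks)        # list of (duration, due_date) pairs
        self.n_aircraft = n_aircraft
        self.reset()

    def reset(self):
        self.pending = list(range(len(self.tasks)))
        self.free_at = [0.0] * self.n_aircraft  # aircraft availability times
        self.tardiness = 0.0
        return self.state()

    def state(self):
        # Illustrative state features: pending count, makespan so far,
        # and mean slack of the remaining tasks.
        makespan = max(self.free_at)
        slack = [self.tasks[i][1] - makespan for i in self.pending]
        mean_slack = sum(slack) / len(slack) if slack else 0.0
        return (len(self.pending), makespan, mean_slack)

    def step(self, action):
        # Action space: heuristic task-selection rules.
        if action == 0:  # EDD: earliest due date first
            i = min(self.pending, key=lambda t: self.tasks[t][1])
        else:            # SPT: shortest processing time first
            i = min(self.pending, key=lambda t: self.tasks[t][0])
        self.pending.remove(i)
        # Allocation rule: send the task to the earliest-available aircraft.
        a = min(range(self.n_aircraft), key=lambda k: self.free_at[k])
        dur, due = self.tasks[i]
        finish = self.free_at[a] + dur
        self.free_at[a] = finish
        tardy = max(0.0, finish - due)
        self.tardiness += tardy
        # Reward penalizes both makespan growth and tardiness.
        reward = -(dur + tardy)
        done = not self.pending
        return self.state(), reward, done
```

A PPO agent would observe `state()`, pick a rule index, and be trained on the accumulated rewards; here a fixed rule (e.g. always EDD via `env.step(0)`) already yields a complete schedule when rolled out until `done`.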




Data availability

The data are available from the corresponding author upon reasonable request.


Funding

This work was sponsored by the National Natural Science Foundation of China (61673270, 61973212) and the Science and Technology Program of Zhejiang Province (2022C01013).

Author information


Contributions

Bei Tian wrote the main manuscript text and performed the experiments; Gang Xiao contributed to the conception of the study; and Yu Shen contributed to the manuscript preparation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Gang Xiao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Tian, B., Xiao, G. & Shen, Y. A deep reinforcement learning approach for dynamic task scheduling of flight tests. J Supercomput 80, 18761–18796 (2024). https://doi.org/10.1007/s11227-024-06167-w

