Abstract
Given the complex operational environment and massive multi-source data, the capability for multi-domain cooperative resource scheduling (MDCRS) has become extremely important. Optimal scheduling can reduce operating costs and time, yet MDLS remains the most commonly used algorithm in combat task scheduling today, despite its defects. This research provides a plausible new method for the MDCRS problem: a resource scheduling method based on deep reinforcement learning (DRL), which has proven effective for other scheduling problems. Aiming at the resource scheduling problem in multi-domain cooperative operations under timing constraints, an MDCRS model is created with the shortest completion time as the objective function. On this basis, the paper presents an MDCRS-MDP model based on Markov decision processes, in which a two-dimensional action space that simultaneously selects a task and matches it to a platform is designed, and a dense reward function closely tied to the sparse makespan-minimization criterion is provided. Based on the MDCRS-MDP model, a DRL-based resource scheduling approach is proposed that covers both task-platform matching and task sequencing. Finally, using a joint landing operation scenario, the experimental results verify the effectiveness of the proposed method for solving MDCRS and demonstrate significant advantages over traditional dispatching rules and meta-heuristic optimization algorithms.
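As a rough illustration of the MDP ingredients named above, the sketch below shows a toy environment with a two-dimensional action (which task to run next, and on which platform) and a dense reward defined as the negative increment in makespan, so that the episode return equals the negative final makespan and aligns with the sparse minimization objective. All names and the duration data are hypothetical; this is a minimal sketch of the general idea, not the paper's actual model.

```python
class MDCRSEnvSketch:
    """Toy MDCRS-style MDP sketch (hypothetical, not the paper's model).

    State: per-platform ready times plus the set of remaining tasks.
    Action: a pair (task, platform) -- task sequencing and task-platform
    matching decided simultaneously (the "two-dimensional action space").
    Reward: dense, the negative increase in makespan caused by the
    assignment, so the episode return sums to -makespan.
    """

    def __init__(self, durations, n_platforms):
        # durations[t][p]: processing time of task t on platform p
        self.durations = durations
        self.n_platforms = n_platforms
        self.reset()

    def reset(self):
        self.ready = [0.0] * self.n_platforms  # per-platform ready time
        self.remaining = set(range(len(self.durations)))
        self.makespan = 0.0
        return self._obs()

    def _obs(self):
        return (tuple(self.ready), frozenset(self.remaining))

    def step(self, action):
        task, platform = action
        assert task in self.remaining, "task already scheduled"
        # The chosen platform finishes this task after its current backlog.
        finish = self.ready[platform] + self.durations[task][platform]
        self.ready[platform] = finish
        self.remaining.discard(task)
        new_makespan = max(self.makespan, finish)
        # Dense reward: only the makespan *increment* is penalized.
        reward = -(new_makespan - self.makespan)
        self.makespan = new_makespan
        done = not self.remaining
        return self._obs(), reward, done


# Usage: two tasks, two platforms; assign task 0 -> platform 0,
# then task 1 -> platform 1, and sum the dense rewards.
env = MDCRSEnvSketch(durations=[[2.0, 3.0], [4.0, 1.0]], n_platforms=2)
env.reset()
_, r1, _ = env.step((0, 0))
_, r2, done = env.step((1, 1))
total_return = r1 + r2  # equals -makespan
```

Because each step charges only the growth in makespan, a standard policy-gradient learner such as PPO receives feedback at every decision rather than a single sparse signal at the end of the schedule.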
Data availability
Data will be made available on request.
Acknowledgements
Research for this paper was supported by the Equipment advance research project (50912020401), the project of Xiangjiang Laboratory (No.22XJ02003), the National Natural Science Foundation (No.62122093), and the Natural Science Basic Research Plan in Shanxi Province of China (No.2018JM6011). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
Author information
Authors and Affiliations
Contributions
HL contributed to conceptualization, methodology, and research management. ZH was involved in methodology and provided software. RW was involved in conceptualization and methodology. KH contributed to conceptualization and methodology. GC was involved in conceptualization and methodology.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, H., He, Z., Wang, R. et al. A new multi-domain cooperative resource scheduling method using proximal policy optimization. Neural Comput & Applic 36, 4931–4945 (2024). https://doi.org/10.1007/s00521-023-09326-x