Abstract
Given the complex operational environment and massive multi-source data, the capability for multi-domain cooperative resource scheduling (MDCRS) has become extremely important. Optimal scheduling can reduce operating costs and time, yet MDLS remains the most commonly used algorithm in combat task scheduling today, despite its defects. This research provides a plausible new method for the MDCRS problem: a resource scheduling method based on deep reinforcement learning (DRL), which has proven effective for other scheduling problems. Aiming at the resource scheduling problem in multi-domain cooperative operations under timing constraints, an MDCRS model is created with the shortest completion time as the objective function. On this basis, the paper presents an MDCRS-MDP model based on Markov decision processes, in which a two-dimensional action space that simultaneously selects a task and matches it to a platform is designed, and a dense reward function closely tied to the sparse makespan-minimization criterion is provided. Based on the MDCRS-MDP model, a DRL-based resource scheduling approach is proposed that covers both task-platform matching and task sequencing. Finally, using a joint landing operation scenario, the experimental results verify the effectiveness of the proposed method for solving MDCRS and demonstrate significant advantages over traditional dispatching rules and meta-heuristic optimization algorithms.
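As a rough illustration of the MDP ingredients named above, the sketch below shows a toy environment with a two-dimensional action (which task to run next, and on which platform) and a dense reward defined as the negative increment in makespan, so that the episode return equals the negative final makespan and aligns with the sparse minimization objective. All names and the duration data are hypothetical; this is a minimal sketch of the general idea, not the paper's actual model.

```python
class MDCRSEnvSketch:
    """Toy MDCRS-style MDP sketch (hypothetical, not the paper's model).

    State: per-platform ready times plus the set of remaining tasks.
    Action: a pair (task, platform) -- task sequencing and task-platform
    matching decided simultaneously (the "two-dimensional action space").
    Reward: dense, the negative increase in makespan caused by the
    assignment, so the episode return sums to -makespan.
    """

    def __init__(self, durations, n_platforms):
        # durations[t][p]: processing time of task t on platform p
        self.durations = durations
        self.n_platforms = n_platforms
        self.reset()

    def reset(self):
        self.ready = [0.0] * self.n_platforms  # per-platform ready time
        self.remaining = set(range(len(self.durations)))
        self.makespan = 0.0
        return self._obs()

    def _obs(self):
        return (tuple(self.ready), frozenset(self.remaining))

    def step(self, action):
        task, platform = action
        assert task in self.remaining, "task already scheduled"
        # The chosen platform finishes this task after its current backlog.
        finish = self.ready[platform] + self.durations[task][platform]
        self.ready[platform] = finish
        self.remaining.discard(task)
        new_makespan = max(self.makespan, finish)
        # Dense reward: only the makespan *increment* is penalized.
        reward = -(new_makespan - self.makespan)
        self.makespan = new_makespan
        done = not self.remaining
        return self._obs(), reward, done


# Usage: two tasks, two platforms; assign task 0 -> platform 0,
# then task 1 -> platform 1, and sum the dense rewards.
env = MDCRSEnvSketch(durations=[[2.0, 3.0], [4.0, 1.0]], n_platforms=2)
env.reset()
_, r1, _ = env.step((0, 0))
_, r2, done = env.step((1, 1))
total_return = r1 + r2  # equals -makespan
```

Because each step charges only the growth in makespan, a standard policy-gradient learner such as PPO receives feedback at every decision rather than a single sparse signal at the end of the schedule.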
Data availability
Data will be made available on request.
Acknowledgements
Research for this paper was supported by the Equipment advance research project (50912020401), the project of Xiangjiang Laboratory (No.22XJ02003), the National Natural Science Foundation (No.62122093), and the Natural Science Basic Research Plan in Shanxi Province of China (No.2018JM6011). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
Author information
Authors and Affiliations
Contributions
HL contributed to conceptualization, methodology, and research management. ZH was involved in methodology and provided software. RW was involved in conceptualization and methodology. KH contributed to conceptualization and methodology. GC was involved in conceptualization and methodology.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, H., He, Z., Wang, R. et al. A new multi-domain cooperative resource scheduling method using proximal policy optimization. Neural Comput & Applic 36, 4931–4945 (2024). https://doi.org/10.1007/s00521-023-09326-x