Abstract
Optimizing the scheduling of maintenance tasks is important and challenging for improving oil and gas exploitation efficiency. Traditionally, this problem is addressed with exact algorithms, metaheuristic algorithms, or solvers; however, owing to its large scale, these approaches often fail in practical use. To address this, a compositional message passing neural network (CMPNN) is introduced for graph embedding: the representation of the whole graph is obtained by aggregating the messages of neighboring nodes and serves as the input to the subsequent framework. Building on CMPNN, a framework combining two-stage Graph Attention Networks and Q-learning (TSGAT+Q-learning) is proposed in this paper. In the first stage, agent embedding is performed, i.e., each service technician's information is represented by a constructed graph; in the second stage, each maintenance task selects an agent according to a probability distribution. In this way, a task assignment scheme is obtained, and Q-learning is then applied for further optimization. In addition, a key contribution is a novel exponential reward designed to accelerate model training with the REINFORCE algorithm. To validate the effectiveness of the proposed method, scenarios of different scales are evaluated. In most cases, TSGAT+Q-learning outperforms CPLEX, OR-Tools, and other learning-based algorithms. Moreover, the trained networks can also solve instances with varying numbers of maintenance tasks, indicating that TSGAT+Q-learning has good generalization ability. Finally, the proposed method also proves effective for the on-site maintenance task scheduling problem with multiple constraints.
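The pipeline the abstract describes (neighbor-message aggregation, probabilistic task-to-technician assignment, and an exponential reward for REINFORCE) can be illustrated with a minimal sketch. This is not the authors' implementation: the weight matrix `w`, the attention scores, and the reward coefficient `alpha` are hypothetical stand-ins chosen only to make the mechanics concrete.

```python
import numpy as np

def aggregate_messages(adj, feats, w):
    """One message-passing step in the spirit of CMPNN (sketch):
    each node's new embedding aggregates its neighbors' features,
    then passes through a learned transform and nonlinearity."""
    msgs = adj @ feats          # sum of neighbor features per node
    return np.tanh(msgs @ w)    # nonlinear update (w is a stand-in weight)

def select_agents(task_scores, seed=0):
    """Stage-two assignment (sketch): each maintenance task picks a
    technician with probability given by a softmax over its scores."""
    e = np.exp(task_scores - task_scores.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    rng = np.random.default_rng(seed)
    choices = np.array([rng.choice(len(p), p=p) for p in probs])
    return choices, probs

def exponential_reward(makespan, alpha=0.05):
    """Hypothetical exponential reward: shorter schedules earn
    exponentially larger rewards, sharpening the gradient signal
    that REINFORCE receives during training."""
    return np.exp(-alpha * makespan)
```

In a REINFORCE update, `exponential_reward` would weight the log-probability of each sampled assignment, so small improvements in makespan translate into disproportionately larger reward differences early in training.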
Graphical abstract
A Two-stage Graph Attention Networks and Q-learning Framework Based Maintenance Tasks Scheduling
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
References
Zhang Q, Liu Y, Xiahou T, Huang HZ (2023) A heuristic maintenance scheduling framework for a military aircraft fleet under limited maintenance capacities. Reliab Eng Syst Saf 235:109239
George B, Loo J, Jie W (2023) Novel multi-objective optimisation for maintenance activities of floating production storage and offloading facilities. Appl Ocean Res 130:103440
Valet A et al (2022) Opportunistic maintenance scheduling with deep reinforcement learning. J Manuf Syst 64:518–534
Yan Q, Wang H (2022) Double-layer Q-learning-based joint decision-making of dual resource-constrained aircraft assembly scheduling and flexible preventive maintenance. IEEE Trans Aerosp Electron Syst 58:4938–4952
dos Santos Pereira GM et al (2022) Quasi-dynamic operation and maintenance plan for photovoltaic systems in remote areas: The framework of pantanal-ms. Renew Energy 181:404–416
Zhang C, Gao Y, Yang L, Gao Z, Qi J (2020) Joint optimization of train scheduling and maintenance planning in a railway network: A heuristic algorithm using lagrangian relaxation. Transp Res B Methodol 134:64–92
Cheikhrouhou O, Khoufi I (2021) A comprehensive survey on the multiple traveling salesman problem: Applications, approaches and taxonomy. Comput Sci Rev 40:100369
Yang X, Feng R, Xu P, Wang X, Qi M (2023) Internet-of-things-augmented dynamic route planning approach to the airport baggage handling system. Comput Ind Eng 75:108802
Ertem M, As'ad R, Awad M, Al-Bar A (2022) Workers-constrained shutdown maintenance scheduling with skills flexibility: Models and solution algorithms. Comput Ind Eng 172:108575
Seif Z, Mardaneh E, Loxton R, Lockwood A (2021) Minimizing equipment shutdowns in oil and gas campaign maintenance. J Oper Res Soc 72:1486–1504
Wang X, Wang S, Xu Q (2022) Simultaneous production and maintenance scheduling for refinery front-end process with considerations of risk management and resource availability. Ind Eng Chem Res 61:2152–2166
Santos IM, Hamacher S, Oliveira F (2023) A data-driven optimization model for the workover rig scheduling problem: Case study in an oil company. Comput Chem Eng 170:108088
Sivanandam SN, Deepa SN (2008) Genetic algorithms. Springer
Dorigo M, Birattari M, Stutzle T (2006) Ant colony optimization. IEEE Comput Intell Mag 1:28–39
Wang Q, Hao Y, Zhang J (2023) Generative inverse reinforcement learning for learning 2-opt heuristics without extrinsic rewards in routing problems. Journal of King Saud University-Computer and Information Sciences 35:101787
Mathlouthi I, Gendreau M, Potvin JY (2021) A metaheuristic based on tabu search for solving a technician routing and scheduling problem. Comput Oper Res 125:105079
Chen C, Demir E, Huang Y (2021) An adaptive large neighborhood search heuristic for the vehicle routing problem with time windows and delivery robots. Eur J Oper Res 294:1164–1180
Stodola P, Michenka K, Nohel J, Rybanskỳ M (2020) Hybrid algorithm based on ant colony optimization and simulated annealing applied to the dynamic traveling salesman problem. Entropy 22:884
Shi S, Xiong H, Li G (2023) A no-tardiness job shop scheduling problem with overtime consideration and the solution approaches. Comput Ind Eng 178:109115
Gupta R, Nanda SJ (2021) Solving dynamic many-objective TSP using NSGA-III equipped with SVR-RBF kernel predictor, pp 95–102
Rostami AS, Mohanna F, Keshavarz H, Hosseinabadi AAR (2015) Solving multiple traveling salesman problem using the gravitational emulation local search algorithm. Appl Math Inform Sci 9:1–11
Lesch V, König M, Kounev S, Stein A, Krupitzer C (2022) Tackling the rich vehicle routing problem with nature-inspired algorithms. Appl Intell 52:9476–9500
Helsgaun K (2017) An extension of the Lin–Kernighan–Helsgaun TSP solver for constrained traveling salesman and vehicle routing problems. Roskilde University, vol 12
Lin S, Kernighan BW (1973) An effective heuristic algorithm for the traveling-salesman problem. Oper Res 21:498–516
Mazyavkina N, Sviridov S, Ivanov S, Burnaev E (2021) Reinforcement learning for combinatorial optimization: A survey. Comput Oper Res 134:105400
Chen X, Tian Y (2019) Learning to perform local rewriting for combinatorial optimization. Adv Neural Inform Process Syst vol 32
Stahlberg F (2020) Neural machine translation: A review. J Artif Intell Res 69:343–418
Kool W, Van Hoof H, Welling M (2018) Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475
Kwon YD et al (2020) Pomo: Policy optimization with multiple optima for reinforcement learning. Adv Neural Inform Process Syst 33:21188–21198
Zhou J et al (2023) Learning large neighborhood search for vehicle routing in airport ground handling. IEEE Trans Knowl Data Eng
Qin W, Zhuang Z, Huang Z, Huang H (2021) A novel reinforcement learning-based hyper-heuristic for heterogeneous vehicle routing problem. Comput Ind Eng 156:107252
Wu Y, Song W, Cao Z, Zhang J, Lim A (2021) Learning improvement heuristics for solving routing problems. IEEE Trans Neural Netw Learn Syst 33:5057–5069
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Netw 20:61–80
Vesselinova N, Steinert R, Perez-Ramirez DF, Boman M (2020) Learning combinatorial optimization on graphs: A survey with applications to networking. IEEE Access 8:120388–120416
Hu Y et al (2021) A bidirectional graph neural network for traveling salesman problems on arbitrary symmetric graphs. Eng Appl Artif Intell 97:104061
Wang Q (2022) Varl: a variational autoencoder-based reinforcement learning framework for vehicle routing problems. Appl Intell pp 1–14
Pan W, Liu SQ (2023) Deep reinforcement learning for the dynamic and uncertain vehicle routing problem. Appl Intell 53:405–422
Chen Z, Zhang L, Wang X, Wang K (2023) Cloud–edge collaboration task scheduling in cloud manufacturing: An attention-based deep reinforcement learning approach. Comput Ind Eng 177:109053
Hu J, Wang Y, Pang Y, Liu Y (2022) Optimal maintenance scheduling under uncertainties using linear programming-enhanced reinforcement learning. Eng Appl Artif Intell 109:104655
Huang J, Su J, Chang Q (2022) Graph neural network and multi-agent reinforcement learning for machine-process-system integrated control to optimize production yield. J Manuf Syst 64:81–93
Wang Y, Qiu D, Wang Y, Sun M, Strbac G (2023) Graph learning-based voltage regulation in distribution networks with multi-microgrids. IEEE Trans Power Syst
Ding S et al (2023) Multiagent reinforcement learning with graphical mutual information maximization. IEEE Trans Neural Netw Learn Syst
Pu Z, Wang H, Liu Z, Yi J, Wu S (2022) Attention enhanced reinforcement learning for multi agent cooperation. IEEE Trans Neural Netw Learn Syst
Gao X et al (2023) Reinforcement learning based optimization algorithm for maintenance tasks scheduling in coalbed methane gas field. Comput Chem Eng 170:108131
Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. PMLR, pp 1263–1272
Watkins CJ, Dayan P (1992) Q-learning. Mach Learn 8:279–292
Nickel S, Steinhardt C, Schlenker H, Burkart W (2022) IBM ILOG CPLEX Optimization Studio—a primer. Springer, pp 9–21
Perron L, Furnon V (2019) OR-Tools. Google. [Online]. Available: https://developers.google.com/optimization
Funding
This work was supported by the National Natural Science Foundation of China (Grant numbers 22178383 and 21706282), Beijing Natural Science Foundation (Grant number 2232021) and the Research Foundation of China University of Petroleum (Beijing) (Grant number 2462020BJRC004).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gao, X., Peng, D., Yang, Y. et al. Two-stage graph attention networks and Q-learning based maintenance tasks scheduling. Appl Intell 55, 331 (2025). https://doi.org/10.1007/s10489-025-06249-z