Abstract
Optimizing the scheduling of maintenance tasks is important and challenging for improving oil and gas exploitation efficiency. Traditionally, this problem is addressed with exact algorithms, metaheuristic algorithms, or solvers; however, owing to its large scale, these approaches often fail in practical use. To address this, a compositional message passing neural network (CMPNN) is introduced for graph embedding: the representation of the whole graph is obtained by aggregating the messages of neighboring nodes and serves as the input to the subsequent framework. Building on CMPNN, a framework combining two-stage Graph Attention Networks and Q-learning (TSGAT+Q-learning) is proposed in this paper. In the first stage, agent embedding is performed, i.e., each service technician's information is represented by a constructed graph; in the second stage, each maintenance task selects an agent according to a probability distribution. In this way, a task assignment scheme is obtained, and Q-learning is then applied for further optimization. In addition, a key contribution is a novel exponential reward designed to accelerate model training with the REINFORCE algorithm. To validate the effectiveness of the proposed method, scenarios of different scales are evaluated. In most cases, TSGAT+Q-learning outperforms CPLEX, OR-Tools, and other learning-based algorithms. Moreover, the trained networks can also solve instances with varying numbers of maintenance tasks, indicating that TSGAT+Q-learning has good generalization ability. Finally, the proposed method also proves effective for the on-site maintenance task scheduling problem with multiple constraints.
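The pipeline the abstract describes (neighbor-message aggregation, probabilistic task-to-technician assignment, and an exponential reward for REINFORCE) can be illustrated with a minimal sketch. This is not the authors' implementation: the weight matrix `w`, the attention scores, and the reward coefficient `alpha` are hypothetical stand-ins chosen only to make the mechanics concrete.

```python
import numpy as np

def aggregate_messages(adj, feats, w):
    """One message-passing step in the spirit of CMPNN (sketch):
    each node's new embedding aggregates its neighbors' features,
    then passes through a learned transform and nonlinearity."""
    msgs = adj @ feats          # sum of neighbor features per node
    return np.tanh(msgs @ w)    # nonlinear update (w is a stand-in weight)

def select_agents(task_scores, seed=0):
    """Stage-two assignment (sketch): each maintenance task picks a
    technician with probability given by a softmax over its scores."""
    e = np.exp(task_scores - task_scores.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    rng = np.random.default_rng(seed)
    choices = np.array([rng.choice(len(p), p=p) for p in probs])
    return choices, probs

def exponential_reward(makespan, alpha=0.05):
    """Hypothetical exponential reward: shorter schedules earn
    exponentially larger rewards, sharpening the gradient signal
    that REINFORCE receives during training."""
    return np.exp(-alpha * makespan)
```

In a REINFORCE update, `exponential_reward` would weight the log-probability of each sampled assignment, so small improvements in makespan translate into disproportionately larger reward differences early in training.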
Graphical abstract
A Two-stage Graph Attention Networks and Q-learning Framework Based Maintenance Tasks Scheduling
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
References
Zhang Q, Liu Y, Xiahou T, Huang HZ (2023) A heuristic maintenance scheduling framework for a military aircraft fleet under limited maintenance capacities. Reliab Eng Syst Saf 235:109239
George B, Loo J, Jie W (2023) Novel multi-objective optimisation for maintenance activities of floating production storage and offloading facilities. Appl Ocean Res 130:103440
Valet A et al (2022) Opportunistic maintenance scheduling with deep reinforcement learning. J Manuf Syst 64:518–534
Yan Q, Wang H (2022) Double-layer Q-learning-based joint decision-making of dual resource-constrained aircraft assembly scheduling and flexible preventive maintenance. IEEE Trans Aerosp Electron Syst 58:4938–4952
dos Santos Pereira GM et al (2022) Quasi-dynamic operation and maintenance plan for photovoltaic systems in remote areas: The framework of pantanal-ms. Renew Energy 181:404–416
Zhang C, Gao Y, Yang L, Gao Z, Qi J (2020) Joint optimization of train scheduling and maintenance planning in a railway network: A heuristic algorithm using lagrangian relaxation. Transp Res B Methodol 134:64–92
Cheikhrouhou O, Khoufi I (2021) A comprehensive survey on the multiple traveling salesman problem: Applications, approaches and taxonomy. Comput Sci Rev 40:100369
Yang X, Feng R, Xu P, Wang X, Qi M (2023) Internet-of-things-augmented dynamic route planning approach to the airport baggage handling system. Comput Ind Eng 75:108802
Ertem M, As'ad R, Awad M, Al-Bar A (2022) Workers-constrained shutdown maintenance scheduling with skills flexibility: Models and solution algorithms. Comput Ind Eng 172:108575
Seif Z, Mardaneh E, Loxton R, Lockwood A (2021) Minimizing equipment shutdowns in oil and gas campaign maintenance. J Oper Res Soc 72:1486–1504
Wang X, Wang S, Xu Q (2022) Simultaneous production and maintenance scheduling for refinery front-end process with considerations of risk management and resource availability. Ind Eng Chem Res 61:2152–2166
Santos IM, Hamacher S, Oliveira F (2023) A data-driven optimization model for the workover rig scheduling problem: Case study in an oil company. Comput Chem Eng 170:108088
Sivanandam SN, Deepa SN (2008) Genetic algorithms. Springer
Dorigo M, Birattari M, Stutzle T (2006) Ant colony optimization. IEEE Comput Intell Mag 1:28–39
Wang Q, Hao Y, Zhang J (2023) Generative inverse reinforcement learning for learning 2-opt heuristics without extrinsic rewards in routing problems. Journal of King Saud University-Computer and Information Sciences 35:101787
Mathlouthi I, Gendreau M, Potvin JY (2021) A metaheuristic based on tabu search for solving a technician routing and scheduling problem. Comput Oper Res 125:105079
Chen C, Demir E, Huang Y (2021) An adaptive large neighborhood search heuristic for the vehicle routing problem with time windows and delivery robots. Eur J Oper Res 294:1164–1180
Stodola P, Michenka K, Nohel J, Rybanskỳ M (2020) Hybrid algorithm based on ant colony optimization and simulated annealing applied to the dynamic traveling salesman problem. Entropy 22:884
Shi S, Xiong H, Li G (2023) A no-tardiness job shop scheduling problem with overtime consideration and the solution approaches. Comput Ind Eng 178:109115
Gupta R, Nanda SJ (2021) Solving dynamic many-objective TSP using NSGA-III equipped with SVR-RBF kernel predictor, pp 95–102
Rostami AS, Mohanna F, Keshavarz H, Hosseinabadi AAR (2015) Solving multiple traveling salesman problem using the gravitational emulation local search algorithm. Appl Math Inform Sci 9:1–11
Lesch V, König M, Kounev S, Stein A, Krupitzer C (2022) Tackling the rich vehicle routing problem with nature-inspired algorithms. Appl Intell 52:9476–9500
Helsgaun K (2017) An extension of the Lin–Kernighan–Helsgaun TSP solver for constrained traveling salesman and vehicle routing problems. Roskilde University, vol 12
Lin S, Kernighan BW (1973) An effective heuristic algorithm for the traveling-salesman problem. Oper Res 21:498–516
Mazyavkina N, Sviridov S, Ivanov S, Burnaev E (2021) Reinforcement learning for combinatorial optimization: A survey. Comput Oper Res 134:105400
Chen X, Tian Y (2019) Learning to perform local rewriting for combinatorial optimization. Adv Neural Inform Process Syst vol 32
Stahlberg F (2020) Neural machine translation: A review. J Artif Intell Res 69:343–418
Kool W, Van Hoof H, Welling M (2018) Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475
Kwon YD et al (2020) Pomo: Policy optimization with multiple optima for reinforcement learning. Adv Neural Inform Process Syst 33:21188–21198
Zhou J et al (2023) Learning large neighborhood search for vehicle routing in airport ground handling. IEEE Trans Knowl Data Eng
Qin W, Zhuang Z, Huang Z, Huang H (2021) A novel reinforcement learning-based hyper-heuristic for heterogeneous vehicle routing problem. Comput Ind Eng 156:107252
Wu Y, Song W, Cao Z, Zhang J, Lim A (2021) Learning improvement heuristics for solving routing problems. IEEE Trans Neural Netw Learn Syst 33:5057–5069
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Netw 20:61–80
Vesselinova N, Steinert R, Perez-Ramirez DF, Boman M (2020) Learning combinatorial optimization on graphs: A survey with applications to networking. IEEE Access 8:120388–120416
Hu Y et al (2021) A bidirectional graph neural network for traveling salesman problems on arbitrary symmetric graphs. Eng Appl Artif Intell 97:104061
Wang Q (2022) Varl: a variational autoencoder-based reinforcement learning framework for vehicle routing problems. Appl Intell pp 1–14
Pan W, Liu SQ (2023) Deep reinforcement learning for the dynamic and uncertain vehicle routing problem. Appl Intell 53:405–422
Chen Z, Zhang L, Wang X, Wang K (2023) Cloud–edge collaboration task scheduling in cloud manufacturing: An attention-based deep reinforcement learning approach. Comput Ind Eng 177:109053
Hu J, Wang Y, Pang Y, Liu Y (2022) Optimal maintenance scheduling under uncertainties using linear programming-enhanced reinforcement learning. Eng Appl Artif Intell 109:104655
Huang J, Su J, Chang Q (2022) Graph neural network and multi-agent reinforcement learning for machine-process-system integrated control to optimize production yield. J Manuf Syst 64:81–93
Wang Y, Qiu D, Wang Y, Sun M, Strbac G (2023) Graph learning-based voltage regulation in distribution networks with multi-microgrids. IEEE Trans Power Syst
Ding S et al (2023) Multiagent reinforcement learning with graphical mutual information maximization. IEEE Trans Neural Netw Learn Syst
Pu Z, Wang H, Liu Z, Yi J, Wu S (2022) Attention enhanced reinforcement learning for multi agent cooperation. IEEE Trans Neural Netw Learn Syst
Gao X et al (2023) Reinforcement learning based optimization algorithm for maintenance tasks scheduling in coalbed methane gas field. Comput Chem Eng 170:108131
Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. PMLR, pp 1263–1272
Watkins CJ, Dayan P (1992) Q-learning. Mach Learn 8:279–292
Nickel S, Steinhardt C, Schlenker H, Burkart W (2022) IBM ILOG CPLEX Optimization Studio—a primer. Springer, pp 9–21
Perron L, Furnon V (2019) OR-Tools. Google. [Online]. Available: https://developers.google.com/optimization
Funding
This work was supported by the National Natural Science Foundation of China (Grant numbers 22178383 and 21706282), Beijing Natural Science Foundation (Grant number 2232021) and the Research Foundation of China University of Petroleum (Beijing) (Grant number 2462020BJRC004).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gao, X., Peng, D., Yang, Y. et al. Two-stage graph attention networks and Q-learning based maintenance tasks scheduling. Appl Intell 55, 331 (2025). https://doi.org/10.1007/s10489-025-06249-z