Abstract
The Traveling Salesman Problem (TSP) is formulated as a sequence-to-sequence problem, and policy gradient, graph convolutional network, and multi-head attention techniques are combined to build the corresponding sequence model. The model is trained and tested by reinforcement learning on small-scale graphs. In addition, we use the 2-opt local search algorithm to improve the tours the model generates at test time. The results demonstrate that the proposed method, called the Graph Sequence Reinforcement Learning Model, can be trained effectively on small-scale graphs without supervision and applied directly to solve the TSP on large-scale graphs. Moreover, the proposed model surpasses some state-of-the-art heuristic algorithms, and an ablation study shows that each component of the model contributes to its performance.
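The 2-opt refinement mentioned above repeatedly reverses a segment of the tour whenever doing so shortens it. A minimal sketch in plain Python follows; the function names `tour_length` and `two_opt` and the brute-force neighborhood scan are illustrative assumptions, not the authors' implementation.

```python
import math

def tour_length(tour, coords):
    """Total length of the closed tour over 2-D city coordinates."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def two_opt(tour, coords):
    """Reverse tour segments until no single reversal shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment best[i..j] and keep it if shorter.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, coords) < tour_length(best, coords) - 1e-12:
                    best = candidate
                    improved = True
    return best
```

For example, on the four corners of a unit square, `two_opt` untangles the crossing tour `[0, 1, 2, 3]` (where index 2 is `(1, 0)` and index 3 is `(1, 1)`) into a perimeter tour of length 4. In the paper's setting, the learned model would supply the initial tour instead of an arbitrary permutation.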
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Liu, Y., Li, L. (2024). Efficient Graph Sequence Reinforcement Learning for Traveling Salesman Problem. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2023. Communications in Computer and Information Science, vol 2017. Springer, Singapore. https://doi.org/10.1007/978-981-97-0837-6_18
Print ISBN: 978-981-97-0836-9
Online ISBN: 978-981-97-0837-6