Abstract
This paper studies how to solve the large-scale traveling salesman problem (TSP) with neurodynamic programming. Two methods, temporal difference learning and approximate Sarsa, are presented in detail. In essence, both try to learn an appropriate evaluation function from a finite amount of experience. To evaluate their performance, computational experiments are conducted on both Euclidean and asymmetric TSP instances. Relative to the large size of the state space, only a few training sets were used to obtain these initial results. Even so, the results are acceptable and encouraging in comparison with some classical algorithms, and further study of methods of this kind, as well as their application to combinatorial optimization problems, is worth pursuing.
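To make the idea of learning an evaluation function from a finite amount of experience concrete, the following is a minimal sketch of Sarsa with a linear evaluation function applied to tour construction. It is not the paper's implementation: the four features, the reward scaling, the epsilon-greedy policy, and all names (`features`, `sarsa_tsp`, `euclidean_matrix`) and parameter values are assumptions chosen for illustration. The distance matrix is read as `dist[i][j]`, so an asymmetric instance can be passed as well.

```python
import math
import random


def euclidean_matrix(points):
    """Distance matrix for 2-D points (symmetric Euclidean instance)."""
    return [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in points] for a in points]


def tour_length(dist, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def features(dist, scale, current, nxt, start, n_left):
    # Hypothetical hand-picked features of a (state, action) pair:
    # bias, step cost, cost of closing the tour from nxt, share of cities left.
    return [1.0, dist[current][nxt] / scale, dist[nxt][start] / scale, n_left / len(dist)]


def q_value(w, phi):
    """Linear evaluation function Q(s, a) = w . phi(s, a)."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def choose(dist, scale, w, current, unvisited, start, eps, rng):
    """Epsilon-greedy choice of the next city; returns (city, its features)."""
    cands = sorted(unvisited)
    n_left = len(cands) - 1

    def phi_of(c):
        return features(dist, scale, current, c, start, n_left)

    if rng.random() < eps:
        nxt = rng.choice(cands)
    else:
        nxt = max(cands, key=lambda c: q_value(w, phi_of(c)))
    return nxt, phi_of(nxt)


def sarsa_tsp(dist, episodes=200, alpha=0.05, gamma=1.0, eps=0.1, seed=0):
    """Learn weights w over a fixed number of episodes; return (w, (best_length, best_tour))."""
    rng = random.Random(seed)
    n = len(dist)
    scale = max(max(row) for row in dist) or 1.0
    w = [0.0] * 4
    best = None
    for _ in range(episodes):
        start = current = 0
        unvisited = set(range(1, n))
        tour = [start]
        action, phi = choose(dist, scale, w, current, unvisited, start, eps, rng)
        while True:
            reward = -dist[current][action] / scale  # negative scaled step cost
            unvisited.remove(action)
            tour.append(action)
            current = action
            if unvisited:
                nxt, phi_next = choose(dist, scale, w, current, unvisited, start, eps, rng)
                target = reward + gamma * q_value(w, phi_next)
            else:
                # Terminal step: also pay for the edge that closes the tour.
                target = reward - dist[current][start] / scale
            delta = target - q_value(w, phi)  # temporal difference error
            w = [wi + alpha * delta * pi for wi, pi in zip(w, phi)]
            if not unvisited:
                break
            action, phi = nxt, phi_next
        length = tour_length(dist, tour)
        if best is None or length < best[0]:
            best = (length, tour)
    return w, best
```

Calling `sarsa_tsp(euclidean_matrix(points))` on a small list of coordinate pairs returns the learned weight vector together with the shortest tour seen during training. The sketch only shows the shape of the update; it makes no claim about solution quality on large instances.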









Acknowledgments
This research has been supported in part by the National Natural Science Foundation of China (Grants 60205004, 50475179, 60528002, 60621001, and 60635010), the National Basic Research Program (973) of China under Grant 2002CB312200, and the Hi-Tech R&D Program (863) of China (Grant 2006AA04Z258).
Cite this article
Ma, J., Yang, T., Hou, ZG. et al. Neurodynamic programming: a case study of the traveling salesman problem. Neural Comput & Applic 17, 347–355 (2008). https://doi.org/10.1007/s00521-007-0127-5
DOI: https://doi.org/10.1007/s00521-007-0127-5