Abstract
In reinforcement learning, when an agent selects an action and makes a state transition from its current state, an important question is how to determine the reward for the chosen action. In this paper, we introduce Ant-Q, a population-based metaheuristic for hard combinatorial optimization problems that combines positive feedback with greedy search and was originally proposed for the Traveling Salesman Problem (TSP), and we propose an ant reinforcement learning model using TD-error (ARLM-TDE). Experiments show that the proposed reinforcement learning method converges to the optimal solution faster than the original Ant Colony System (ACS) and Ant-Q.
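The abstract does not state the ARLM-TDE update rule, so the following is only a minimal sketch of how a temporal-difference error could drive an Ant-Q style value update for one move between cities. The function names, the table layout `aq[r][s]`, and the values of `alpha` and `gamma` are illustrative assumptions, not the paper's definitions.

```python
# Sketch only: a TD-error form of an Ant-Q style update.
# All names and parameter values here are assumptions for illustration.

def td_error(aq, r, s, reward, unvisited, gamma=0.3):
    """TD error for the move r -> s.

    aq        : square table, aq[r][s] is the learned value of edge (r, s)
    reward    : reinforcement for this move (0.0 while the tour is incomplete)
    unvisited : cities still available from s
    """
    # Best value reachable from s; 0.0 when no city is left to visit.
    best_next = max((aq[s][z] for z in unvisited), default=0.0)
    return reward + gamma * best_next - aq[r][s]


def update_edge(aq, r, s, reward, unvisited, alpha=0.1, gamma=0.3):
    """Move aq[r][s] a fraction alpha toward the TD target; return the TD error."""
    delta = td_error(aq, r, s, reward, unvisited, gamma)
    aq[r][s] += alpha * delta
    return delta


# Tiny usage example on a 3-city instance with uniform initial values.
aq = [[1.0] * 3 for _ in range(3)]
delta = update_edge(aq, r=0, s=1, reward=0.0, unvisited={2})
```

Written this way, the step `aq[r][s] += alpha * delta` is algebraically the same as the blended form `(1 - alpha) * aq[r][s] + alpha * (reward + gamma * best_next)`, which is why a TD-error view fits naturally on top of Ant-Q.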
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, S. (2005). Multiagent Reinforcement Learning Algorithm Using Temporal Difference Error. In: Wang, J., Liao, X., Yi, Z. (eds) Advances in Neural Networks – ISNN 2005. ISNN 2005. Lecture Notes in Computer Science, vol 3496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427391_100
DOI: https://doi.org/10.1007/11427391_100
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25912-1
Online ISBN: 978-3-540-32065-4
eBook Packages: Computer Science (R0)