Abstract
The eligibility trace is one of the basic mechanisms in reinforcement learning for handling delayed reward. A trace indicates the degree to which each state is eligible for undergoing learning changes should a reinforcing event occur. Formally, there are two kinds of eligibility traces: accumulating traces and replacing traces. In this paper, we propose an ant reinforcement learning algorithm, called Ant-TD(λ), that uses replacing eligibility traces; it is a hybrid of Ant-Q and the eligibility-trace mechanism. With replacing traces, the eligibility trace of the maximum (MaxAQ(s,z)) state visited on a step is reset to 1, while the traces of all other states decay by γλ. Although replacing traces differ only slightly from accumulating traces, they can produce a significant improvement in optimization. Experiments show that the proposed reinforcement learning method converges to the optimal solution faster than ACS and Ant-Q.
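The replacing-trace rule described above can be sketched as follows. This is a minimal illustration, not the paper's full Ant-TD(λ) algorithm; the function name, parameter values, and the dictionary representation of traces are illustrative assumptions.

```python
def update_traces(e, visited_state, gamma=0.9, lam=0.5):
    """Decay every trace by gamma * lambda, then apply the replacing rule
    to the state visited on this step (reset its trace to 1)."""
    for s in e:
        e[s] *= gamma * lam       # decay shared by accumulating and replacing traces
    e[visited_state] = 1.0        # replacing trace: reset to 1 (accumulating would do += 1)
    return e

# Hypothetical traces for three states; "B" is the state visited this step.
traces = update_traces({"A": 0.8, "B": 0.4, "C": 0.0}, "B")
# "B" is now 1.0; "A" and "C" have decayed by gamma * lambda
```

The only difference from an accumulating trace is the assignment `e[visited_state] = 1.0` in place of an increment, which is what keeps a frequently revisited state's trace bounded.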
References
Colorni, A., Dorigo, M., Maniezzo, V.: An Investigation of Some Properties of an Ant Algorithm. In: Manner, R., Manderick, B. (eds.) Proceedings of the Parallel Problem Solving from Nature Conference, pp. 509–520. Elsevier, Amsterdam (1992)
Colorni, A., Dorigo, M., Maniezzo, V.: Distributed Optimization by Ant Colonies. In: Varela, F., Bourgine, P. (eds.) Proceedings of the First European Conference on Artificial Life, pp. 134–144. Elsevier, Amsterdam (1991)
Watkins, C.J.C.H.: Learning from Delayed Rewards. Ph.D. Thesis, King’s College, Cambridge, U.K. (1989)
Fiechter, C.N.: Efficient Reinforcement Learning. In: Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory, pp. 88–97 (1994)
Barnard, E.: Temporal-Difference Methods and Markov Models. IEEE Trans. Systems, Man and Cybernetics 23, 357–365 (1993)
Gambardella, L.M., Dorigo, M.: Solving Symmetric and Asymmetric TSPs by Ant Colonies. In: Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 622–627. IEEE Press, Los Alamitos (1996)
Gambardella, L.M., Dorigo, M.: Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem. In: Prieditis, A., Russell, S. (eds.) Proceedings of ML-95, Twelfth International Conference on Machine Learning, pp. 252–260. Morgan Kaufmann, San Francisco (1995)
Dorigo, M., Gambardella, L.M.: A Study of Some Properties of Ant-Q. In: Ebeling, W., Rechenberg, I., Voigt, H.-M., Schwefel, H.-P. (eds.) PPSN 1996. LNCS, vol. 1141, pp. 656–665. Springer, Heidelberg (1996)
Dorigo, M., Maniezzo, V., Colorni, A.: The Ant System: Optimization by a Colony of Cooperating Agents. IEEE Trans. Systems, Man and Cybernetics-Part B 26(1), 29–41 (1996)
Stutzle, T., Hoos, H.: The Ant System and Local Search for the Traveling Salesman Problem. In: Proceedings of the IEEE 4th International Conference on Evolutionary Computation (1997)
Gambardella, L.M., Dorigo, M.: Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem. IEEE Trans. Evolutionary Computation 1(1) (1997)
Stutzle, T., Dorigo, M.: ACO Algorithms for the Traveling Salesman Problem. In: Miettinen, K., Makela, M., Neittaanmaki, P., Periaux, J. (eds.) Evolutionary Algorithms in Engineering and Computer Science, Wiley, Chichester (1999)
Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
Lee, S.G.: Multiagent Reinforcement Learning Algorithm Using Temporal Difference Error. In: Wang, J., Liao, X.-F., Yi, Z. (eds.) ISNN 2005. LNCS, vol. 3496, pp. 627–633. Springer, Heidelberg (2005)
http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95
© 2006 Springer-Verlag Berlin Heidelberg
Lee, S., Hong, S. (2006). Efficient Ant Reinforcement Learning Using Replacing Eligibility Traces. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Żurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2006. ICAISC 2006. Lecture Notes in Computer Science(), vol 4029. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11785231_86
Print ISBN: 978-3-540-35748-3
Online ISBN: 978-3-540-35750-6