Multiagent Reinforcement Learning Algorithm Using Temporal Difference Error

  • Conference paper
Advances in Neural Networks – ISNN 2005 (ISNN 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3496)

Abstract

In reinforcement learning, when an agent chooses an action and makes a state transition from its current state, an important question is how to assign reward to the action the agent chose. In this paper, we introduce the Ant-Q learning method, a population-based metaheuristic for hard combinatorial optimization problems that combines positive feedback with greedy search and was originally proposed for the Traveling Salesman Problem (TSP), and we propose an ant reinforcement learning model using TD-error (ARLM-TDE). Experiments show that the proposed reinforcement learning method converges to the optimal solution faster than the original ACS and Ant-Q.
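To make the term "TD-error" concrete, the following is a minimal sketch of a generic one-step temporal-difference update on a table of action values, of the kind used in Q-learning. It is not the ARLM-TDE update rule, which the abstract does not specify. The names `td_error` and `td_update`, the dict-of-dicts table `q`, and the parameter values `alpha` and `gamma` are all assumptions chosen for illustration.

```python
def td_error(q, state, action, reward, next_state, next_actions, gamma=0.5):
    """One-step TD error: reward + gamma * max_a' q[s'][a'] - q[s][a].

    `q` is a hypothetical dict-of-dicts table of action values, e.g. one
    value per directed edge (city -> city) in a TSP-style setting.
    """
    # Best value reachable from the next state; 0.0 if no action remains.
    best_next = max((q[next_state][a] for a in next_actions), default=0.0)
    return reward + gamma * best_next - q[state][action]


def td_update(q, state, action, reward, next_state, next_actions,
              alpha=0.1, gamma=0.5):
    """Move q[state][action] a fraction alpha along the TD error."""
    delta = td_error(q, state, action, reward, next_state, next_actions, gamma)
    q[state][action] += alpha * delta
    return delta


# Illustrative 3-city table (values are made up for the example).
q = {
    0: {1: 1.0, 2: 1.0},
    1: {0: 1.0, 2: 2.0},
    2: {0: 1.0, 1: 1.0},
}

# An agent moves from city 0 to city 1; only city 2 is still unvisited.
delta = td_update(q, state=0, action=1, reward=0.5,
                  next_state=1, next_actions=[2])
```

Writing the update as "old value plus step size times error" is algebraically the same as the blended form `(1 - alpha) * old + alpha * target`, so the sketch changes only the presentation of the update, not its result.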

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, S. (2005). Multiagent Reinforcement Learning Algorithm Using Temporal Difference Error. In: Wang, J., Liao, X., Yi, Z. (eds) Advances in Neural Networks – ISNN 2005. ISNN 2005. Lecture Notes in Computer Science, vol 3496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427391_100

  • DOI: https://doi.org/10.1007/11427391_100

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25912-1

  • Online ISBN: 978-3-540-32065-4

  • eBook Packages: Computer Science, Computer Science (R0)