An Analysis of the Pheromone Q-Learning Algorithm

  • Conference paper
  • First Online:
Advances in Artificial Intelligence — IBERAMIA 2002 (IBERAMIA 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2527)

Abstract

The Phe-Q machine learning technique, a modified Q-learning technique, was developed to enable co-operating agents to communicate while learning to solve a problem. Phe-Q combines Q-learning with synthetic pheromone to improve the speed of convergence. The Phe-Q update equation includes a belief factor that reflects the confidence an agent has in the pheromone (the communication) deposited in the environment by other agents. With the Phe-Q update equation, the speed of convergence towards an optimal solution depends on a number of parameters, including the number of agents solving a problem, the amount of pheromone deposited, and the evaporation rate. This paper describes work carried out to optimise the speed of learning with the Phe-Q technique; the objective was to optimise Phe-Q learning with respect to pheromone deposition and evaporation rates.
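The abstract does not reproduce the update equation itself. As a rough illustration only, the sketch below gives one plausible Python reading of a pheromone-augmented Q-learning backup: a standard Q-learning update whose greedy lookahead is biased by a normalised pheromone belief term, together with per-step deposition and evaporation. Every concrete choice here (the weight XI standing in for the belief factor's influence, the belief normalisation, the grid-world shapes) is an illustrative assumption, not the paper's actual formulation.

```python
import numpy as np

# Illustrative sketch (not the paper's code): Q-learning whose lookahead
# is biased by a synthetic-pheromone "belief" term, with per-step
# deposition and evaporation controlling how strong that bias stays.
N_STATES, N_ACTIONS = 25, 4        # assumed small grid world, 4 moves
ALPHA, GAMMA = 0.1, 0.9            # learning rate, discount factor
XI = 0.5                           # assumed weight on the belief factor
DEPOSIT, EVAPORATION = 1.0, 0.05   # deposition amount, evaporation rate

Q = np.zeros((N_STATES, N_ACTIONS))
phi = np.zeros(N_STATES)           # pheromone concentration per state

def belief(successors):
    """Belief factor: pheromone in each state reachable from s',
    normalised over that neighbourhood (one value per action)."""
    conc = phi[np.asarray(successors)]
    total = conc.sum()
    return conc / total if total > 0 else np.zeros_like(conc)

def phe_q_update(s, a, r, s_next, successors_of_s_next):
    """One backup: as in Q-learning, but the greedy lookahead maximises
    Q(s', a') + XI * B(s', a') rather than Q(s', a') alone."""
    b = belief(successors_of_s_next)
    target = r + GAMMA * np.max(Q[s_next] + XI * b)
    Q[s, a] += ALPHA * (target - Q[s, a])

def end_of_step(visited_state):
    """Agents deposit on the state just visited; all trails then
    evaporate exponentially, so unreinforced trails fade away."""
    phi[visited_state] += DEPOSIT
    phi[:] *= 1.0 - EVAPORATION
```

Under this reading, raising DEPOSIT or lowering EVAPORATION strengthens the bias towards states other agents have marked, which is precisely the deposition/evaporation trade-off the paper sets out to optimise.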

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Monekosso, N., Remagnino, P. (2002). An Analysis of the Pheromone Q-Learning Algorithm. In: Garijo, F.J., Riquelme, J.C., Toro, M. (eds) Advances in Artificial Intelligence — IBERAMIA 2002. IBERAMIA 2002. Lecture Notes in Computer Science (LNAI), vol 2527. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36131-6_23

  • DOI: https://doi.org/10.1007/3-540-36131-6_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00131-7

  • Online ISBN: 978-3-540-36131-2

  • eBook Packages: Springer Book Archive
