Reinforcement learning technique using agent state occurrence frequency with analysis of knowledge sharing on the agent’s learning process in multiagent environments

The Journal of Supercomputing

Abstract

Reinforcement learning techniques such as Q-Learning, as well as the Multiple-Lookahead-Levels technique we introduced in prior work, require the agent to complete an initial exploratory path followed by as many hypothetical and physical paths as necessary to find the optimal path to the goal. This paper introduces a reinforcement learning technique that uses a distance measure to the goal as the primary gauge for an autonomous agent's action selection. The agent's first random walk is used to acquire initial information about the goal: once the goal is reached, the agent traces its steps back to its starting point and updates its initially perceived internal model of the environment to include the goal. We show that no exploratory or hypothetical paths are required after the goal is first reached or detected, and that the agent needs at most two physical paths to find the optimal path to the goal. We also introduce the agent's state occurrence frequency and use it to support the proposed Distance-Only technique, yielding the Distance-and-Frequency technique. A computation speed analysis shows that the Distance-and-Frequency technique requires less computation time than Q-Learning. Furthermore, we present and demonstrate how multiple agents using the Distance-and-Frequency technique can share knowledge of the environment, and we study the effect of that knowledge sharing on the agents' learning process.
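The abstract outlines the core idea without the paper's full details, so the sketch below is only a rough, hypothetical illustration of a distance-and-frequency action selection rule: an agent on a grid greedily picks the neighboring state closest (in Euclidean distance) to a goal it has already located, uses each state's occurrence frequency as a tie-breaker to discourage revisits, and a small helper shows one plausible way two agents could pool that frequency knowledge. All names here (`GridAgent`, `step`, `merge_knowledge`) are illustrative assumptions, not the authors' method or API.

```python
import math
from collections import defaultdict

# Hypothetical sketch only -- not the authors' implementation. It illustrates
# the idea stated in the abstract: distance to the goal is the primary gauge
# for action selection, and the state occurrence frequency (visit count)
# serves as a secondary signal.

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # four-connected grid moves

class GridAgent:
    def __init__(self, start, goal):
        # Assumes the goal location is already known, i.e. the first
        # random walk described in the abstract has already found it.
        self.pos = start
        self.goal = goal
        self.freq = defaultdict(int)  # state occurrence frequency
        self.freq[start] += 1

    def step(self):
        # Primary key: Euclidean distance to the goal.
        # Secondary key: how often the candidate state was already visited.
        candidates = [(self.pos[0] + dx, self.pos[1] + dy) for dx, dy in MOVES]
        self.pos = min(candidates,
                       key=lambda s: (math.dist(s, self.goal), self.freq[s]))
        self.freq[self.pos] += 1
        return self.pos

def merge_knowledge(a, b):
    # One plausible form of knowledge sharing: both agents pool their
    # state occurrence frequencies so each benefits from the regions
    # the other has already explored.
    for state in set(a.freq) | set(b.freq):
        total = a.freq[state] + b.freq[state]
        a.freq[state] = b.freq[state] = total

if __name__ == "__main__":
    agent = GridAgent(start=(0, 0), goal=(4, 3))
    path = [agent.pos]
    while agent.pos != agent.goal:
        path.append(agent.step())
    print(path)  # e.g. [(0, 0), (1, 0), ..., (4, 3)] on an obstacle-free grid
```

Keeping the visit count as only a secondary sort key preserves distance as the primary gauge, as the abstract describes, while still letting shared frequency information steer an agent away from regions its peers have already covered.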

Author information

Corresponding author

Correspondence to D. B. Megherbi.

Cite this article

Al-Dayaa, H.S., Megherbi, D.B. Reinforcement learning technique using agent state occurrence frequency with analysis of knowledge sharing on the agent’s learning process in multiagent environments. J Supercomput 59, 526–547 (2012). https://doi.org/10.1007/s11227-010-0451-x
