
A hybrid cognitive/reactive intelligent agent autonomous path planning technique in a networked-distributed unstructured environment for reinforcement learning


Abstract

This paper proposes a path planning technique for autonomous agents located in an unstructured, networked, distributed environment, where each agent has limited, incomplete knowledge of the environment. Each agent knows only the information available in the distributed memory of the computing node it runs on, and agents share some learned information over a distributed network. In particular, the environment is divided into several sectors, with each sector located on a separate distributed computing node. We consider hybrid reactive-cognitive agents whose autonomous motion planning is based on a potential field model combined with reinforcement learning and boundary detection algorithms. Potential fields provide fast convergence toward a path in a distributed environment, while reinforcement learning guarantees a variety of behaviors and consistent convergence. We show how the combination of the two techniques enhances the agent decision-making process in a distributed environment. Furthermore, path retracing is a challenging problem in a distributed environment, since no agent has complete knowledge of the environment. We propose a backtracking technique that keeps each distributed agent informed at all times of its path information and step count, including when it migrates from one node to another. Note that no node has knowledge of the entire global path from a source to a goal when the goal resides on a separate node; each agent knows only the partial path internal to a node and the number of steps corresponding to the portion of the path it traversed while running on that node. In particular, we show how each agent, starting in one of the many sectors with no initial knowledge of the environment, develops its intelligence from experience using the proposed distributed technique and seamlessly discovers the shortest global path to the target, located on a different node, while avoiding any obstacles it encounters along the way, including when transitioning and migrating from one distributed computing node to another. The agents use a multiple-token-ring scheme over the message passing interface (MPI) for internode communication. Finally, the experimental results of the proposed method show that single and multiple agents sharing the same goal, whether running on the same node or on different nodes, successfully coordinate the sharing of their respective environment states and information to collaboratively perform their respective tasks. The results also show that information-sharing distributed multiagent systems converge to the optimal shortest path to the goal an order of magnitude faster than a single agent or non-information-sharing multiple agents.
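To make the hybrid reactive-cognitive idea concrete, below is a minimal, self-contained sketch, not the authors' implementation: an agent on a small grid combines an artificial potential field (fast, reactive guidance toward the goal) with Q-learning (cognitive refinement from experience). The grid size, reward values, and the constants k_att, k_rep, alpha, gamma, eps, and bias are illustrative assumptions, not values from the paper.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def potential(cell, goal, obstacles, k_att=1.0, k_rep=4.0):
    # Attractive term grows with Manhattan distance to the goal; each
    # nearby obstacle contributes a local repulsive term. Lower is better.
    u = k_att * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))
    for ob in obstacles:
        d = abs(cell[0] - ob[0]) + abs(cell[1] - ob[1])
        if d <= 2:                        # repulsion only acts locally
            u += k_rep / (d + 1.0)
    return u

def step(cell, action, size):
    # Boundary detection: a move off the grid leaves the agent in place.
    r, c = cell[0] + action[0], cell[1] + action[1]
    return (r, c) if 0 <= r < size and 0 <= c < size else cell

def run_episode(q, start, goal, obstacles, size,
                alpha=0.5, gamma=0.9, eps=0.1, bias=0.1, max_steps=200):
    # One Q-learning episode whose greedy choice is biased by the potential
    # field, so the reactive field steers the agent while the cognitive
    # Q-table is still untrained.
    cell, path = start, [start]
    for _ in range(max_steps):
        if random.random() < eps:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)),
                    key=lambda i: q.get((cell, i), 0.0)
                    - bias * potential(step(cell, ACTIONS[i], size),
                                       goal, obstacles))
        nxt = step(cell, ACTIONS[a], size)
        reward = 10.0 if nxt == goal else (-5.0 if nxt in obstacles else -1.0)
        if nxt in obstacles:
            nxt = cell                    # reactive obstacle avoidance
        best_next = max(q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
        old = q.get((cell, a), 0.0)
        q[(cell, a)] = old + alpha * (reward + gamma * best_next - old)
        cell = nxt
        path.append(cell)                 # per-node partial path and step count
        if cell == goal:
            break
    return path

# Usage: train on a 6x6 sector; the path list doubles as the partial-path
# record the backtracking scheme would carry when migrating between nodes.
obstacles = {(2, 2), (2, 3), (3, 2)}
q_table = {}
for _ in range(300):
    path = run_episode(q_table, (0, 0), (5, 5), obstacles, size=6)
print("steps in final episode:", len(path) - 1)
```

The internode side can be sketched in the same spirit. The following token-ring example uses mpi4py, an assumption, since the paper does not specify an MPI binding here. Each rank stands in for one environment sector, and a token circulating around the ring carries the best path length found so far, so agents on different nodes share learned information. Run with, e.g., `mpiexec -n 4 python token_ring.py`.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
local_best = 40 + rank * 3      # stand-in for this node's best step count

for lap in range(2):            # two laps spread the global best to all ranks
    if rank == 0 and lap == 0:
        token = {"best_steps": local_best}       # rank 0 injects the token
    else:
        token = comm.recv(source=(rank - 1) % size, tag=7)
        token["best_steps"] = min(token["best_steps"], local_best)
    comm.send(token, dest=(rank + 1) % size, tag=7)

if rank == 0:                   # retire the token after the final lap
    token = comm.recv(source=size - 1, tag=7)
    print("globally best path length:", token["best_steps"])
```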



Author information


Correspondence to Dalila B. Megherbi.


About this article

Cite this article

Megherbi, D.B., Malayia, V. A hybrid cognitive/reactive intelligent agent autonomous path planning technique in a networked-distributed unstructured environment for reinforcement learning. J Supercomput 59, 1188–1217 (2012). https://doi.org/10.1007/s11227-010-0510-3
