Adaptive Decentralized Policies With Attention for Large-Scale Multiagent Environments


Impact Statement:
The adaptive decentralized policy with attention (ADPA) advances multiagent reinforcement learning (MARL) for large-scale, dynamic environments. Its novel attention mechanism lets agents adapt across cooperative, competitive, and mixed scenarios. This broadens the scope for practical applications involving multiple interacting agents and sets the stage for more efficient and scalable solutions for complex multiagent systems.

Abstract:

Multiagent reinforcement learning (MARL) poses unique challenges in real-world applications, because reinforcement learning principles must be adapted to scenarios where agents interact in dynamically changing environments. This article presents a novel approach, “adaptive decentralized policy with attention” (ADPA), designed to address these challenges in large-scale multiagent environments. ADPA uses an attention mechanism to dynamically select the information relevant to estimating each critic while training decentralized policies. This enables effective and scalable learning in cooperative and competitive settings, as well as in scenarios with nonglobal states. We evaluate ADPA across a range of multiagent environments, including cooperative treasure collection and rover-tower communication. We compare ADPA with existing centralized training methods and ablated variants to showcase its advantages in terms of scalability, adaptability to vario...
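
The abstract does not give the form of the attention mechanism. As a rough illustration only, the sketch below assumes generic scaled dot-product attention, which may differ from the authors' design, to show how one agent's critic could weight information from the other agents. The function `attend`, the projection matrices `W_q`, `W_k`, `W_v`, and all dimensions are hypothetical names chosen for this example, and the matrices are random rather than learned.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


def attend(i, encodings, W_q, W_k, W_v):
    """Hypothetical attention step for agent i's critic.

    Agent i forms a query from its own encoding, scores every other
    agent's key, and returns (weights, context), where the weights sum
    to 1 over the other agents and the context is the weighted sum of
    their value vectors.
    """
    query = matvec(W_q, encodings[i])
    others = [j for j in range(len(encodings)) if j != i]
    keys = [matvec(W_k, encodings[j]) for j in others]
    values = [matvec(W_v, encodings[j]) for j in others]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in keys])
    context = [
        sum(w * value[d] for w, value in zip(weights, values))
        for d in range(len(values[0]))
    ]
    return weights, context


def random_matrix(rows, cols, rng):
    """Random Gaussian matrix; stands in for learned parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Assumed toy sizes: 4 agents, 6-dim encodings, 3-dim attention space.
rng = random.Random(0)
n_agents, enc_dim, att_dim = 4, 6, 3
encodings = [[rng.gauss(0.0, 1.0) for _ in range(enc_dim)]
             for _ in range(n_agents)]
W_q = random_matrix(att_dim, enc_dim, rng)
W_k = random_matrix(att_dim, enc_dim, rng)
W_v = random_matrix(att_dim, enc_dim, rng)

weights, context = attend(0, encodings, W_q, W_k, W_v)
print("attention weights over other agents:", weights)
print("context vector for agent 0's critic:", context)
```

In a design of this kind, the context vector would be concatenated with the agent's own encoding and passed to a value head. Because the weights are computed per agent, the critic input stays a fixed size as the number of agents grows, which is one plausible reading of the scalability claim.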
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 10, October 2024)
Page(s): 4905 - 4914
Date of Publication: 18 June 2024
Electronic ISSN: 2691-4581
