
An online POMDP algorithm for complex multiagent environments

Published: 25 July 2005

Abstract

In this paper, we present an online method for POMDPs, called RTBSS (Real-Time Belief Space Search), which uses a look-ahead search in belief space to find the best action to execute at each cycle. We thus avoid the overwhelming complexity of computing a policy for every possible situation, and we show that this makes the method particularly well suited to large real-time environments where offline approaches are inapplicable because of their complexity. We first describe the formalism of our online method, followed by results on standard POMDPs. We then present an adaptation of the method to a complex multiagent environment, together with results showing its efficiency in such settings.
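The core idea described in the abstract — a depth-limited look-ahead search over belief states, run anew at every decision cycle — can be sketched as follows. This is only an illustrative sketch under assumptions of my own: the tiger domain, the model functions `T`, `Z`, `R`, and the helper names are stand-ins, and the paper's actual RTBSS additionally uses a factored representation and pruning that this sketch omits.

```python
# Sketch of an online, depth-limited look-ahead search over belief states,
# in the spirit of RTBSS. The "tiger" POMDP below is a stand-in domain;
# all names here are illustrative, not the paper's own implementation.

S = ["tiger-left", "tiger-right"]          # hidden states
A = ["listen", "open-left", "open-right"]  # actions
OBS = ["hear-left", "hear-right"]          # observations
GAMMA = 0.95

def T(s, a, s2):
    # Opening a door resets the tiger uniformly; listening changes nothing.
    if a == "listen":
        return 1.0 if s == s2 else 0.0
    return 0.5

def Z(a, s2, o):
    # Listening is 85% accurate; other actions give uninformative noise.
    if a == "listen":
        correct = (s2 == "tiger-left") == (o == "hear-left")
        return 0.85 if correct else 0.15
    return 0.5

def R(s, a):
    if a == "listen":
        return -1.0
    opened_on_tiger = (a == "open-left") == (s == "tiger-left")
    return -100.0 if opened_on_tiger else 10.0

def belief_update(b, a, o):
    # Bayes filter: b'(s') ∝ Z(o | s', a) * sum_s T(s' | s, a) b(s)
    new_b = {s2: Z(a, s2, o) * sum(T(s, a, s2) * b[s] for s in S) for s2 in S}
    norm = sum(new_b.values())
    return {s: p / norm for s, p in new_b.items()}

def value(b, depth):
    # Depth-limited expectimax over the belief tree; returns (value, action).
    # Re-running this at every cycle is what makes the approach "online":
    # no policy is precomputed for unreached beliefs.
    best = (float("-inf"), None)
    for a in A:
        q = sum(b[s] * R(s, a) for s in S)  # expected immediate reward
        if depth > 1:
            for o in OBS:
                p_o = sum(Z(a, s2, o) * T(s, a, s2) * b[s]
                          for s in S for s2 in S)  # P(o | b, a)
                if p_o > 0:
                    q += GAMMA * p_o * value(belief_update(b, a, o), depth - 1)[0]
        best = max(best, (q, a))
    return best

b0 = {"tiger-left": 0.5, "tiger-right": 0.5}
print(value(b0, 3)[1])  # under a uniform belief, the search prefers to listen
```

At each cycle the agent would execute the returned action, receive an observation, apply `belief_update`, and search again from the new belief — trading the offline cost of a full policy for a bounded per-step search.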




Published In

AAMAS '05: Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
July 2005
1407 pages
ISBN:1595930930
DOI:10.1145/1082473


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. POMDP
  2. online search

Qualifiers

  • Article

Conference

AAMAS05

Acceptance Rates

Overall acceptance rate: 1,155 of 5,036 submissions (23%)

Article Metrics

  • Downloads (last 12 months): 30
  • Downloads (last 6 weeks): 1

Reflects downloads up to 05 Mar 2025


Cited By

  • (2025) Active inference tree search in large POMDPs. Neurocomputing, 623:129319, Mar 2025. doi:10.1016/j.neucom.2024.129319
  • (2024) Cooperative Multi-UAV Positioning for Aerial Internet Service Management: A Multi-Agent Deep Reinforcement Learning Approach. IEEE Transactions on Network and Service Management, 21(4):3797-3812, Aug 2024. doi:10.1109/TNSM.2024.3392393
  • (2023) Active Inference and Behavior Trees for Reactive Action Planning and Execution in Robotics. IEEE Transactions on Robotics, 39(2):1050-1069, Apr 2023. doi:10.1109/TRO.2022.3226144
  • (2021) Decentralized Task and Path Planning for Multi-Robot Systems. IEEE Robotics and Automation Letters, 6(3):4337-4344, Jul 2021. doi:10.1109/LRA.2021.3068103
  • (2020) A Probabilistic Online Policy Estimator for Autonomous Systems Planning and Decision Making. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pages 2933-2938, Oct 2020. doi:10.1109/SMC42975.2020.9282873
  • (2019) Sampling networks and aggregate simulation for online POMDP planning. Proceedings of the 33rd International Conference on Neural Information Processing Systems, pages 9222-9232, Dec 2019. doi:10.5555/3454287.3455114
  • (2017) Design of Complex Engineered Systems Using Multi-Agent Coordination. Journal of Computing and Information Science in Engineering, 18(1):011003, Nov 2017. doi:10.1115/1.4038158
  • (2016) PSINET. AI Magazine, 37(2):47-62, Jun 2016. doi:10.1609/aimag.v37i2.2632
  • (2016) Spectrum Management for Proactive Video Caching in Information-Centric Cognitive Radio Networks. IEEE Journal on Selected Areas in Communications, 34(8):2247-2259, Aug 2016. doi:10.1109/JSAC.2016.2577320
  • (2016) POMDPs for Assisting Homeless Shelters – Computational and Deployment Challenges. Autonomous Agents and Multiagent Systems, pages 67-87, Sep 2016. doi:10.1007/978-3-319-46840-2_5
