
An online POMDP algorithm for complex multiagent environments

Published: 25 July 2005

Abstract

In this paper, we present an online method for POMDPs, called RTBSS (Real-Time Belief Space Search), which uses a look-ahead search in belief space to find the best action to execute at each cycle. We thus avoid the overwhelming complexity of computing a policy for every possible situation, and we show that this makes the method particularly well suited to large real-time environments where offline approaches are inapplicable because of their complexity. We first describe the formalism of our online method, followed by results on standard POMDPs. We then present an adaptation of the method to a complex multiagent environment, together with results showing its efficiency in such settings.
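The core idea described in the abstract — a depth-limited look-ahead search over belief states, run anew at every decision cycle — can be sketched as follows. This is only an illustrative sketch under assumptions of my own: the tiger domain, the model functions `T`, `Z`, `R`, and the helper names are stand-ins, and the paper's actual RTBSS additionally uses a factored representation and pruning that this sketch omits.

```python
# Sketch of an online, depth-limited look-ahead search over belief states,
# in the spirit of RTBSS. The "tiger" POMDP below is a stand-in domain;
# all names here are illustrative, not the paper's own implementation.

S = ["tiger-left", "tiger-right"]          # hidden states
A = ["listen", "open-left", "open-right"]  # actions
OBS = ["hear-left", "hear-right"]          # observations
GAMMA = 0.95

def T(s, a, s2):
    # Opening a door resets the tiger uniformly; listening changes nothing.
    if a == "listen":
        return 1.0 if s == s2 else 0.0
    return 0.5

def Z(a, s2, o):
    # Listening is 85% accurate; other actions give uninformative noise.
    if a == "listen":
        correct = (s2 == "tiger-left") == (o == "hear-left")
        return 0.85 if correct else 0.15
    return 0.5

def R(s, a):
    if a == "listen":
        return -1.0
    opened_on_tiger = (a == "open-left") == (s == "tiger-left")
    return -100.0 if opened_on_tiger else 10.0

def belief_update(b, a, o):
    # Bayes filter: b'(s') ∝ Z(o | s', a) * sum_s T(s' | s, a) b(s)
    new_b = {s2: Z(a, s2, o) * sum(T(s, a, s2) * b[s] for s in S) for s2 in S}
    norm = sum(new_b.values())
    return {s: p / norm for s, p in new_b.items()}

def value(b, depth):
    # Depth-limited expectimax over the belief tree; returns (value, action).
    # Re-running this at every cycle is what makes the approach "online":
    # no policy is precomputed for unreached beliefs.
    best = (float("-inf"), None)
    for a in A:
        q = sum(b[s] * R(s, a) for s in S)  # expected immediate reward
        if depth > 1:
            for o in OBS:
                p_o = sum(Z(a, s2, o) * T(s, a, s2) * b[s]
                          for s in S for s2 in S)  # P(o | b, a)
                if p_o > 0:
                    q += GAMMA * p_o * value(belief_update(b, a, o), depth - 1)[0]
        best = max(best, (q, a))
    return best

b0 = {"tiger-left": 0.5, "tiger-right": 0.5}
print(value(b0, 3)[1])  # under a uniform belief, the search prefers to listen
```

At each cycle the agent would execute the returned action, receive an observation, apply `belief_update`, and search again from the new belief — trading the offline cost of a full policy for a bounded per-step search.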




Published In

AAMAS '05: Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
July 2005
1407 pages
ISBN:1595930930
DOI:10.1145/1082473


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. POMDP
  2. online search

Qualifiers

  • Article

Conference

AAMAS05

Acceptance Rates

Overall acceptance rate: 1,155 of 5,036 submissions (23%)

Article Metrics

  • Downloads (last 12 months): 30
  • Downloads (last 6 weeks): 1

Reflects downloads up to 05 Mar 2025


Cited By

  • (2025) Active inference tree search in large POMDPs. Neurocomputing, 623:129319, Mar 2025. doi:10.1016/j.neucom.2024.129319
  • (2024) Cooperative Multi-UAV Positioning for Aerial Internet Service Management: A Multi-Agent Deep Reinforcement Learning Approach. IEEE Transactions on Network and Service Management, 21(4):3797-3812, Aug 2024. doi:10.1109/TNSM.2024.3392393
  • (2023) Active Inference and Behavior Trees for Reactive Action Planning and Execution in Robotics. IEEE Transactions on Robotics, 39(2):1050-1069, Apr 2023. doi:10.1109/TRO.2022.3226144
  • (2021) Decentralized Task and Path Planning for Multi-Robot Systems. IEEE Robotics and Automation Letters, 6(3):4337-4344, Jul 2021. doi:10.1109/LRA.2021.3068103
  • (2020) A Probabilistic Online Policy Estimator for Autonomous Systems Planning and Decision Making. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pages 2933-2938, Oct 2020. doi:10.1109/SMC42975.2020.9282873
  • (2019) Sampling networks and aggregate simulation for online POMDP planning. Proceedings of the 33rd International Conference on Neural Information Processing Systems, pages 9222-9232, Dec 2019. doi:10.5555/3454287.3455114
  • (2017) Design of Complex Engineered Systems Using Multi-Agent Coordination. Journal of Computing and Information Science in Engineering, 18(1):011003, Nov 2017. doi:10.1115/1.4038158
  • (2016) PSINET. AI Magazine, 37(2):47-62, Jun 2016. doi:10.1609/aimag.v37i2.2632
  • (2016) Spectrum Management for Proactive Video Caching in Information-Centric Cognitive Radio Networks. IEEE Journal on Selected Areas in Communications, 34(8):2247-2259, Aug 2016. doi:10.1109/JSAC.2016.2577320
  • (2016) POMDPs for Assisting Homeless Shelters – Computational and Deployment Challenges. Autonomous Agents and Multiagent Systems, pages 67-87, Sep 2016. doi:10.1007/978-3-319-46840-2_5
