DOI: 10.1145/1865909.1865964

A decision-theoretic formalism for belief-optimal reasoning

Published: 21 September 2009

ABSTRACT

Intelligent systems must often reason with partial or corrupted information, due to noisy sensors, limited representation capabilities, and inherent problem complexity. Gathering new information and reasoning with existing information both come at a computational or physical cost. This paper presents a formalism to model systems that solve logical reasoning problems in the presence of uncertainty and priced information. The system is modeled as a decision-making agent that moves in a probabilistic belief space, where each information-gathering or computation step changes the belief state. This forms a Markov decision process (MDP), and the belief-optimal system operates according to the belief-space policy that optimizes the MDP. This formalism makes the strong assertion that belief-optimal systems solve the reasoning problem at minimal expected cost, given the background knowledge, sensing capabilities, and computational resources available to the system. Furthermore, this paper argues that belief-optimal systems are more likely to avoid overfitting to benchmarks than benchmark-optimized systems. These concepts are illustrated on a variety of toy problems as well as a path optimization problem encountered in motion planning.
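The formalism above casts reasoning under priced information as an MDP over belief states, solved by a policy minimizing expected cost. The following is a minimal illustrative sketch of that idea, not the paper's actual model: the belief states, actions ("sense" and "compute"), transition probabilities, and costs are all invented for this toy example, and value iteration is used to find the minimum expected cost of reaching a "solved" state.

```python
# Toy belief MDP (all states, actions, probabilities, and costs are
# hypothetical illustrations, not taken from the paper).
# States are coarse belief levels; 'solved' is absorbing.
states = ["uncertain", "confident", "solved"]

# Each action has a cost, as in the paper's notion of priced
# information-gathering and computation steps.
actions = {"sense": 1.0, "compute": 2.0}

# T[(state, action)] -> list of (next_state, probability)
T = {
    ("uncertain", "sense"):   [("confident", 0.7), ("uncertain", 0.3)],
    ("uncertain", "compute"): [("solved", 0.5),    ("uncertain", 0.5)],
    ("confident", "sense"):   [("confident", 1.0)],
    ("confident", "compute"): [("solved", 0.9),    ("confident", 0.1)],
}

def value_iteration(tol=1e-9, max_iters=10_000):
    """Compute the minimum expected total cost to reach 'solved'
    (undiscounted stochastic shortest path)."""
    V = {s: 0.0 for s in states}  # V['solved'] stays 0
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if s == "solved":
                continue
            # Bellman backup: best action trades off immediate cost
            # against the expected cost-to-go of the next belief.
            best = min(
                cost + sum(p * V[s2] for s2, p in T[(s, a)])
                for a, cost in actions.items()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return V

V = value_iteration()
```

In this toy instance the optimal policy senses first when uncertain (cheap information reduces later computation), then computes once confident; the "belief-optimal" system is exactly the one following this minimum-expected-cost policy.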


Published in

PerMIS '09: Proceedings of the 9th Workshop on Performance Metrics for Intelligent Systems
September 2009, 322 pages
ISBN: 9781605587479
DOI: 10.1145/1865909

Copyright © 2009 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

