Abstract
This paper proposes a set of methods for solving stochastic decision problems modeled as partially observable Markov decision processes (POMDPs). The approach, called Real-Time Heuristic Decision System (RT-HDS), combines prediction with several existing heuristic decision algorithms: prediction is performed by building a tree, and the value function at the final step of the tree is computed with one of the classic heuristic decision methods. To illustrate how the approach works, comparative results for the different algorithms on a variety of simple and complex benchmark problems are reported. The algorithm has also been tested in a mobile robot supervision architecture.
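The general scheme the abstract describes, depth-limited lookahead over a prediction tree with a classic heuristic value at the horizon, can be illustrated with a short sketch. The code below is not the authors' RT-HDS implementation; it is a minimal example, under assumed conventions, of a belief-tree lookahead for a discrete POMDP in which the leaves are scored with a QMDP-style heuristic (value iteration on the underlying MDP). The model layout (transition, observation, and reward arrays) and the choice of leaf heuristic are illustrative assumptions.

```python
# Minimal sketch of depth-limited belief-tree lookahead for a discrete POMDP.
# Not the paper's RT-HDS; the model layout and leaf heuristic are assumptions.
import numpy as np

class POMDP:
    def __init__(self, T, O, R, gamma=0.95):
        # T[a][s, s'] : transition probabilities
        # O[a][s', o] : observation probabilities after taking a and reaching s'
        # R[a][s]     : immediate reward for action a in state s
        self.T, self.O, self.R, self.gamma = T, O, R, gamma
        self.nA, self.nS, self.nO = len(T), T[0].shape[0], O[0].shape[1]

    def belief_update(self, b, a, o):
        """Bayes filter: predict with T[a], correct with O[a][:, o]."""
        bp = (self.T[a].T @ b) * self.O[a][:, o]
        norm = bp.sum()                      # norm = P(o | b, a)
        return (bp / norm, norm) if norm > 0 else (b, 0.0)

def qmdp_heuristic(m, iters=100):
    """Value function of the underlying MDP, used as a leaf heuristic (assumption)."""
    V = np.zeros(m.nS)
    for _ in range(iters):
        V = np.max([m.R[a] + m.gamma * m.T[a] @ V for a in range(m.nA)], axis=0)
    return V

def lookahead(m, b, depth, V_leaf):
    """Expand the belief tree to `depth`; return (best value, best action)."""
    if depth == 0:
        return float(b @ V_leaf), None       # heuristic value at the horizon
    best_v, best_a = -np.inf, None
    for a in range(m.nA):
        v = float(b @ m.R[a])                # expected immediate reward
        for o in range(m.nO):
            b2, p_o = m.belief_update(b, a, o)   # branch on each observation
            if p_o > 0:
                v_child, _ = lookahead(m, b2, depth - 1, V_leaf)
                v += m.gamma * p_o * v_child
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a
```

Because a search of this kind can be cut off at any depth and still return a heuristically scored action, it fits the anytime setting of the paper's title: the tree is deepened only as far as the available decision time allows.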
Fernández, J.L., Sanz, R., Simmons, R.G. et al. Heuristic anytime approaches to stochastic decision processes. J Heuristics 12, 181–209 (2006). https://doi.org/10.1007/s10732-006-4834-3