Abstract
The problem of selecting actions in environments that are dynamic and not completely predictable or observable is a central problem in intelligent behavior. In AI, it translates into the problem of designing controllers that map sequences of observations into actions so that certain goals are achieved. Three main approaches have been used in AI for designing such controllers: the programming approach, where the controller is programmed by hand in a suitable high-level procedural language; the planning approach, where the controller is derived automatically from a description of actions and goals; and the learning approach, where the controller is derived from a collection of experiences. Each approach exhibits its own successes and limitations. The focus of this paper is the planning approach. More specifically, we present an approach to planning based on a family of state models that accommodate different types of action dynamics (deterministic and probabilistic) and sensor feedback (null, partial, and complete). The approach combines high-level representation languages for describing actions, sensors, and goals; mathematical models of sequential decision making that make the various planning tasks and their solutions precise; and heuristic search algorithms for computing those solutions. The approach is supported by a computational tool we have developed that accepts high-level descriptions of actions, sensors, and goals and produces suitable controllers. We also present empirical results and discuss open challenges.
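The abstract leaves the particular search algorithm unspecified. As one concrete illustration of how a deterministic state model can be solved by heuristic search, the sketch below implements a greedy, RTDP-style action-selection loop in Python. It is a minimal sketch under our own assumptions (unit action costs, full observability, a toy line-walking domain); the names rtdp_controller, result, and heuristic are illustrative and do not reflect the paper's tool or notation.

# Minimal sketch: heuristic search over a deterministic state model.
# Greedy one-step lookahead with Bellman updates on visited states
# (RTDP-style); all identifiers are illustrative, not the paper's.

from typing import Callable, Dict, Hashable, Iterable, List, Tuple

State = Hashable
Action = str


def rtdp_controller(
    init: State,
    goal: Callable[[State], bool],
    actions: Callable[[State], Iterable[Action]],
    result: Callable[[State, Action], State],
    heuristic: Callable[[State], float],
    max_steps: int = 1000,
) -> Tuple[List[Action], Dict[State, float]]:
    """Simulate greedy action selection with value updates.

    At each step, pick the action minimizing the cost-to-go estimate
    1 + V(result(s, a)), update V(s) to that value, and move on.
    Returns the executed action sequence and the learned value table.
    """
    V: Dict[State, float] = {}                 # learned estimates, seeded by the heuristic
    value = lambda s: V.get(s, heuristic(s))

    plan: List[Action] = []
    s = init
    for _ in range(max_steps):
        if goal(s):
            return plan, V
        # Evaluate every applicable action one step ahead (unit cost assumed).
        best_a, best_q = None, float("inf")
        for a in actions(s):
            q = 1.0 + value(result(s, a))
            if q < best_q:
                best_a, best_q = a, q
        V[s] = best_q                          # Bellman update on the visited state
        s = result(s, best_a)
        plan.append(best_a)
    raise RuntimeError("no goal reached within step budget")


if __name__ == "__main__":
    # Toy example: walk on an integer line from 0 to 7.
    plan, _ = rtdp_controller(
        init=0,
        goal=lambda s: s == 7,
        actions=lambda s: ["left", "right"],
        result=lambda s, a: s - 1 if a == "left" else s + 1,
        heuristic=lambda s: abs(7 - s),        # admissible distance estimate
    )
    print(plan)                                # ['right', 'right', ..., 'right']

With an admissible heuristic, the updates only raise the value estimates toward the true cost-to-go, which is what allows repeated trials of such a loop to converge to an optimal controller over the relevant states.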
Cite this article
Bonet, B., Geffner, H. Planning and Control in Artificial Intelligence: A Unifying Perspective. Applied Intelligence 14, 237–252 (2001). https://doi.org/10.1023/A:1011286518035