Elsevier

Robotics and Autonomous Systems

Volume 83, September 2016, Pages 214-230
Planning and execution through variable resolution planning

https://doi.org/10.1016/j.robot.2016.04.009

Highlights

  • A novel technique for planning and execution in dynamic and stochastic environments.

  • When planning, the technique removes information far into the future.

  • Planning information is abstracted by selecting several predicates.

  • Planning and execution performance are improved by computing plans very fast.

Abstract

Generating sequences of actions (plans) for robots using Automated Planning in stochastic and dynamic environments has been shown to be a difficult task with high computational complexity. These plans are composed of actions whose execution might fail for different reasons. In many cases, if the execution of an action fails, it prevents the execution of some (or all) of the remaining actions in the plan. Therefore, in most real-world scenarios, computing a complete and sound (valid) plan at each (re-)planning step is not worth the computational resources and time required to generate it. This is especially true given the high probability of plan execution failure. Besides, in many real-world environments, plans must be generated quickly, both at the start of the execution and after every execution failure. In this paper, we present Variable Resolution Planning, which uses Automated Planning to quickly compute a reasonable (not necessarily sound) plan. Our approach computes an abstract representation, obtained by removing some information from the planning task, which is used once a search depth of k steps has been reached. Thus, our approach generates a plan where the first k actions are applicable if the domain is stationary and deterministic, while the rest of the plan might not necessarily be applicable. The advantages of this approach are that it: is faster than regular full-fledged planning (in both the probabilistic and deterministic settings); does not spend much time on far-future actions that will probably not be executed, since in most cases it will need to replan before executing the end of the plan; and takes into account some information about the far future, as an improvement over pure reactive systems. We present experimental results on different robotics domains that simulate tasks in stochastic environments.

Introduction

Automated Planning (AP) is the branch of Artificial Intelligence that studies the generation of an ordered set of actions (a plan) that allows a system to transition from a given initial state to a state where a set of goals has been achieved. AP has been successfully used to solve real-world problems such as planning Mars exploration missions [1] or controlling underwater vehicles [2]. Despite these examples, the application of AP systems to stochastic and dynamic environments still presents some challenges, mainly because these scenarios increase the complexity of the planning and execution process: (i) new information about the environment can be discovered during action execution, modifying the structure of the planning task; (ii) the execution of an action can fail, which in turn prevents the execution of the rest of the plan; (iii) the execution of the actions in the plan can generate states from which the rest of the plan cannot be successfully executed (dead-ends); and (iv) plans may need to be generated quickly to offer real-time interaction between the AP system and the environment. For these reasons, the process of generating a plan of actions can be prohibitively expensive in this kind of scenario.

There are two main (extreme) approaches to solving problems in stochastic and dynamic scenarios: deliberative and reactive. At one extreme, we find deliberative systems, which are based on interleaving AP and execution with full or partial observability. If we have information about the dynamics of the environment (failures in the actuators of a robot, the structure of the terrain, accuracy of sensors), we can define a domain model with probabilistic information with full observability (such as in PPDDL [3] or RDDL [4]). Then, one alternative consists of building conditional plans [5], where plans take into account all possible outcomes. Another approach consists of generating a set of policies by solving the problem as a Markov Decision Process (MDP) [6], [7], [8].

But, usually, the dynamics of the environment are not known or cannot be easily modeled. In that case, we have two alternatives. First, we can learn the dynamics and then apply the previous approaches. However, the learning effort is huge except for small tasks [9]. Another solution, and the most widely used one, consists of using a deterministic domain model and replanning or repairing the plan when a failure in execution is detected (e.g. the robot is not in the expected place). When replanning [10], the planner generates an initial applicable plan and executes it, one action at a time. If an unexpected state is detected, the system generates a new plan from scratch. This process is repeated until the system reaches the problem goals. Therefore, at each planning (re-planning) step, including the initial one, the system devotes a huge computational effort to computing a valid plan (an applicable plan that achieves the goals), most of which will not be used. When repairing a running plan [11], [12], [13], the planner generates an initial applicable plan and executes it. If an unexpected state is detected, the system generates a new plan by reusing the previously generated plan and adding/removing some actions. In general, deliberative systems require a huge computational effort to generate a complete and sound plan. Depending on the dynamics of the environment, the plan will most probably not be executed in full.
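The replanning cycle described above can be illustrated with a minimal Python sketch. It is not the system used in the paper: the one-dimensional corridor domain, the breadth-first planner and every name in it (`bfs_plan`, `plan_execute_replan`, `MODEL`, `make_noisy_simulator`) are assumptions made purely for illustration. The planner reasons over a deterministic model, the simulator may silently fail an action, and a new plan is computed from scratch whenever the observed state differs from the expected one.

```python
import random
from collections import deque

CELLS = 5  # hypothetical corridor with cells 0..4

# Deterministic domain model (assumed): each action maps a state to its
# expected successor, or None when the action is not applicable.
MODEL = {
    "right": lambda s: s + 1 if s < CELLS - 1 else None,
    "left": lambda s: s - 1 if s > 0 else None,
}


def bfs_plan(state, goal, model):
    """Return a shortest action sequence from state to goal, or None."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier:
        current, path = frontier.popleft()
        if current == goal:
            return path
        for name, step in model.items():
            nxt = step(current)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None


def plan_execute_replan(start, goal, model, simulate, max_steps=1000):
    """Execute one action at a time; replan from scratch on any surprise."""
    state, steps, replans = start, 0, 0
    plan = bfs_plan(state, goal, model)
    while plan and steps < max_steps:
        action = plan.pop(0)
        expected = model[action](state)
        state = simulate(state, action)  # what actually happened
        steps += 1
        if state != expected:  # unexpected state detected
            plan = bfs_plan(state, goal, model)
            replans += 1
    return state, steps, replans


def make_noisy_simulator(failure_prob, seed):
    """Simulator in which an action fails (robot stays put) with failure_prob."""
    rng = random.Random(seed)

    def simulate(state, action):
        if rng.random() < failure_prob:
            return state
        return MODEL[action](state)

    return simulate


final, steps, replans = plan_execute_replan(
    0, CELLS - 1, MODEL, make_noisy_simulator(0.5, seed=1)
)
print(final, steps, replans)
```

In this toy setting every failure costs exactly one wasted step and one full planner call, which is the overhead the paragraph above points to: each call produces a complete valid plan even though only its first few actions are likely to be executed.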

At the other extreme, there are several approaches that solve problems in stochastic and dynamic scenarios using reactive techniques. These systems are based on greedily selecting the next action to be applied according to some knowledge which has been programmed or learned previously. If the knowledge about the environment is only used to select the next action, we can consider a pure reactive system without deliberation, where the system perceives and generates the next action in a continuous cycle. Systems based on the Subsumption architecture [14], [15] are built using a set of control layers, where different layers are interconnected with signals. During each execution step, one layer is chosen depending on the information perceived. Other reactive approaches are based on building reactive behavioral navigation controllers using neural networks [16], [17] or fuzzy logic [18], [19]. In general, reactive systems require much less computational effort and are “mostly” blind with respect to the future; they usually ignore the impact of the selected action on the next actions and states. Thus, they often get trapped in local minima or dead-ends.

In this paper, we propose Variable Resolution Planning (vrp) for interleaving planning and execution in stochastic and dynamic environments. Our research has been inspired by the work of Zickler and Veloso [20], where a motion planning technique is used to generate a collision-free trajectory from an initial state to a goal state. They consider the far future with a different level of detail, by selectively ignoring the physical interactions with dynamic objects. Similarly, vrp is based on two main concepts: (i) most planning effort is devoted to computing a valid plan head of length k; and (ii) the rest of the plan is only generated by checking for potential reachability by relaxing the actions’ model. Actions are simplified by removing some domain details to decrease the computational effort while avoiding dead-ends. The main advantage of our approach is that it requires much less search time than traditional planning approaches that compute a valid complete plan (improving over pure deliberative approaches), while retaining their capability of reasoning into the future (improving over pure reactive approaches). In addition, our technique can be easily parameterized by appropriately setting a value for k so that its behavior gradually transitions from a more deliberative approach (large values of k) to a more reactive approach (small values of k). In the extremes, if k=1, vrp becomes an almost pure reactive system, while if k=∞, vrp behaves as a standard deliberative planner.

This paper is organized as follows: first, in Section 2, we formally define the representation of the planning task in classical planning. Section 3 presents an overview of vrp. Section 4 introduces the concept of predicate abstraction and how it can be deployed in AP. Section 5 describes the algorithms related to vrp. Section 6 presents a description of the planning and execution architecture used to deploy vrp. Section 7 shows an experimental evaluation of vrp in five different domains. Section 8 presents some works related to our approach. Finally, Section 9 concludes and introduces future work.

Section snippets

Planning formalization

There are different types of planning tasks defined in the literature. In this paper, we consider the sequential classical planning task, which is encoded in the propositional fragment of the Planning Domain Definition Language (PDDL) 2.2. It includes advanced features like numeric fluents, ADL conditions, effects and derived predicates (axioms).

Definition 1 Planning Task

A planning task can be defined as a tuple Π=(F,A,I,G), where:

  • F is a finite set of grounded literals (also known as facts or atoms).

  • A is a finite set of

Variable resolution planning architecture

To give the reader an overview, Fig. 1 shows the architecture of vrp with its main phases and how they are connected. The vrp technique is composed of three different phases. In the first phase, called Knowledge Gathering, information about the planning task is extracted. There are different ways to extract this information, and some of these approaches are described in Section 5. The information extracted in the previous phase is used by the Abstraction Generation

Definitions

Our approach is based on removing some future details about the planning task to speed up search on hard problems, which are executed in a stochastic and/or dynamic environment. So, we first have to see how PDDL defines planning tasks. PDDL mainly uses a kind of first-order logic. The set of predicates allows actions, states and goals to be represented. For instance, in order to describe the current location of a robot, we define the predicate (at robot1 location1). Predicate at describes that robot1

The variable resolution planning algorithm

As we have discussed previously, predicates can be removed from the original planning task to decrease the search space and generate a new, smaller abstracted search state. But, first, we need to choose which predicates will be selected (discussed in Knowledge gathering) and when they will be removed during search (discussed in Search).
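To make the idea of removing predicates once a search depth of k has been reached more concrete, the following Python sketch may help. It is only an illustration under several assumptions, not the algorithm implemented by the authors: it uses plain breadth-first search rather than a heuristic planner, the set of removed predicates is supplied by hand instead of being produced by a knowledge-gathering phase, and the toy robot domain and all names (`Action`, `abstract`, `variable_resolution_search`, `INIT`, `GOAL`, `ACTIONS`) are hypothetical.

```python
from collections import deque
from typing import FrozenSet, NamedTuple, Tuple

Atom = Tuple[str, ...]  # e.g. ("at", "A") or ("charged",)


class Action(NamedTuple):
    name: str
    pre: FrozenSet[Atom]
    add: FrozenSet[Atom]
    delete: FrozenSet[Atom]


def abstract(atoms, removed):
    """Drop every atom whose predicate name is in `removed`."""
    return frozenset(a for a in atoms if a[0] not in removed)


def variable_resolution_search(init, goal, actions, k, removed):
    """BFS that uses the full model for the first k actions and the
    predicate-abstracted model from depth k onwards (illustrative only)."""

    def view(atoms, depth):
        return abstract(atoms, removed) if depth >= k else frozenset(atoms)

    start = view(init, 0)
    frontier = deque([(start, 0, [])])
    seen = {(start, 0)}
    while frontier:
        state, depth, path = frontier.popleft()
        if view(goal, depth) <= state:
            return path
        for action in actions:
            if not view(action.pre, depth) <= state:
                continue
            child = view((state - action.delete) | action.add, depth + 1)
            # All nodes at depth >= k share one abstract space, so their
            # depth is capped at k in the duplicate-detection key.
            key = (child, min(depth + 1, k))
            if key not in seen:
                seen.add(key)
                frontier.append((child, depth + 1, path + [action.name]))
    return None


def move(src, dst):
    # Hypothetical action: moving requires and consumes the battery charge.
    return Action(
        f"move-{src}-{dst}",
        frozenset({("at", src), ("charged",)}),
        frozenset({("at", dst)}),
        frozenset({("at", src), ("charged",)}),
    )


ACTIONS = [
    move("A", "B"),
    move("B", "C"),
    move("C", "D"),
    Action("recharge", frozenset(), frozenset({("charged",)}), frozenset()),
]
INIT = frozenset({("at", "A"), ("charged",)})
GOAL = frozenset({("at", "D")})

print(variable_resolution_search(INIT, GOAL, ACTIONS, k=1, removed={"charged"}))
print(variable_resolution_search(INIT, GOAL, ACTIONS, k=10, removed={"charged"}))
```

On this toy task, the call with k=1 returns a three-action plan whose first action is applicable in the full model and whose remaining actions skip both recharge steps, while the call with k=10 returns the complete five-action plan. The shorter plan is cheaper to find but is only guaranteed to be executable for its first k actions, which matches the trade-off described in the text.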

Planning and execution environment

This section presents the planning and execution environment used to deploy vrp. We have implemented a planning, execution and replanning loop based on a light version of the PELEA architecture [32], [33], which uses the simulator MDPSim [34] to emulate the execution of plans. MDPSim executes the actions

Experimental results

This section presents the experimental results of using akfd for planning and execution. We compare its performance against the closest competing models among planning and execution systems based on PDDL and its variants:

  • Classical planning, using planning and replanning when an execution failure is detected. We use LAMA11, an anytime planner developed within the Fast-Downward framework  [27]. Once LAMA11 has found a first solution, it continues to search for better solutions until it

Related work

Abstractions and Automated Planning (AP) techniques have been combined in the literature in different ways. Our work focuses on applying abstractions over Automated Planning to decrease the computational overhead in stochastic or dynamic environments. There are three principal trends which are related to our work: (i) approaches based on generating abstractions in AP to speed up the planning process or increase its capability to solve hard problems; (ii) approaches interleaving planning and

Conclusions and future work

In this paper, we have presented Variable Resolution Planning (vrp), a novel technique that uses an abstraction mechanism that dynamically removes some predicates during the planning process. Our approach is able to significantly cut down computational effort of the search process. The corresponding abstraction is only used in nodes of the search tree that are far away from the initial state of the search. The exact computation of a plan in those nodes is not crucial, given that most probably

Acknowledgments

This research has been partially supported by the Spanish MICINN projects TIN2011-27652-C03-02, TIN2012-38079-C03-02 and Comunidad de Madrid—UC3M (CCG10-UC3M/TIC-5597). The main author is supported by a Ph.D. grant from University Carlos III de Madrid. We offer our gratitude and special thanks to Francisco Javier García Polo, for his generous and invaluable comments during the revision of this paper.

References (51)

  • R.E. Korf, Real-time heuristic search, Artificial Intelligence (1990)
  • M. Shimbo et al., Controlling the learning process of real-time heuristic search, Artificial Intelligence (2003)
  • S. Koenig et al., Lifelong planning A*, Artificial Intelligence (2004)
  • M. Ai-Chang et al., MAPGEN: mixed-initiative planning and scheduling for the Mars exploration rover mission, IEEE Intell. Syst. (2004)
  • K. Rajan, C. McGann, F. Py, H. Thomas, Robust mission planning using deliberative autonomy for autonomous underwater...
  • H.L.S. Younes, M.L. Littman, PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects,...
  • S. Sanner, Relational dynamic influence diagram language (RDDL): Language description,...
  • M.A. Peot, D.E. Smith, Conditional nonlinear planning, in: M. Kaufmann (Ed.), Proceedings of the First International...
  • B. Bonet et al., mGPT: A probabilistic planner based on heuristic search, J. Artificial Intelligence Res. (2005)
  • S.W. Yoon, A. Fern, R. Givan, S. Kambhampati, Probabilistic planning via determinization in hindsight, in: Proceedings...
  • A. Kolobov, Mausam, D.S. Weld, Classical planning in MDP heuristics: with a little help from generalization, in:...
  • L.S. Zettlemoyer, H. Pasula, L.P. Kaelbling, Learning planning rules in noisy stochastic worlds, in: Proceedings of the...
  • S.W. Yoon, A. Fern, R. Givan, FF-Replan: A baseline for probabilistic planning, in: Proceedings of the Seventeenth...
  • R.V.D. Krogt, M.D. Weerdt, Plan repair as an extension of planning, in: Proceedings of the 15th International...
  • M. Fox, A. Gerevini, D. Long, I. Serina, Plan stability: Replanning versus plan repair, in: Proceedings of the...
  • D. Borrajo et al., Probabilistically reusing plans in deterministic planning
  • R.A. Brooks, A robust layered control system for a mobile robot, IEEE J. Robot. Autom. (1986)
  • G. Butler et al., Object-oriented design of the subsumption architecture, Softw. Pract. Exp. (2001)
  • E. Zalama, J. Gómez, M. Paul, P.J. Ramón, Adaptive behavior navigation of a mobile robot, in: IEEE International...
  • S.X. Yang et al., Real-time collision-free motion planning of a mobile robot using a neural dynamics-based approach, Int. J. Robot. Autom. (2003)
  • E. Aguirre et al., A fuzzy perceptual model for ultrasound sensors applied to intelligent navigation of mobile robots, Appl. Intell. (2003)
  • A. Zhu, S.X. Yang, A goal-oriented fuzzy reactive control for mobile robots with automatic rule optimization, in:...
  • S. Zickler, M. Veloso, Variable level-of-detail motion planning in environments with poorly predictable bodies, in:...
  • A. Gerevini, D. Long, Plan constraints and preferences in PDDL3, in: Proceedings of ICAPS’06 Workshop on Soft...
  • M. Helmert, The Fast Downward planning system, J. Artificial Intelligence Res. (2006)
Moisés Martínez obtained his Bachelor in Computer Science and Engineering from Universidad Carlos III de Madrid in 2011 and his master in Computer Science and Technology in 2010 with a specialization in artificial intelligence from the same university. He has held a research fellowship since 2011 in the Planning and Learning Group of Universidad Carlos III de Madrid, where he is currently pursuing his Ph.D. in artificial intelligence. He is also Professor of the bachelor in Computer Science and Engineering, where he teaches two different courses: Machine Learning and Artificial Intelligence in VideoGames. His research fields are Robotics and Automated Planning, with several papers on these topics published in conference proceedings.

Fernando Fernández has been a Professor of Computer Science at Universidad Carlos III de Madrid since 2005. He received his Ph.D. degree in Computer Science from University Carlos III of Madrid (UC3M) in 2003. He received his B.Sc. in 1999 from UC3M, also in Computer Science. Since 2001, he has been assistant and associate professor at UC3M. In the fall of 2000, Fernando was a visiting student at the Center for Engineering Science Advanced Research at Oak Ridge National Laboratory (Tennessee). He was also a postdoctoral fellow at the Computer Science Department of Carnegie Mellon University from October 2004 until December 2005. He is the recipient of a pre-doctoral FPU fellowship award from the Spanish Ministry of Education (MEC), a Doctoral Prize from UC3M, and a MEC-Fulbright postdoctoral Fellowship. He has more than 30 journal and conference papers, mainly in the field of machine learning and planning. He has advised 3 Ph.D. theses.

Daniel Borrajo has been a Professor of Computer Science at Universidad Carlos III de Madrid since 1998. He received his Ph.D. in Computer Science in 1990 and B.S. in Computer Science, both at Universidad Politécnica de Madrid. He has published over 150 journal and conference papers, mainly in the fields of problem solving methods (heuristic search, automated planning and game playing) and machine learning. He has been the PI and/or participated in over 30 research projects and networks funded at all levels (regional, national and European). He has been the Program Co-chair of the International Conference of Automated Planning and Scheduling (ICAPS’13), Conference co-chair of the Symposium of Combinatorial Search (SoCS’12, SoCS’11) and ICAPS’06, Chair of the Spanish conference on AI (CAEPIA’07), and PC member of conferences such as IJCAI (senior PC at IJCAI’07, ’11 and ’13), AAAI, ICAPS, ICML, or ECML. He has advised 14 Ph.D. theses, and is currently a member and treasurer of the ICAPS Council.